👋 Hey {{first_name|there}},

Why this matters / where it hurts

So you migrated to microservices. It felt like real progress at the time: smaller repos, smaller deploys, and teams could finally move independently.

And then reality showed up uninvited.

Deployments started getting slower, not faster. That "small change" you wanted to ship? Turns out it needs three different services, two teams in a Zoom room, and a rollout plan that reads like a military operation. Incidents stopped looking like bugs and started looking like coordination failures. Oh, and latency? It's been creeping up because every request now takes a grand tour of your entire architecture.

For a while, I blamed the tooling. The CI pipeline is slow. Kubernetes is too complex. Our observability setup is a mess. And yeah, sometimes that's actually the problem, but not nearly often enough to explain what I kept seeing.

The real culprit is usually much simpler: you split your system into services, but the system still behaves like one giant unit. Only now all that coupling happens over the network instead of in-process. That's the distributed monolith. And honestly? It's often worse than the original monolith because you kept all the coupling and added retries, timeouts, partial failures, and version drift on top.

We're going to diagnose the coupling with a straightforward cohesion analysis, then merge the chattiest parts back together before you go splitting things up again, this time with a better rule.

🧭 Mindset shift

From: "Microservices are about nouns. Customer Service, Product Service, Order Service."
To: "Microservices are about cohesion. What changes together should run together."

Here's what happens when you split by nouns: you end up slicing a single behavior across multiple services. Checkout becomes Order + Payment + Inventory + Pricing + Promotions. On paper, each one looks clean and focused. In reality, that behavior is now a distributed transaction with network calls at every turn.

Cohesion is a much better lens. Put together the things that change together, deploy together, and fail together. Split where the change rates and operational concerns actually differ.

Two rules that'll keep you honest:

  1. If a single user action requires 4-10 synchronous service calls, you didn't gain modularity. You just relocated the coupling.

  2. If two services have to ship in lockstep to deliver one feature safely, they're not really separate services yet.

🧰 Tool of the week: Service Cohesion Analysis Sheet

Think of this as a one-page decision sheet for any "problem area" flow. You can run through it during a design review or as a retro after a particularly painful release.

Pick one user behavior
Name a single flow, not an entire domain. Something like "Place order," or "Upgrade plan," or "Generate invoice PDF."

Draw the call chain
List out the synchronous hops for the critical path. Count the total hops and note any fan-out. This is your baseline.
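
If it helps to make this concrete, here's a minimal sketch in Python: model the critical path as (caller, callee) edges, then count hops and fan-out. The service names and the chain itself are placeholders, not a prescription.

```python
# Minimal sketch: model the critical path as (caller, callee) edges,
# then count total hops and fan-out. Names and the chain are examples.
from collections import Counter

call_chain = [
    ("gateway", "cart"),
    ("cart", "pricing"),
    ("pricing", "promotions"),
    ("cart", "inventory"),
    ("cart", "order"),
    ("order", "payment"),
]

total_hops = len(call_chain)
fan_out = Counter(caller for caller, _ in call_chain)

print(f"total synchronous hops: {total_hops}")
for caller, count in fan_out.items():
    if count > 1:
        print(f"fan-out at {caller}: {count} downstream calls")
```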

Measure chatty coupling
For each service pair in the flow, write down how many calls happen per request and what data gets passed around. If two services are exchanging 3 or more calls per request, flag it.
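
Same idea as a tiny script you could point at your trace data. The spans and names below are illustrative, and the threshold of 3 just mirrors the rule above; swap in your own setup.

```python
# Minimal sketch: flag service pairs exchanging 3+ calls in one request.
# The spans below are illustrative; point this at your own trace data.
from collections import Counter

CHATTY_THRESHOLD = 3  # calls per request, per the rule above

def flag_chatty_pairs(spans):
    """spans: one request's synchronous calls as (caller, callee) tuples."""
    per_pair = Counter(spans)
    return [pair for pair, n in per_pair.items() if n >= CHATTY_THRESHOLD]

request_spans = [
    ("gateway", "pricing"),
    ("pricing", "promotions"),
    ("pricing", "promotions"),
    ("pricing", "promotions"),
]
print(flag_chatty_pairs(request_spans))  # [('pricing', 'promotions')]
```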

Check transactional boundaries
Write down where the system actually needs atomicity. If you're relying on "do A, then B, then compensate if something breaks" for core money or state changes, flag it.
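
To be clear about the shape we're flagging, here it is in miniature. All three callables are hypothetical stand-ins for your own calls; the point is that compensation is not atomicity, and retries can re-run half of it.

```python
# Minimal sketch of the "do A, then B, compensate on failure" shape.
# All three callables are hypothetical stand-ins for your own calls.

def place_order(reserve_inventory, charge_payment, release_inventory):
    reservation = reserve_inventory()
    try:
        charge_payment()
    except Exception:
        # Compensation, not atomicity: if this call fails too, or a
        # retry has already re-run the reservation, state diverges.
        release_inventory(reservation)
        raise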

Score change-rate alignment
For each service, note how often it changes relative to the others. If two services are changing together most weeks, score them as strongly coupled.
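
You can approximate this straight from version control. A rough sketch, assuming each service lives in its own git repo at the (hypothetical) paths shown:

```python
# Minimal sketch: count how often two services change in the same ISO
# week, using commit dates. Assumes each service lives in its own git
# repo at the (hypothetical) paths below.
import subprocess
from datetime import date

def commit_weeks(repo_path: str) -> set:
    """Return the set of (year, week) buckets in which this repo changed."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%cs"],  # %cs = YYYY-MM-DD
        capture_output=True, text=True, check=True,
    ).stdout.split()
    weeks = set()
    for day in out:
        y, m, d = map(int, day.split("-"))
        iso = date(y, m, d).isocalendar()
        weeks.add((iso[0], iso[1]))  # ISO year and week number
    return weeks

pricing = commit_weeks("repos/pricing")        # hypothetical path
promotions = commit_weeks("repos/promotions")  # hypothetical path
shared = pricing & promotions
print(f"changed in the same week {len(shared)} of "
      f"{len(pricing | promotions)} active weeks")
```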

Score ownership coupling
Who owns each service? If delivering one feature requires 2 or more teams to coordinate every single time, score that boundary as weak.

Identify merge candidates
Pick the 1-2 highest pain boundaries. These are usually the chatty pairs that share atomicity concerns and change at the same rate.

Decide on the corrective move
Choose one of these:

  • Merge services into one deployable unit

  • Keep services separate, but change the interaction to async events

  • Introduce a facade that owns the behavior and hides the internal calls (sketched below)
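
If you go the facade route, the shape is roughly this. `CheckoutFacade` and the three client objects are hypothetical names, not a prescribed design; the point is that callers see one call, and the fan-out becomes an internal detail that someone owns.

```python
# Minimal sketch of the facade option. CheckoutFacade and the three
# client objects are hypothetical names, not a prescribed design.

class CheckoutFacade:
    def __init__(self, pricing_client, promotions_client, inventory_client):
        self._pricing = pricing_client
        self._promotions = promotions_client
        self._inventory = inventory_client

    def place_order(self, cart):
        # Callers make one call; the fan-out below is an internal detail
        # the facade team owns, measures, and can later merge or make async.
        priced_cart = self._pricing.price(cart)
        final_cart = self._promotions.apply(priced_cart)
        self._inventory.reserve(final_cart.items)
        return final_cart
```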

State your expected outcome in one line, like "reduce checkout hop count from 7 to 3."

Define success signals
Pick 3 metrics you'll watch for two weeks. Things like: deployment steps reduced, p95 latency improved, incident rate down, fewer coordinated releases, fewer cross-service rollbacks.

Add a guardrail
Write one rule that prevents you from recreating the distributed monolith. For example: "No new synchronous calls in checkout without a hop budget review."
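
Guardrails stick better when they're executable. Here's a minimal sketch of a hop budget check that could run in CI; the declared chain is an example, and in practice you'd generate it from your gateway config or trace data.

```python
# Minimal sketch: a guardrail test that fails CI when a flow exceeds
# its hop budget. The declared chain is an example; in practice you'd
# generate it from your gateway config or trace data.

HOP_BUDGET = {"checkout": 3}

DECLARED_CHAINS = {
    "checkout": [
        ("gateway", "checkout"),
        ("checkout", "inventory"),
        ("checkout", "payment"),
    ],
}

def test_hop_budgets():
    for flow, chain in DECLARED_CHAINS.items():
        assert len(chain) <= HOP_BUDGET[flow], (
            f"{flow} has {len(chain)} synchronous hops, budget is "
            f"{HOP_BUDGET[flow]}; run a hop budget review first"
        )

test_hop_budgets()
```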

🔍 Example: Checkout split by nouns

Scope:
The behavior we're looking at is "Place order."

Context/architecture:
We've got these services: Customer, Cart, Pricing, Promotions, Inventory, Order, and Payment, plus a Catalog service that Promotions reads from. The UI calls an API gateway that fans out to all of them.

Step-by-step using the sheet:

Call chain: gateway → Cart → Pricing → Promotions → Inventory → Order → Payment → Order. That's 7 hops, with a loop back at the end.

Chatty coupling: Pricing calls Promotions twice. Promotions calls Catalog. Order calls both Inventory and Payment, then re-reads its own state. Multiple service pairs are exchanging several calls per request.

Transaction boundaries: "Place order" really needs atomicity around inventory reservation, the payment intent, and order state. We have compensations in place, but they're brittle when retries get involved.

Change-rate alignment: Pricing, Promotions, and Cart all change together every sprint because product experiments touch all three at once.

Ownership coupling: The Checkout team owns Cart. Pricing belongs to a different team. Promotions is owned by a third team. Releases require constant coordination between all three.

Merge candidates: Cart + Pricing + Promotions form one cohesive unit for the checkout behavior. Order + Inventory reservation is another natural grouping.

Corrective move: Merge Cart, Pricing, and Promotions into a single "Checkout" service that owns the whole behavior and produces an OrderDraft. Keep Inventory and Payment as external dependencies with clear, stable contracts.
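
For a sense of what that merge looks like in code, here's a minimal sketch. Every name is illustrative; the point is that cart, pricing, and promotions become in-process modules inside one service, while Inventory and Payment stay behind external clients.

```python
# Minimal sketch of the merged boundary. All names are illustrative:
# cart, pricing, and promotions logic runs in-process inside one
# Checkout service, which emits an OrderDraft; Inventory and Payment
# stay behind external clients with stable contracts.
from dataclasses import dataclass

@dataclass
class OrderDraft:
    items: list
    total_cents: int

class CheckoutService:
    def __init__(self, inventory_client, payment_client):
        self._inventory = inventory_client  # external dependency
        self._payment = payment_client      # external dependency

    def build_draft(self, cart_items: list, unit_price_cents: int) -> OrderDraft:
        # What used to be three network hops (Cart -> Pricing -> Promotions)
        # is now plain function calls that deploy together.
        subtotal = unit_price_cents * len(cart_items)
        total = self._apply_promotions(subtotal)
        return OrderDraft(items=cart_items, total_cents=total)

    @staticmethod
    def _apply_promotions(subtotal_cents: int) -> int:
        # Placeholder promotion rule, purely for illustration.
        return int(subtotal_cents * 0.9)
```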

Expected outcome: Reduce hop count from 7 to 3. Reduce coordinated releases from 3 teams down to 1 for checkout experiments.

Success signals: p95 checkout API latency, number of services touched per feature, rollback frequency, and incidents tagged with "cross-service coordination."

Guardrail: The hop budget for checkout is 3. Any new synchronous dependency needs a review.

What success looks like:
Checkout can deploy independently again. Most changes ship with one service and one pipeline. Failures isolate better because the behavior has a single, accountable owner.

Small confession:
Merging services feels like going backwards. It's not. It's paying down the debt from a bad split so you can split again later, this time without lying to yourself about the boundaries.

Do this / avoid this

Do:

  • Split by behavior and change rate, not by nouns

  • Budget synchronous hops per critical flow

  • Merge chatty, lockstep services back into one deployable unit

  • Use async events where you don't need immediate consistency (see the sketch after this list)

  • Align ownership to the boundary: one behavior, one accountable team
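
On the async point: the move is to replace a synchronous call with a published event wherever the caller doesn't need an immediate answer. A minimal sketch, with an in-memory bus standing in for whatever broker you already run and a hypothetical "loyalty" consumer:

```python
# Minimal sketch: swap a synchronous call for a published event. The
# in-memory bus is a stand-in for whatever broker you already run
# (Kafka, SNS, RabbitMQ); the loyalty consumer is hypothetical.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBus()

# A consumer that doesn't need to answer before checkout completes
# listens for the event instead of sitting on the synchronous path.
bus.subscribe("order.placed", lambda e: print(f"award points for {e['order_id']}"))

# Checkout publishes and moves on; no hop added to the critical path.
bus.publish("order.placed", {"order_id": "ord-123"})
```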

Avoid:

  • Using "Customer Service, Product Service" as your default template

  • Distributed transactions for core flows without a clear compensating strategy

  • Adding services just to reduce code size while increasing coordination costs

  • Treating network calls as if they were in-process function calls

  • Shipping features that require 3 services to change in the same week

🧪 Mini challenge

Goal: Identify one merge candidate in 45 minutes.

  1. Pick one painful flow, the one that makes releases feel slow

  2. Write out the synchronous hop chain and count the hops

  3. Flag any service pair with 3 or more calls per request

  4. Ask yourself one question: do these services change together most weeks?

  5. Pick one boundary to fix and choose your move (merge, async, or facade)

  6. Write down one success metric and one guardrail rule

Hit reply and tell me the flow and the hop count. One sentence is enough.

🎯 Action step for this week

  • Choose the top two flows that cause coordinated releases

  • Run a Service Cohesion Analysis Sheet for each one with the owning teams in the room

  • Decide on one corrective move and put it on the roadmap with an owner and a date

  • Add a hop budget review to your design review process for critical paths

  • Track one simple metric: "services touched per feature" for your top product area (rough counting sketch below)
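
For that last metric, here's a rough counting sketch. It assumes a monorepo where each top-level directory under services/ is one service and commit subjects start with a ticket ID; both assumptions are placeholders you'd adjust for your layout.

```python
# Rough sketch: approximate "services touched per feature" from git
# history. Assumes a monorepo where each top-level directory under
# services/ is one service and commit subjects start with a ticket ID
# like "SHOP-123: ...". Both assumptions are placeholders.
import re
import subprocess
from collections import defaultdict

log = subprocess.run(
    ["git", "log", "--format=__%s", "--name-only"],
    capture_output=True, text=True, check=True,
).stdout

ticket = None
services_per_ticket = defaultdict(set)
for line in log.splitlines():
    if line.startswith("__"):                  # commit subject line
        m = re.match(r"__([A-Z]+-\d+)", line)
        ticket = m.group(1) if m else None
    elif ticket and line.startswith("services/"):
        services_per_ticket[ticket].add(line.split("/")[1])

for t, svcs in sorted(services_per_ticket.items()):
    print(f"{t}: {len(svcs)} services touched")
```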

By the end of this week, aim to have one merge candidate approved and scheduled, with success signals clearly defined.

👋 Wrapping up

Noun-based splits often just create chatty coupling over the network.

High hop counts usually mean you moved the monolith around instead of actually removing it.

Merge first when services change and fail together. Then you can split again with a better rule.

Measure success by fewer coordinated releases and stable p95 latency, not by how many services you have.

Hit reply and tell me your biggest challenge with microservices.

Happy New Year! 🎉

As we close out 2025, I want to say thank you for being here and for letting me share these lessons with you. I hope you're spending these last days of the year with the people who matter most, your family, your friends, the ones who remind you there's more to life than deployment pipelines and service meshes.

Here's to 2026. A year for building systems that scale, yes, but also for building businesses that grow, teams that thrive, and making the kind of impact that actually matters. Whether you're preparing to scale up, level up, or just ship something you're proud of, I'm excited to be on this journey with you.

Take care of yourself and the people you love. I'll see you in the new year with more practical lessons and hopefully a few wins to celebrate together.

Cheers to what's ahead,
Bogdan Colța
Tech Architect Insights
