Where to Split: Finding Service Boundaries in a Monolith
The hardest part of monolith decomposition isn't extracting the service. It's figuring out where to cut.
Every engineering leader I've seen tackle this problem starts the same way: someone draws boxes on a whiteboard, labels them by business domain, and declares the service boundaries. Six months later, the team is still untangling shared utilities, circular dependencies, and database tables that three "separate" domains quietly depend on.
The whiteboard lied. The code tells a different story.
Why Boundaries Are Hard to See
Monoliths are honest systems. They accumulate real coupling over years because in-process calls are free. No serialization cost, no network latency, no API contracts. So developers take shortcuts that make perfect sense at the time:
Shared utility classes. That DateUtils or StringHelper started as a convenience. Now 40 packages import it, and half of those imports carry transitive dependencies on domain logic that has nothing to do with dates or strings.
Database coupling. Two "separate" domains share a table. Not because they should, but because someone needed one column from another domain's table three years ago and a JOIN was easier than an API. Now the schema is a shared contract neither team controls.
Event bus entanglement. Domain A publishes an event. Domains B, C, and D consume it. But D only cares about one field that A added as an afterthought. Remove that field and D breaks silently, because there's no compile-time contract on event payloads.
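Here's a minimal sketch of that failure mode. The domain names and payload fields are illustrative, not from any real system; the point is that with a dict payload and a defensive `.get()`, removing a field doesn't even raise an exception:

```python
def publish_order_created(order_id, discount_code=None):
    # Domain A builds the payload. "discount_code" was added as an afterthought
    # and A considers it optional.
    event = {"type": "order.created", "order_id": order_id}
    if discount_code is not None:
        event["discount_code"] = discount_code
    return event

def domain_d_handler(event):
    # Domain D quietly depends on that afterthought field. The .get() default
    # means the failure isn't a crash -- D just starts recording garbage.
    return {"promo_used": event.get("discount_code", "NONE")}

print(domain_d_handler(publish_order_created("o-1", "SUMMER10")))  # promo recorded
print(domain_d_handler(publish_order_created("o-2")))              # silently wrong
```

Nothing in this code ties A's idea of the payload to D's, which is exactly why the breakage only shows up in production data, not at build time.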
Configuration coupling. Shared config files, shared environment variables, shared feature flags. Change one, break three services you didn't know existed.
These aren't design failures. They're the natural consequence of building fast in a monolith. The coupling is real, and you can't wish it away with a whiteboard.
What Actually Works
After watching dozens of modernization efforts (and living through a few), here's what separates the teams that succeed from the ones that stall.
1. Start from the code, not the org chart
The most common mistake: drawing service boundaries along team lines. "The payments team owns payments, so payments is a service."
Teams are organizational constructs. Service boundaries are technical constructs. They overlap sometimes. Often they don't. The payments team might own code that's deeply coupled to the order domain, the user domain, and the notification domain. Extracting "payments" as defined by the team would rip out half the monolith.
Instead, look at the code. Which packages call each other? Which classes share state? Where are the natural seams where coupling is minimal? The code graph tells you where the boundaries actually are, not where you wish they were. Sam Newman puts it well: model bounded contexts as services first, and only then consider breaking them into smaller units around aggregate boundaries. Start from the domain in the code, not the boxes on the org chart.
2. Map dependencies before drawing lines
Before you decide what to extract, you need to know what's connected to what. This means:
Import analysis. Which packages depend on which? Build a directed graph. Look for clusters with dense internal connections and sparse external ones. Those clusters are your candidate domains.
Database access patterns. Which code paths touch which tables? If two packages both read and write to the same table, they're coupled at the data layer regardless of how clean their code interfaces look.
Call graph analysis. Trace the actual execution paths. A function in the "order" domain that calls into "inventory," then "pricing," then "shipping" isn't a clean extraction candidate until those dependencies are resolved.
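For the import-analysis step, the core of the tooling is small. Here's a hedged sketch using Python's standard `ast` module; the inlined `SOURCES` dict stands in for a walk over real source files, and the module names are invented for illustration:

```python
import ast
from collections import defaultdict

# Stand-in for a real codebase walk (e.g. reading every *.py file in the repo).
SOURCES = {
    "orders":        "import inventory\nimport pricing\n",
    "inventory":     "import pricing\n",
    "pricing":       "",
    "notifications": "import orders\n",
}

def import_graph(sources):
    """Build a directed module-dependency graph from import statements."""
    graph = defaultdict(set)
    for module, src in sources.items():
        graph[module]  # ensure every module appears, even with zero imports
        for node in ast.walk(ast.parse(src)):
            if isinstance(node, ast.Import):
                graph[module].update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                graph[module].add(node.module)
    return graph

for mod, deps in sorted(import_graph(SOURCES).items()):
    print(f"{mod} -> {sorted(deps)}")
```

Even this toy graph shows the shape you're looking for: clusters with dense internal edges and sparse external ones are your candidate domains.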
Most teams skip this step because it's tedious. They eyeball the package structure and assume it reflects the real boundaries. It almost never does. Newman also suggests looking at volatility as an overlay: parts of the codebase that change more frequently than others are strong candidates for extraction, because isolating them reduces coordination overhead across the whole team.
3. Optimize for minimal cross-cutting
The best service boundary is the one that cuts the fewest connections. Not the one that matches a business domain on paper.
Think of it as graph partitioning. You have a graph of code dependencies. You want to partition it into subgraphs where:
- Internal cohesion is high (lots of connections within each subgraph)
- External coupling is low (few connections between subgraphs)
- Each subgraph has clear, narrow interfaces to the rest
This is a measurable property. You can count the number of cross-boundary calls, shared data structures, and transitive dependencies. The partition with the lowest cross-boundary coupling is your best candidate.
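Measuring it can be as simple as counting edges. A minimal sketch, assuming you already have a dependency graph (module to set of modules it depends on) and a candidate partition (module to service); all names here are illustrative:

```python
def coupling_score(graph, partition):
    """Count (internal, cross-boundary) edges for a candidate partition."""
    internal = external = 0
    for src, deps in graph.items():
        for dst in deps:
            if partition[src] == partition[dst]:
                internal += 1
            else:
                external += 1
    return internal, external

graph = {
    "orders":        {"pricing", "inventory"},
    "inventory":     {"pricing"},
    "pricing":       set(),
    "notifications": {"orders"},
}
# Candidate A: split notifications out on its own.
part_a = {"orders": "core", "inventory": "core",
          "pricing": "core", "notifications": "notify"}
# Candidate B: split pricing out instead.
part_b = {"orders": "core", "inventory": "core",
          "pricing": "pricing", "notifications": "core"}

print(coupling_score(graph, part_a))  # (3, 1): one edge crosses the boundary
print(coupling_score(graph, part_b))  # (2, 2): pricing is a messier cut
```

In this toy graph, splitting out notifications severs one edge while splitting out pricing severs two, so notifications is the cleaner cut. Real codebases want weighted edges (call frequency, shared tables), but the principle is the same.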
4. Extract the boring domain first
Everyone wants to start with the core domain. The order processing engine. The payment pipeline. The recommendation system.
Don't.
Start with the domain that's (a) relatively isolated, (b) non-critical to revenue, and (c) small enough to extract in weeks, not months. The "notification" domain. The "audit log." The "user preferences" service.
Why? Because the first extraction teaches you everything about your deployment pipeline, your testing gaps, your data migration strategy, and your team's readiness. You want to learn those lessons on a domain where mistakes don't page you at 3 AM. Martin Fowler calls this the Strangler Fig pattern: gradually replace pieces of the old system rather than attempting a big bang rewrite.
5. Expect 3x more complexity than you see
This is the non-obvious truth about modernization: what looks like a clean boundary from the outside is tangled underneath. Shared utilities, transitive dependencies, implicit coupling through database tables, event buses, and configuration.
Every modernization attempt reveals more complexity than expected. This is why modernization projects take years instead of months, and why 79% of them fail. Not because the teams are bad. Because the problem is genuinely hard to see until you're inside it. Even among teams that complete a migration, 90% still batch deploy like a monolith, negating the main architectural benefit they were chasing.
It's worth noting that decomposition isn't always the right call. Sam Newman, author of Building Microservices, has said that over a three-year stretch he told roughly half of his clients that microservices were not for them. The point isn't to extract everything. It's to extract the right things, at the right time, for the right reasons.
The Case for Automation
Here's what I find interesting about this problem: most of the work is mechanical.
Tracing dependencies? Mechanical. Counting cross-boundary connections? Mechanical. Identifying shared data access patterns? Mechanical. Ranking candidate domains by isolation score? Mechanical.
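To make "mechanical" concrete, here's one possible isolation-score ranking over a dependency graph. This is a sketch under simplifying assumptions (every edge weighted equally, no data-layer coupling), and the module names are invented:

```python
def isolation_ranking(graph):
    """Rank modules by total coupling (inbound + outbound edges), lowest first.
    Fewer edges means fewer things to sever, i.e. an easier extraction."""
    scores = {}
    for mod in graph:
        inbound = sum(mod in deps for deps in graph.values())
        outbound = len(graph[mod])
        scores[mod] = inbound + outbound
    return sorted(scores.items(), key=lambda kv: kv[1])

graph = {
    "orders":    {"pricing", "inventory", "shipping"},
    "inventory": {"pricing"},
    "pricing":   set(),
    "shipping":  set(),
    "audit_log": set(),
}
print(isolation_ranking(graph))  # audit_log ranks first: zero coupling
```

Note what the ranking does and doesn't do: it surfaces `audit_log` as the easiest cut, but it can't tell you whether extracting it is worth doing this quarter. That's the human call.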
The hard judgment call is deciding which domain to extract and when. That's a human decision that depends on business priorities, team capacity, and risk tolerance.
But the analysis that informs that decision? That's exactly the kind of work that AI is good at. Feed it the full codebase, let it trace every dependency, map every database access pattern, and score every candidate boundary. An engineer reviews the suggestions and makes the call.
This is what we built at CodeSplit AI. Not a dashboard that shows you your monolith is complex (you already know that). An engine that analyzes your codebase, identifies real boundaries based on actual coupling, and when you pick one, extracts it: new repository, refactored monolith, wired integration, tested and verified.
The analysis is the first step. The extraction is where the real value is.
Getting Started
If you're staring at a monolith and wondering where to begin:
- Don't whiteboard first. Analyze the code first. Let the dependency graph surprise you.
- Quantify coupling. Count cross-boundary connections for every candidate partition. Pick the cleanest cut.
- Start small. Extract something boring and learn from it.
- Automate the mechanical parts. The analysis, the dependency tracing, the extraction itself. Save your engineers for the decisions that actually need human judgment.
We're running early access at codesplit.ai. Connect your repo, see what the AI finds. The boundaries might not be where you think they are.