How to Split a Monolith GraphQL Schema into Subgraphs

This guide walks through decomposing a single monolithic GraphQL schema into independently deployable Apollo Federation v2 subgraphs without breaking clients or taking downtime. It is the tactical companion to defining subgraph boundaries for microservices; read that first for how to decide where the lines go, then use this for the extraction sequence. Both sit under GraphQL Federation Architecture & Design.

When to Use This Pattern

Your GraphQL monolith has become a deployment bottleneck: multiple teams contend on one codebase and one release.
Distinct domains have diverged in scaling needs, data stores, or ownership, and you want each to deploy on its own cadence.
You can run the legacy monolith and new subgraphs side by side long enough to shift traffic incrementally — a big-bang cutover is not an option at production scale.

Prerequisites

Apollo Federation v2 toolchain: @apollo/server, @apollo/subgraph, graphql-tag, and the Rover CLI
federation_version: =2.9.0 (or later) pinned for composition
Apollo Router (the Rust router) available to run alongside the monolith
Production query traces to identify the highest-value, lowest-coupling domain to extract first
A clear ownership map: every type assigned to exactly one target subgraph

Implementation Walkthrough

The extraction follows a fixed sequence: audit, draw the boundary, extract with directives, compose, then shift traffic. The diagram shows the five phases and the rollback path that keeps the whole thing safe.

Start by exporting the monolith’s full SDL and tracing resolver execution so you extract from evidence, not guesswork. rover graph introspect <endpoint> --output monolith.graphql gives you the type inventory; tracing tells you which resolvers hit which data sources and where the N+1 hot spots are. Types with single, clear ownership are your first extraction targets; high-coupling nodes such as User.orders and Order.user will need federation directives and careful handling.

The audit is where most extractions are won or lost, so do not rush it. The dependency baseline you build now is what tells you which boundaries are cheap and which are about to bite. For each resolver, record the data source it reads, the other types it traverses, and how often it appears in production traces. Two patterns deserve a flag: utility types referenced from everywhere (a Money scalar, a PageInfo connection type) that must be shareable across every subgraph, and circular references such as User.orders paired with Order.user, which are fine in federation but require both subgraphs to stub the other’s entity. Knowing these before you cut means you write the right @key stubs the first time instead of discovering them as composition errors.
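
The two flags above can be computed mechanically once the audit has produced a map from each type to the types its fields return. The following is a minimal sketch under stated assumptions: the TypeRefs shape, the flagCoupling name, and the minReferrers threshold are all hypothetical, and the input is a hand-built map rather than anything Rover emits.

```typescript
// Hypothetical audit output: each type name mapped to the types its fields return.
type TypeRefs = Record<string, string[]>;

interface AuditFlags {
  sharedTypes: string[];           // referenced by at least `minReferrers` other types
  mutualPairs: [string, string][]; // A references B and B references A
}

function flagCoupling(refs: TypeRefs, minReferrers: number): AuditFlags {
  const names = Object.keys(refs).sort();

  // Count how many distinct other types reference each target.
  const referrers: Record<string, number> = {};
  for (const from of names) {
    const seen: string[] = [];
    for (const to of refs[from]) {
      if (to === from || seen.indexOf(to) !== -1) continue;
      seen.push(to);
      referrers[to] = (referrers[to] || 0) + 1;
    }
  }
  const sharedTypes = Object.keys(referrers)
    .filter((name) => referrers[name] >= minReferrers)
    .sort();

  // Report each mutual pair once, with the names in alphabetical order.
  const mutualPairs: [string, string][] = [];
  for (const a of names) {
    for (const b of refs[a]) {
      const back = refs[b] || [];
      if (a < b && back.indexOf(a) !== -1) mutualPairs.push([a, b]);
    }
  }
  return { sharedTypes, mutualPairs };
}

// Illustrative data only: Money and PageInfo are shared, User and Order are circular.
const auditFlags = flagCoupling(
  {
    User: ["Order", "Money"],
    Order: ["User", "Money", "PageInfo"],
    Product: ["Money", "PageInfo"],
  },
  2,
);
```

Types in sharedTypes are the ones that must compose cleanly in every subgraph. Each entry in mutualPairs is a boundary where both sides need a stub of the other's entity.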

Pick one domain to extract first — typically User or Catalog, because they are referenced widely but depend on little. The goal is to leave a @key stub behind so the referencing types still type-check, while the canonical definition and its __resolveReference move into the new subgraph. The example below splits a User/Order graph: the user subgraph becomes the canonical owner of User, and the order subgraph references User by stub.

# === user subgraph: canonical owner of User ===
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.9", import: ["@key"])

type User @key(fields: "id") {   # id is the cross-subgraph identifier
  id: ID!
  name: String!
}

type Query {
  user(id: ID!): User
}

# === order subgraph: references User by stub, owns Order ===
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.9", import: ["@key"])

type Order @key(fields: "id") {
  id: ID!
  total: Float!
  user: User!                    # resolved by following the User key to the user subgraph
}

# Stub only: the order subgraph references User but never resolves its fields.
# resolvable: false is the Federation 2 way to say so; it replaces the
# Federation 1 idiom of marking the key field @external.
type User @key(fields: "id", resolvable: false) {
  id: ID!
}

type Query {
  order(id: ID!): Order
}

// === user subgraph resolver: hydrate User from a key ===
import { buildSubgraphSchema } from '@apollo/subgraph';
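import { gql } from 'graphql-tag';

// Hypothetical in-memory lookup standing in for the real user store.
const users = new Map([['usr_456', { id: 'usr_456', name: 'Jane Doe' }]]);
const fetchUserById = async (id: string) => users.get(id) ?? null;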

const typeDefs = gql`
  extend schema @link(url: "https://specs.apollo.dev/federation/v2.9", import: ["@key"])
  type User @key(fields: "id") { id: ID! name: String! }
  type Query { user(id: ID!): User }
`;

const resolvers = {
  Query: {
    user: (_: unknown, { id }: { id: string }) => fetchUserById(id),
  },
  User: {
    // The router calls this when the order subgraph references a User by { id }.
    __resolveReference(ref: { id: string }) {
      return fetchUserById(ref.id);   // must return the full entity: the @key field id plus name
    },
  },
};

export const schema = buildSubgraphSchema({ typeDefs, resolvers });

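The critical rules: use @key to name the cross-subgraph identifier; mark reference-only stubs so the router never asks that subgraph to resolve the entity (in Federation 2 that is @key(fields: "id", resolvable: false), while @external on the key field is the older Federation 1 idiom and is otherwise reserved for fields named in @requires or @provides); reach for @shareable only when two subgraphs legitimately resolve identical field logic; and implement __resolveReference in every entity’s owning subgraph. Without it, @apollo/subgraph falls back to returning the bare key representation, so every field beyond the key comes back null (or fails a non-null check) on cross-subgraph fetches of that type. With the SDL and resolvers in place, declare the topology so the supergraph can be composed: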

# supergraph.yaml
federation_version: =2.9.0
subgraphs:
  user-service:
    routing_url: http://user-svc:4001/graphql
    schema:
      file: ./user.graphql
  order-service:
    routing_url: http://order-svc:4002/graphql
    schema:
      file: ./order.graphql

Deploy the router alongside the legacy monolith, gate it behind a readiness probe that blocks traffic until composition succeeds, and use weighted routing at your ingress (for example 10% federated, 90% monolith) so you can roll back instantly.
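
The weighted split itself is normally configured in the ingress, but the underlying logic is worth seeing once. The sketch below is one way to do it, not a prescribed implementation: it assumes each request carries some stable key (a hypothetical client or session id) and hashes it into 100 buckets with FNV-1a, a hash chosen here only for brevity.

```typescript
// Deterministic weighted split: the same key always lands on the same backend,
// so a client does not flip between the monolith and the router mid-session.
type Backend = "federated" | "monolith";

// 32-bit FNV-1a hash over the key's UTF-16 code units.
function fnv1a(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value unsigned
  }
  return hash;
}

// federatedPercent is the share of keys (0 to 100) sent to the federated path.
function pickBackend(key: string, federatedPercent: number): Backend {
  const bucket = fnv1a(key) % 100; // 0..99
  return bucket < federatedPercent ? "federated" : "monolith";
}
```

Because a key's bucket never changes, raising the percentage only ever moves keys from the monolith to the federated path, and setting it to 0 is the instant rollback.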

One subtlety worth calling out: during the transition the monolith and the new subgraph may both define the entity you are extracting. That is expected and safe as long as exactly one of them is the routed owner at any moment. The cleanest way to manage the cutover is to publish the new subgraph, point the router’s routing_url for that domain at the new service, and leave the monolith serving only the paths you have not yet extracted. If you need to move a single field rather than a whole type (for example, shifting User.email from the monolith to the new accounts subgraph without breaking clients mid-flight), use @override(from: "monolith") on the field in the new subgraph. This assumes the legacy schema is itself composed into the supergraph as a subgraph named monolith, because from refers to a subgraph by name. Composition then routes that field to the new owner while every other User field continues to resolve from wherever it lives, letting you migrate field by field instead of type by type:

# accounts subgraph: take over User.email from the monolith subgraph
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.9",
        import: ["@key", "@override"])

type User @key(fields: "id") {
  id: ID!
  email: String! @override(from: "monolith")   # router now resolves email here
}

Verification Steps

First, confirm the subgraphs compose into a valid supergraph and that the change introduces no breaking changes for existing clients:

rover supergraph compose --config supergraph.yaml --output supergraph.graphql
rover subgraph check my-graph@prod --name order-service --schema ./order.graphql

Then verify entity stitching end to end by running a query that crosses the new boundary:

query {
  order(id: "ord_123") {
    id
    total
    user {        # resolved via the order -> user boundary
      id
      name
    }
  }
}

A correct split returns the user nested inside the order, proving the router followed the @key and the user subgraph’s __resolveReference hydrated the entity:

{
  "data": {
    "order": {
      "id": "ord_123",
      "total": 149.99,
      "user": { "id": "usr_456", "name": "Jane Doe" }
    }
  }
}
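
To make this check repeatable, the comparison against the expected shape can be scripted. The following is a minimal sketch, assuming the response has already been fetched and parsed. The StitchedOrderResponse type and findStitchingProblems function are hypothetical names, and the problem messages are illustrative.

```typescript
// Shape of the response expected from the boundary-crossing query.
interface StitchedOrderResponse {
  data?: {
    order?: {
      id?: string;
      total?: number;
      user?: { id?: string; name?: string } | null;
    } | null;
  } | null;
  errors?: { message: string }[];
}

// Returns a list of human-readable problems; an empty list means the split looks healthy.
function findStitchingProblems(body: StitchedOrderResponse): string[] {
  const problems: string[] = [];
  for (const err of body.errors || []) {
    problems.push("GraphQL error: " + err.message);
  }
  const order = body.data ? body.data.order : undefined;
  if (!order) {
    problems.push("order is missing");
    return problems;
  }
  if (!order.user) {
    problems.push("order.user is null: the entity was not hydrated");
    return problems;
  }
  if (!order.user.id) problems.push("user.id is missing");
  if (!order.user.name) {
    problems.push("user.name is missing: check __resolveReference in the user subgraph");
  }
  return problems;
}
```

Feed it the parsed JSON body of a POST to the router and fail the deploy step if the returned list is non-empty.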

Once the query resolves correctly, shift traffic incrementally: enable the federated path for a small slice, watch resolver latency and error rates (5xx, GRAPHQL_VALIDATION_FAILED), and ramp to 100% only after stability thresholds hold — for example P99 under 200 ms and error rate under 0.1%. Then decommission the legacy path and enforce rover subgraph check as a merge gate so the boundary cannot silently drift.
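
The ramp decision can be reduced to a small function over one observation window. This sketch uses the example thresholds from the paragraph above (P99 under 200 ms, error rate under 0.1%). The WindowStats shape, the step list, and the choice to drop straight back to 0% on an unhealthy window are assumptions made for illustration.

```typescript
interface WindowStats {
  latenciesMs: number[]; // one entry per request on the federated path
  totalRequests: number;
  failedRequests: number;
}

const P99_LIMIT_MS = 200;
const ERROR_RATE_LIMIT = 0.001; // 0.1%

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Returns the federated traffic percentage to apply for the next window.
function nextFederatedPercent(current: number, stats: WindowStats, steps: number[]): number {
  const p99 = percentile(stats.latenciesMs, 99);
  const errorRate =
    stats.totalRequests === 0 ? 1 : stats.failedRequests / stats.totalRequests;
  const healthy = p99 < P99_LIMIT_MS && errorRate < ERROR_RATE_LIMIT;
  if (!healthy) return 0; // assumption: any breach rolls all traffic back to the monolith

  // Advance to the next configured step, or hold if already at the top.
  const higher = steps.filter((s) => s > current).sort((a, b) => a - b);
  return higher.length > 0 ? higher[0] : current;
}
```

For example, with steps of 10, 25, 50, and 100, a healthy window at 10% advances to 25%, and any window that breaches either threshold returns 0.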

Common Mistakes & Gotchas

Forgetting __resolveReference in the owning subgraph. This is the most common extraction failure. Composition passes, but at runtime the owning subgraph has nothing to hydrate the entity from its key, so cross-subgraph fetches of that type come back with null fields (or non-null errors). Always implement the reference resolver in the owner and confirm it returns the @key fields along with the fields the subgraph owns.

Mismatched key fields between stub and owner. If the order subgraph stubs User @key(fields: "id") but the user subgraph’s __resolveReference returns an object without id, the join silently breaks. The key shape in the stub, the owner’s @key, and the resolver’s return must agree exactly.
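
A small contract test catches this before deploy. The sketch below is a simplification: parseFlatKey, keysAgree, and missingKeyFields are hypothetical helpers, and they only handle flat keys such as "id" or "id sku", not nested selections like "org { id }".

```typescript
// Split a flat @key fields string into a sorted list of field names.
function parseFlatKey(fields: string): string[] {
  return fields
    .trim()
    .split(/\s+/)
    .filter((f) => f.length > 0)
    .sort();
}

// True when the stub and the owner declare the same set of key fields.
function keysAgree(stubKey: string, ownerKey: string): boolean {
  return parseFlatKey(stubKey).join(" ") === parseFlatKey(ownerKey).join(" ");
}

// Key fields that are absent (undefined or null) on an entity returned by a reference resolver.
function missingKeyFields(
  entity: Record<string, unknown> | null,
  keyFields: string[],
): string[] {
  if (entity === null) return keyFields.slice();
  return keyFields.filter((f) => entity[f] === undefined || entity[f] === null);
}
```

In a test suite, call the owner's __resolveReference with a sample key, pass its result to missingKeyFields, and assert that the list is empty.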

Sharing enums or input types without matching values. An enum used as both an input and an output type must have an identical value set in every subgraph that defines it; enums do not take @shareable, they must simply match. (Federation 2 is more lenient when an enum appears only in outputs or only in inputs, merging the definitions by union or intersection respectively.) A divergent value list otherwise fails composition with an ENUM_VALUE_MISMATCH error. Enforce parity with a shared package or a contract test.
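
The contract test mentioned above can be as simple as comparing value lists per subgraph. This is a sketch, assuming the enum values have already been pulled out of each subgraph's SDL by some other step. The EnumDefs shape, the findEnumDrift name, and the Status values are hypothetical.

```typescript
// Subgraph name mapped to the values it declares for one enum.
type EnumDefs = Record<string, string[]>;

// For each subgraph, list the values that other subgraphs declare but it lacks.
// An empty result means every subgraph declares the same value set.
function findEnumDrift(defs: EnumDefs): Record<string, string[]> {
  const union: string[] = [];
  for (const subgraph of Object.keys(defs)) {
    for (const value of defs[subgraph]) {
      if (union.indexOf(value) === -1) union.push(value);
    }
  }

  const drift: Record<string, string[]> = {};
  for (const subgraph of Object.keys(defs)) {
    const missing = union.filter((v) => defs[subgraph].indexOf(v) === -1).sort();
    if (missing.length > 0) drift[subgraph] = missing;
  }
  return drift;
}

// Illustrative data: order-service has not picked up the SUSPENDED value.
const statusDrift = findEnumDrift({
  "user-service": ["ACTIVE", "SUSPENDED"],
  "order-service": ["ACTIVE"],
});
```

Running this in CI for each shared enum surfaces drift at merge time instead of at composition time.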

Frequently Asked Questions

Can I run the monolith and subgraphs simultaneously during migration?

Yes — that is the recommended path. Route extracted queries to the new subgraphs through the router while the monolith continues to serve unextracted paths. Use weighted or header-based routing to shift traffic incrementally without client disruption, and keep the monolith as an instant rollback target until each domain is fully validated.

Which domain should I extract first?

Extract the highest-value, lowest-coupling domain — usually one that is widely referenced but depends on little, such as User or Catalog. It validates your composition and routing pipeline on a forgiving target before you tackle tightly coupled domains. The reference-counting method for ranking candidates is in defining subgraph boundaries for microservices.
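
As a rough illustration of "widely referenced but depends on little", candidates can be scored by inbound references minus outbound dependencies. This is one plausible reading, not the method from the companion guide. The Candidate shape, the scoring rule, and the sample graph are assumptions.

```typescript
interface Candidate {
  type: string;
  inbound: number;  // how many other types reference this one
  outbound: number; // how many other audited types this one references
}

function rankExtractionCandidates(refs: Record<string, string[]>): Candidate[] {
  const names = Object.keys(refs);
  const candidates = names.map((type) => {
    const outbound = refs[type].filter(
      (t) => t !== type && names.indexOf(t) !== -1,
    ).length;
    const inbound = names.filter(
      (other) => other !== type && refs[other].indexOf(type) !== -1,
    ).length;
    return { type, inbound, outbound };
  });

  // Highest (inbound - outbound) first; ties broken alphabetically for stable output.
  return candidates.sort(
    (a, b) =>
      b.inbound - b.outbound - (a.inbound - a.outbound) ||
      a.type.localeCompare(b.type),
  );
}

// Illustrative graph: User and Product are referenced widely and depend on nothing.
const ranked = rankExtractionCandidates({
  User: [],
  Product: [],
  Order: ["User", "Product"],
  Review: ["User", "Product", "Order"],
});
```

The score is only a starting point; weigh it against production trace volume before committing to an order.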

What happens to existing resolvers during extraction?

They migrate to the subgraph that now owns the type. The router plans each operation, sends entity lookups to the owning subgraph (which answers them through __resolveReference), and delegates field resolution to that subgraph. Legacy resolvers stay active for unextracted paths until that domain is fully cut over and validated.

Related

Defining Subgraph Boundaries for Microservices — parent guide
GraphQL Federation Architecture & Design — section overview
Designing Cross-Service Type References
Resolving Schema Conflicts in Apollo Federation
Implementing Entity Resolvers with @key Directives — cross-section: the resolver side of the boundary
