Why shard

The problem

A single database, however big, has a ceiling on storage, writes, and memory.

The idea

Split (shard) the data across many databases, each holding a slice, so the whole set scales horizontally.

How it works

A single database hits a ceiling on writes (one primary accepts them all), on working-set memory, and on disk. Sharding splits the data across many independent databases by a shard key; each shard owns one slice and can be replicated on its own, so writes and storage scale horizontally. The usual trigger is write throughput or a dataset that no longer fits in RAM, not read load, because replicas and caches handle reads first. A routing layer maps each key to its shard, so callers never need to know where the data lives.
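A minimal sketch of such a routing layer, assuming hash-based placement over a fixed list of shards. The shard names, `shard_for`, and `Router` are hypothetical, and plain dicts stand in for real databases.

```python
import hashlib

SHARDS = ["db-0", "db-1", "db-2", "db-3"]  # hypothetical shard names


def shard_for(key: str, shards: list[str] = SHARDS) -> str:
    """Map a shard key to a shard using a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and would route the same key differently after
    a restart.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


class Router:
    """Callers use put/get and never see which shard holds a key."""

    def __init__(self, shards: list[str]):
        self._names = list(shards)
        # Assumption: one in-memory dict per shard stands in for a database.
        self._stores = {name: {} for name in self._names}

    def put(self, key: str, value) -> None:
        self._stores[shard_for(key, self._names)][key] = value

    def get(self, key: str):
        return self._stores[shard_for(key, self._names)].get(key)


router = Router(SHARDS)
router.put("user:42", {"name": "Ada"})
print(router.get("user:42"))
```

Modulo placement is the simplest option, but changing the number of shards remaps most keys. Schemes such as consistent hashing or a fixed set of logical shards mapped onto physical nodes are common ways to limit that movement.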

The tradeoff

You gain horizontal scale well beyond what a single node can offer, but you lose single-node guarantees: cross-shard joins become scatter-gather (query every shard, then merge), multi-shard writes need distributed transactions or sagas, and operational work (backups, schema changes, rebalancing) multiplies with the shard count. Shard as late as you can but before you must: resharding a live, overloaded system is one of the hardest migrations there is.
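To make the scatter-gather cost concrete, here is a small sketch of a "newest N rows" query over several shards. It assumes each shard is just a list of rows with a `ts` field; `query_shard` and `scatter_gather` are illustrative names, not part of any real database client.

```python
import heapq
import itertools
from concurrent.futures import ThreadPoolExecutor


def query_shard(rows: list[dict], limit: int) -> list[dict]:
    """Each shard returns only its own top `limit` rows, newest first."""
    return sorted(rows, key=lambda r: r["ts"], reverse=True)[:limit]


def scatter_gather(shards: list[list[dict]], limit: int) -> list[dict]:
    """Fan the query out to every shard, then merge the sorted partials."""
    # Scatter: one request per shard, issued in parallel.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda rows: query_shard(rows, limit), shards))
    # Gather: the partials are already sorted newest first, so a k-way
    # merge yields the global order without re-sorting everything.
    merged = heapq.merge(*partials, key=lambda r: r["ts"], reverse=True)
    return list(itertools.islice(merged, limit))


shards = [
    [{"ts": 1}, {"ts": 5}],
    [{"ts": 3}, {"ts": 9}],
    [{"ts": 7}],
]
print(scatter_gather(shards, limit=3))
```

Every shard does work even when it contributes nothing to the final result, and the query is only as fast as the slowest shard. That is why queries that include the shard key, and so touch a single shard, are preferred.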

In the wild

Instagram, Discord, and many other very large services shard their primary data stores.

Interview deep dive

Flow

Confirm the bottleneck is write throughput or data size, not read load (add caches and replicas first).
Choose a shard key that spreads load and matches queries.
Add a routing layer mapping keys → shards.
Plan cross-shard queries and multi-shard writes separately.
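The shard-key step in the flow above can be checked against sample traffic before committing to a key. The sketch below compares two candidate keys by how evenly they spread writes; the workload (10,000 writes, 8 shards, one country producing 80% of traffic) is invented for illustration, and `shard_index`, `load_per_shard`, and `skew` are hypothetical helpers.

```python
import hashlib
from collections import Counter


def shard_index(key: str, n_shards: int) -> int:
    """Stable hash of the key, reduced to a shard number."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def load_per_shard(keys: list[str], n_shards: int) -> list[int]:
    """Count how many writes land on each shard for a given key choice."""
    counts = Counter(shard_index(k, n_shards) for k in keys)
    return [counts.get(i, 0) for i in range(n_shards)]


def skew(loads: list[int]) -> float:
    """Hottest shard's load divided by the mean; 1.0 is perfectly even."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


# Invented sample: 10,000 writes, one per user, mostly from one country.
by_user = [f"user-{i}" for i in range(10_000)]
by_country = ["US"] * 8_000 + ["DE"] * 1_500 + ["JP"] * 500

print("user_id skew:", round(skew(load_per_shard(by_user, 8)), 2))
print("country skew:", round(skew(load_per_shard(by_country, 8)), 2))
```

A high-cardinality key such as a user ID spreads close to evenly. The country key cannot do better than a skew of 6.4 in this sample, because a single value carries 8,000 of the 10,000 writes while the mean per shard is 1,250.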

Watch for

Resharding a live, overloaded system is brutal, so shard before the cliff.
Cross-shard joins degrade to scatter-gather + merge.
Multi-shard writes need a saga or distributed transaction.
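For the multi-shard write case, a saga runs one local step per shard and undoes the completed steps if a later one fails. The following is a minimal in-memory sketch, assuming a money transfer between accounts on two shards; `Shard`, `run_saga`, and `transfer` are hypothetical names. A production saga would also need a durable log of completed steps and idempotent compensations, which are omitted here.

```python
class Shard:
    """Stand-in for one database shard holding account balances."""

    def __init__(self, balances: dict[str, int]):
        self.balances = dict(balances)

    def debit(self, account: str, amount: int) -> None:
        if self.balances.get(account, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[account] -= amount

    def credit(self, account: str, amount: int) -> None:
        if account not in self.balances:
            raise KeyError(account)
        self.balances[account] += amount


def run_saga(steps) -> bool:
    """Run (action, compensation) pairs in order.

    If an action fails, run the compensations of the steps that already
    completed, in reverse order, and report failure.
    """
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        return False
    return True


def transfer(src: Shard, src_acct: str, dst: Shard, dst_acct: str, amount: int) -> bool:
    return run_saga([
        (lambda: src.debit(src_acct, amount), lambda: src.credit(src_acct, amount)),
        (lambda: dst.credit(dst_acct, amount), lambda: dst.debit(dst_acct, amount)),
    ])


a = Shard({"alice": 100})
b = Shard({"bob": 0})
print(transfer(a, "alice", b, "bob", 30), a.balances, b.balances)
print(transfer(a, "alice", b, "carol", 50), a.balances, b.balances)
```

Unlike a single-node transaction, the intermediate state is visible: between the debit and the credit, a reader can see the money gone from one shard and not yet present on the other.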

Interviewer trap

Name what hit the limit (write throughput or data size) and what you give up: joins and easy transactions.
