Why shard
A single database, however big, has a ceiling on storage, writes, and memory.
Open the interactive version → diagrams, practice & moreThe problem
A single database, however big, has a ceiling on storage, writes, and memory.
The idea
Split (shard) the data across many databases, each holding a slice, so the whole set scales horizontally.
How it works
A single DB ceilings on writes (one primary), working-set memory, and disk. Sharding splits data across many independent DBs by a shard key, each owning a slice and replicable on its own, so writes and storage scale horizontally. The trigger is usually write throughput or dataset-exceeds-RAM, not read load — replicas and cache solve reads first. A routing layer maps each key to its shard so callers stay oblivious.
The tradeoff
You gain near-limitless scale but lose single-node guarantees: cross-shard joins become scatter-gather (query all shards, merge), multi-shard writes need distributed transactions or sagas, and ops (backup, schema change, rebalancing) multiply. Shard as late as you can but before you must — resharding a live, overloaded system is one of the hardest migrations there is.
In the wild
Instagram, Discord, and most hyperscale apps shard their primary stores.
Interview deep dive
Flow
- Confirm the limit is writes/size, not read load (cache first).
- Choose a shard key that spreads load and matches queries.
- Add a routing layer mapping keys → shards.
- Plan cross-shard queries and multi-shard writes separately.
Watch for
- Resharding a live overloaded system is brutal — shard before the cliff.
- Cross-shard joins degrade to scatter-gather + merge.
- Multi-shard writes need a saga or distributed transaction.
Interviewer trap
Name what hit the limit (writes/size) and what you give up — joins and easy transactions.