Choosing a shard key
The problem
A bad shard key concentrates data or traffic on one shard. That shard becomes the bottleneck while the others sit underused, which defeats the purpose of sharding.
The idea
Pick a key that spreads data and load evenly and matches your query patterns.
How it works
Good keys have high cardinality (many distinct values) and evenly spread access, e.g. user_id. Avoid keys that create hotspots: "country" when one country dominates, or a timestamp under range partitioning, which funnels all new writes to the shard that owns the latest time range.
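To make the hotspot effect concrete, here is a minimal sketch that counts how a synthetic workload lands on four shards under three candidate keys. Everything in it is an assumption for illustration: the shard count, the 80% share for one country, the `hash_shard` and `time_range_shard` helpers, and the choice of SHA-256 as a stable hash.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def hash_shard(key, num_shards=NUM_SHARDS):
    """Hash partitioning: a stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def time_range_shard(timestamp, width=1_000_000, num_shards=NUM_SHARDS):
    """Range partitioning: each shard owns a contiguous block of timestamps."""
    return (timestamp // width) % num_shards


OTHER_COUNTRIES = ["DE", "FR", "JP"]

# Synthetic writes: 80% come from one country, and all are "recent".
writes = [
    {
        "user_id": i,
        "country": "US" if i % 5 else OTHER_COUNTRIES[i % 3],
        "timestamp": 3_000_000 + i,
    }
    for i in range(10_000)
]

by_user = Counter(hash_shard(w["user_id"]) for w in writes)
by_country = Counter(hash_shard(w["country"]) for w in writes)
by_time = Counter(time_range_shard(w["timestamp"]) for w in writes)

for name, counts in [("user_id", by_user), ("country", by_country), ("timestamp", by_time)]:
    print(name, [counts.get(shard, 0) for shard in range(NUM_SHARDS)])
```

In this setup the user_id row should come out roughly even, the country row puts at least 8,000 of the 10,000 writes on a single shard, and the timestamp row puts every write on one shard.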
The tradeoff
Optimizing for even distribution can hurt locality (related data spread across shards → cross-shard queries).
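The locality cost can be sketched by counting how many shards one query has to touch. The example below assumes a hypothetical orders table with 50 orders for a single user and compares sharding by order_id with sharding by user_id; the names, the shard count, and the `hash_shard` helper are illustrative only.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical cluster size


def hash_shard(key, num_shards=NUM_SHARDS):
    """Stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# One user's orders (hypothetical data).
orders = [{"order_id": f"order-{n}", "user_id": "user-42"} for n in range(50)]

# Shards that "fetch all orders for user-42" must visit under each key choice.
shards_if_keyed_by_order_id = {hash_shard(o["order_id"]) for o in orders}
shards_if_keyed_by_user_id = {hash_shard(o["user_id"]) for o in orders}

print("keyed by order_id:", len(shards_if_keyed_by_order_id), "shards touched")
print("keyed by user_id:", len(shards_if_keyed_by_user_id), "shard touched")
```

Keying by order_id spreads rows very evenly but turns a per-user lookup into a multi-shard query. Keying by user_id keeps that lookup on one shard.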
In the wild
Sharding tweets by user_id keeps a user's data together but makes "global trending" a cross-shard problem.
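A rough sketch of that asymmetry, assuming in-memory lists stand in for shards and a toy set of tweets: reading one user's tweets touches a single shard, while trending has to scatter to every shard and merge the results. The function names and data are hypothetical, not how any particular service implements this.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 3  # hypothetical cluster size


def hash_shard(user_id, num_shards=NUM_SHARDS):
    """Stable hash of the user_id, modulo the shard count."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each shard is a plain list of (user_id, hashtag) rows.
shards = [[] for _ in range(NUM_SHARDS)]

tweets = [
    ("alice", "#go"), ("bob", "#rust"), ("carol", "#go"),
    ("alice", "#python"), ("dave", "#go"), ("bob", "#rust"),
]
for user_id, hashtag in tweets:
    shards[hash_shard(user_id)].append((user_id, hashtag))


def user_timeline(user_id):
    """Single-shard read: all of a user's rows live on one shard."""
    return [tag for uid, tag in shards[hash_shard(user_id)] if uid == user_id]


def global_trending(k):
    """Cross-shard read: ask every shard for its counts, then merge.

    Full per-shard counts are merged here; merging only each shard's
    top-k would be cheaper but can miss a tag that is spread thinly.
    """
    total = Counter()
    for shard in shards:
        total.update(Counter(tag for _, tag in shard))
    return total.most_common(k)


print(user_timeline("alice"))  # ['#go', '#python']
print(global_trending(1))      # [('#go', 3)]
```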
Interview deep dive
Flow
- List the top read and write queries before choosing the key.
- Pick a high-cardinality key that lets each common operation be served by a single shard.
- Add a routing layer so callers do not know shard locations.
- Plan the cross-shard path separately for global queries.
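The routing and cross-shard steps above can be sketched as a small router class. This is a minimal illustration under stated assumptions: `ShardRouter`, its method names, and the shard names are hypothetical, and in-memory dicts stand in for real databases.

```python
import hashlib


class ShardRouter:
    """Hypothetical routing layer: callers pass a key, never a shard address."""

    def __init__(self, shard_names):
        self._shard_names = list(shard_names)
        # In-memory dicts stand in for the per-shard databases.
        self._stores = {name: {} for name in self._shard_names}

    def _shard_name(self, user_id):
        digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shard_names)
        return self._shard_names[index]

    def put(self, user_id, value):
        """Common write path: exactly one shard is touched."""
        self._stores[self._shard_name(user_id)][user_id] = value

    def get(self, user_id):
        """Common read path: exactly one shard is touched."""
        return self._stores[self._shard_name(user_id)].get(user_id)

    def scan_all(self):
        """Separate cross-shard path for global queries: visits every shard."""
        for name in self._shard_names:
            yield from self._stores[name].items()


router = ShardRouter(["shard-a", "shard-b", "shard-c"])
router.put("user-1", {"name": "Ada"})
router.put("user-2", {"name": "Lin"})

print(router.get("user-1"))
print(sorted(user_id for user_id, _ in router.scan_all()))
```

Keeping `scan_all` as its own method makes the expensive path explicit, so a global query cannot be issued by accident through the single-shard API.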
Watch for
- Timestamp keys create a moving hot shard.
- Country or status keys look tidy but split load poorly.
- Even data distribution can still produce uneven traffic.
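The last point is easy to miss, so here is a small simulation of it. It assumes 1,000 users hashed across four shards and one hypothetical celebrity account ("user-7") that receives 5,000 reads while every other user receives one; all numbers are made up for illustration.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def hash_shard(key, num_shards=NUM_SHARDS):
    """Stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


user_ids = [f"user-{n}" for n in range(1000)]

# Data placement: one row per user, spread by hash.
rows_per_shard = Counter(hash_shard(u) for u in user_ids)

# Traffic: one read per user, plus 5,000 reads for a single hot account.
HOT_USER = "user-7"
requests = list(user_ids) + [HOT_USER] * 5000
requests_per_shard = Counter(hash_shard(u) for u in requests)

hot_shard = hash_shard(HOT_USER)
print("rows per shard:    ", [rows_per_shard[s] for s in range(NUM_SHARDS)])
print("requests per shard:", [requests_per_shard[s] for s in range(NUM_SHARDS)])
print("hot shard:", hot_shard)
```

Row counts should come out close to equal, yet the shard holding the hot account serves more than 80% of all requests. Balancing storage does not by itself balance load.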
Interviewer trap
Say which query becomes harder after sharding; that shows you understand what the key choice costs.