Choosing a shard key

A bad shard key concentrates load on one shard and ruins the whole point.

The problem

When most reads or writes map to the same shard, that shard becomes the bottleneck while the others sit mostly idle, so adding shards no longer adds capacity.

The idea

Pick a key that spreads data and load evenly and matches your query patterns.

How it works

Good keys have high cardinality and even access (e.g. user_id). Avoid keys that create hotspots (e.g. "country" when one country dominates, or a timestamp that funnels all new writes to one shard).
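To make the hotspot effect concrete, here is a minimal sketch that routes the same hypothetical workload by two different keys and compares how much traffic the busiest shard receives. The shard count, the md5-based `shard_for` helper, and the synthetic workload (80% of writes from one country) are assumptions for illustration, not part of any particular database.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed cluster size, for illustration only


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard using a stable hash (md5 is used only for spreading, not security)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def load_per_shard(keys, num_shards: int = NUM_SHARDS) -> Counter:
    """Count how many requests land on each shard for a stream of key values."""
    return Counter(shard_for(k, num_shards) for k in keys)


def hottest_share(load: Counter) -> float:
    """Fraction of all requests that hit the busiest shard."""
    return max(load.values()) / sum(load.values())


# Hypothetical workload: 10,000 writes from distinct users, 80% of them from one country.
user_ids = [f"user-{i}" for i in range(10_000)]
countries = ["US" if i % 10 < 8 else f"C{i % 7}" for i in range(10_000)]

by_user = load_per_shard(user_ids)      # high cardinality: spreads roughly evenly
by_country = load_per_shard(countries)  # low cardinality and skewed: one shard takes most writes

print("busiest shard share, keyed by user_id:", round(hottest_share(by_user), 2))
print("busiest shard share, keyed by country:", round(hottest_share(by_country), 2))
```

Every write tagged with the dominant country hashes to the same shard, so that shard receives at least 80% of the traffic no matter how many shards exist. A timestamp key under range partitioning fails in a similar way, because all new writes fall into the newest range.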

The tradeoff

Optimizing for even distribution can hurt locality (related data spread across shards → cross-shard queries).

In the wild

Sharding tweets by user_id keeps a user's data together but makes "global trending" a cross-shard problem.
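The tweet example can be sketched as follows. This is a toy in-memory model, assuming three shards, modulo routing, and hypothetical helper names (`post_tweet`, `tweets_by_user`, `global_trending`); it only illustrates which query stays on one shard and which one has to visit all of them.

```python
from collections import Counter

NUM_SHARDS = 3  # assumed, for illustration

# Each shard is modeled as a list of (user_id, hashtag) rows.
shards = [[] for _ in range(NUM_SHARDS)]


def shard_for(user_id: int) -> int:
    """Simple modulo routing on user_id (a real system would likely hash first)."""
    return user_id % NUM_SHARDS


def post_tweet(user_id: int, hashtag: str) -> None:
    shards[shard_for(user_id)].append((user_id, hashtag))


def tweets_by_user(user_id: int):
    """Single-shard query: all of a user's rows live on one shard."""
    return [row for row in shards[shard_for(user_id)] if row[0] == user_id]


def global_trending(top_n: int):
    """Cross-shard query: ask every shard for local counts, then merge them."""
    total = Counter()
    for shard in shards:                           # scatter
        local = Counter(tag for _, tag in shard)   # per-shard partial result
        total.update(local)                        # gather
    return total.most_common(top_n)


post_tweet(1, "#go")
post_tweet(2, "#go")
post_tweet(3, "#rust")
post_tweet(4, "#go")
post_tweet(1, "#rust")

print(tweets_by_user(1))
print(global_trending(1))
```

The cost of `global_trending` grows with the number of shards, and it is only as fast as the slowest shard, which is why such queries are often served from a separate aggregate rather than computed on demand.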

Interview deep dive

Flow

  1. List the top read and write queries before choosing the key.
  2. Pick a high-cardinality key so that each common operation touches a single shard.
  3. Add a routing layer so callers do not know shard locations.
  4. Plan the cross-shard path separately for global queries.
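Steps 3 and 4 of the flow above can be sketched as a small routing layer. The `ShardRouter` class, its method names, and the in-memory dictionaries standing in for databases are all hypothetical; the point is only that callers pass a key and never see a shard location, while the cross-shard path is a separate, explicitly named operation.

```python
import hashlib


class ShardRouter:
    """Hides shard placement from callers (illustrative sketch, not a production design)."""

    def __init__(self, shard_names):
        self._shard_names = list(shard_names)
        # In-memory dicts stand in for real database connections.
        self._stores = {name: {} for name in self._shard_names}

    def _locate(self, key: str) -> str:
        """Pick the shard for a key with a stable hash, so routing is deterministic."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shard_names)
        return self._shard_names[index]

    def put(self, key: str, value) -> None:
        self._stores[self._locate(key)][key] = value

    def get(self, key: str):
        """Common path: exactly one shard is consulted."""
        return self._stores[self._locate(key)].get(key)

    def scan_all(self, predicate):
        """Cross-shard path: visits every shard, so it is kept separate and used sparingly."""
        return [
            value
            for store in self._stores.values()
            for value in store.values()
            if predicate(value)
        ]


router = ShardRouter(["shard-a", "shard-b", "shard-c"])
router.put("user-42", {"name": "Ada", "plan": "pro"})
router.put("user-7", {"name": "Lin", "plan": "free"})

print(router.get("user-42"))
print(router.scan_all(lambda user: user["plan"] == "pro"))
```

Because callers depend only on `put`, `get`, and `scan_all`, the placement scheme can change behind the router. Note that plain modulo placement, as used here, remaps most keys when the shard count changes; schemes such as consistent hashing exist to reduce that movement.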

Watch for

Interviewer trap

Name the query that becomes harder after sharding; doing so shows you understand what the key choice costs.
