Academy · Partitioning & Sharding

Consistent hashing

With naive hashing (key % N), changing the number of nodes reshuffles almost all keys — a disaster.

Open the interactive version → diagrams, practice & more

The problem

With naive hashing (key % N), changing the number of nodes reshuffles almost all keys — a disaster.

The idea

Consistent hashing maps keys and nodes onto a ring so adding/removing a node moves only a small fraction of keys.

How it works

Hash nodes and keys onto a ring; a key is owned by the next node clockwise, so adding/removing a node moves only that node's arc — roughly K/N keys, not all of them. Each physical node maps to many virtual nodes to smooth imbalance and weight bigger machines. Replication falls out naturally: store each key on the next R distinct nodes around the ring. Rendezvous (HRW) hashing is a simpler alternative with similar properties.

The tradeoff

Slightly less perfectly balanced than key%N, but vastly more stable under membership changes — the whole point. Too few virtual nodes and load skews; bounded-load variants cap any node's share so a hot range can't overload its owner. Range partitioning instead keeps keys sortable (good for range scans) but needs explicit splits and risks sequential-write hotspots.

In the wild

Used by Dynamo, Cassandra, and most distributed caches/CDNs.

Interview deep dive

Flow

Hash nodes (as many virtual nodes) onto the ring.
Place each key on the next node clockwise.
On membership change, move only the affected arc (~K/N keys).
Replicate to the next R distinct nodes around the ring.

Watch for

Too few virtual nodes → uneven load distribution.
A single hot key still overloads its owner (use bounded loads).
Range partitioning beats it when you need ordered range scans.

Interviewer trap

Quantify the win: ~K/N keys move on a node change, not nearly all of them.

Related Academy

Part of Academy on SystemLore — system design interview prep with 148 deep topics, interactive diagrams, and a practice game. Practice this one →