Consensus: Raft & Paxos
The problem
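Many machines must agree on one value or log order while surviving crashes and partitions. This is provably hard: the FLP result shows that in a fully asynchronous system no deterministic protocol can guarantee it terminates if even one node may crash.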
The idea
Consensus algorithms use majority quorums to elect a leader and agree on a replicated log, staying safe through failures.
How it works
Raft (the understandable one): elect a leader; the leader replicates each entry to a majority before committing it; if the leader fails, a new election runs. Any two majorities overlap in at least one node, so two conflicting decisions can never both be committed, and at most one side of a partition can make progress.
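To make the partition claim concrete, here is a minimal sketch of which side of a split still holds a majority. The function names and the list-of-side-sizes input are illustrative assumptions, not part of any Raft library.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def sides_with_quorum(partition: list[int]) -> list[int]:
    """Return the indices of the partition sides that can still commit.

    `partition` (a hypothetical input shape) lists how many nodes ended
    up on each side of the network split.
    """
    n = sum(partition)
    return [i for i, size in enumerate(partition) if size >= quorum(n)]


# A 5-node cluster split 3/2: only the 3-node side (index 0) can commit.
print(sides_with_quorum([3, 2]))     # [0]
# A 5-node cluster split 2/2/1: no side has a majority, so nothing commits.
print(sides_with_quorum([2, 2, 1]))  # []
```

Because a strict majority is more than half the cluster, the returned list can never contain more than one side.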
The tradeoff
Strongly consistent and partition-safe, but writes need a round-trip to a majority (latency), and the minority side loses availability.
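The latency cost can be estimated with a rough sketch. It assumes the leader counts as one vote and that commit time is dominated by follower round-trips, ignoring disk fsync and queuing; `commit_latency_ms` is a hypothetical helper, not a real API.

```python
def commit_latency_ms(follower_rtts_ms: list[float]) -> float:
    """Time until a majority of the cluster has acknowledged an entry.

    The leader counts as one vote, so it needs acks from
    (quorum - 1) followers; the commit waits for the slowest of those.
    """
    cluster_size = len(follower_rtts_ms) + 1  # followers plus the leader
    quorum = cluster_size // 2 + 1
    acks_needed = quorum - 1
    if acks_needed == 0:
        return 0.0  # single-node cluster: nothing to wait for
    # The acks_needed-th fastest follower determines the commit time.
    return sorted(follower_rtts_ms)[acks_needed - 1]


# 5 nodes: the leader needs 2 follower acks, so the two slow
# followers (40 ms and 120 ms) do not delay the commit.
print(commit_latency_ms([2, 3, 40, 120]))  # 3
```

Under these assumptions, slow or distant replicas only hurt write latency once they are needed to reach the majority.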
In the wild
etcd, Consul, CockroachDB, TiDB, and Kafka's KRaft controller are all built on Raft.
Interview deep dive
Flow
- Elect a leader by majority vote; each election starts a new numbered term.
- Leader appends each entry to its log and replicates it.
- An entry commits once a majority has durably stored it.
- On leader loss a new election runs; overlapping majorities keep it safe.
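The commit step in the flow above can be sketched as follows. This is a simplified illustration: `match_index` is an assumed list holding the last durably stored log index for every node (leader included), and the sketch omits Raft's extra rule that a leader only commits by counting replicas for entries from its own current term.

```python
def commit_index(match_index: list[int]) -> int:
    """Highest log index that is stored on a majority of nodes.

    `match_index[i]` is the last log index node i is known to have
    durably stored (the leader includes its own last index).
    Simplification: the current-term check from Raft is not modeled.
    """
    quorum = len(match_index) // 2 + 1
    # After sorting in descending order, the value at position
    # quorum - 1 is stored on at least `quorum` nodes.
    return sorted(match_index, reverse=True)[quorum - 1]


# 5 nodes: index 5 is on three nodes (7, 7, 5), so entries up to 5 commit.
print(commit_index([7, 7, 5, 4, 2]))  # 5
# 3 nodes: the leader is at 7, but only index 3 is on a majority.
print(commit_index([7, 3, 3]))        # 3
```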
Watch for
- Writes need a majority round-trip — a latency floor you can't avoid.
- The minority side of a partition can't make progress (no availability).
- Use odd cluster sizes (3/5/7); an even size raises the quorum without tolerating more failures (4 nodes survive one failure, the same as 3).
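The cluster-size advice follows from simple arithmetic, shown here as a small sketch (the helper names are illustrative):

```python
def quorum(n: int) -> int:
    """Strict majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Nodes that can fail while a quorum is still reachable."""
    return n - quorum(n)


# Columns: cluster size, quorum, tolerated failures
for n in range(3, 8):
    print(n, quorum(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
# 6 4 2
# 7 4 3
```

Going from 3 to 4 nodes (or 5 to 6) raises the quorum by one but leaves the number of tolerated failures unchanged.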
Interviewer trap
Explain safety from quorum overlap: any two majorities share a node, so no split decisions.
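The overlap argument is a pigeonhole count: two majorities together contain more than n members, so they must share at least one node. A brute-force check for small clusters, written as an illustrative sketch with hypothetical helper names:

```python
from itertools import combinations


def majorities(n: int) -> list[set[int]]:
    """Every subset of n nodes that contains a strict majority."""
    q = n // 2 + 1
    return [
        set(group)
        for size in range(q, n + 1)
        for group in combinations(range(n), size)
    ]


def all_majorities_overlap(n: int) -> bool:
    """True if every pair of distinct majorities shares a node."""
    return all(a & b for a, b in combinations(majorities(n), 2))


# Exhaustively verify the claim for clusters of 1 to 7 nodes.
print(all(all_majorities_overlap(n) for n in range(1, 8)))  # True
```

The shared node has seen both decisions, which is what lets the protocol reject the conflicting one.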