2PC, 3PC & the blocking problem
Make N machines commit a transaction all-or-nothing, when any of them can crash mid-handshake.
Open the interactive version → diagrams, practice & moreThe problem
Make N machines commit a transaction all-or-nothing, when any of them can crash mid-handshake.
The idea
Two-Phase Commit: a coordinator asks everyone to "prepare", then tells all to commit only if all voted yes.
How it works
Phase 1 (prepare): participants durably log and vote. Phase 2 (commit/abort): coordinator decides. A YES vote is a binding promise. Its fatal flaw: if the coordinator dies after votes, participants block forever holding locks. 3PC adds a phase to avoid blocking but breaks under network partitions (split brain).
The tradeoff
2PC chooses safety over liveness (can block). 3PC chose liveness but risks inconsistency. Modern systems run 2PC over a consensus group instead.
In the wild
Google Spanner runs 2PC on top of Paxos groups so the coordinator can't be a single point of failure.
Interview deep dive
Flow
- Coordinator sends prepare; participants durably log and vote.
- All vote yes → coordinator logs commit and broadcasts it.
- Any no or timeout → global abort.
- Participants apply the decision and release locks.
Watch for
- A YES vote is a binding promise — the participant must be able to commit.
- Coordinator dies post-vote → participants block, holding locks.
- 3PC unblocks but breaks under network partitions (split brain).
Interviewer trap
Name the blocking flaw, then say modern systems run 2PC over a Paxos/Raft group.