Replication & read replicas
One database serving every read will eventually drown under read traffic.
Open the interactive version → diagrams, practice & moreThe problem
One database serving every read will eventually drown under read traffic.
The idea
Keep copies (replicas) of the database; serve reads from replicas, writes from the primary.
How it works
The primary streams its change log to replicas; writes go to the primary, reads spread across replicas to multiply read capacity. Replication can be async (fast, but a lagging replica loses recent writes on failover), sync (no loss, but a write waits for replicas — higher latency, lower availability if one stalls), or semi-sync (wait for one). Replicas also enable failover: promote a standby when the primary dies — but two nodes both believing they're primary is split-brain, so promotion needs a fencing/consensus step.
The tradeoff
Async is the common default but creates the read-after-write problem: a user may not see their own just-written data on a lagging replica. Fixes — read your own writes from the primary, sticky-route to one replica, or track a version for monotonic reads. And replicas only scale reads: the single primary still bounds write throughput, which is what eventually pushes you to sharding.
In the wild
Nearly every read-heavy app runs a primary + several read replicas.
Interview deep dive
Flow
- Send writes to the primary; fan reads across replicas.
- Pick async/semi-sync/sync by your data-loss tolerance.
- Route read-after-write traffic to the primary or a tracked replica.
- On primary failure, fence the old one and promote a standby.
Watch for
- Async lag → stale reads and read-after-write surprises.
- Failover without fencing causes split-brain (two primaries).
- Replicas scale reads only; writes still bottleneck on the primary.
Interviewer trap
Name the replication mode by data-loss tolerance and give your read-after-write fix.