Design ride dispatch (Uber)
Match riders to nearby drivers in real time as millions of locations update every few seconds.
Open the interactive version → diagrams, practice & moreThe problem
Match riders to nearby drivers in real time as millions of locations update every few seconds.
The idea
Geo-indexed driver locations + fast nearest-neighbor queries + async matching.
How it works
Drivers stream location every few seconds into an ingestion layer (WebSocket/queue) that updates a geospatial index — geohash/H3 cells (or a quadtree) keyed in a fast, possibly-lossy hot store, since you only need the latest position. A ride request queries the rider's cell plus its 8 neighbors (to catch drivers just across a boundary), ranks candidates, and matches via workers. The trip itself is a state machine (requested → matched → started → ended) behind a load-balanced API, with idempotent transitions.
The tradeoff
Location writes dwarf reads and don't need durability, so a lossy in-memory geo-store is the right call — persisting every ping would melt the DB for no benefit. Geo-sharding gives fast local queries but cross-cell edges (an airport, a dense downtown) become hot cells needing finer subdivision or dynamic rebalancing. Matching must be idempotent — a retried request must not double-assign a driver.
In the wild
Uber, Lyft, DoorDash dispatch systems.
Interview deep dive
Flow
- Drivers stream location into a geo-indexed hot store (geohash/H3).
- A ride request queries the rider's cell + 8 neighbors.
- Workers rank and match; the trip becomes a state machine.
- State transitions stay idempotent against retries.
Watch for
- Cell-boundary drivers are missed without querying neighbor cells.
- Location writes are huge and lossy — don't persist every ping.
- Hot cells (airports) need finer subdivision or rebalancing.
Interviewer trap
Name the geo-index AND the neighbor-cell query, plus the lossy hot store for location writes.