System Design Library

Google Maps

Render maps and compute fast routes over a planet-scale road graph.

Open the interactive version → diagrams, practice & more

Requirements

Functional

  • Map tiles
  • Routing (shortest/fastest path)
  • ETA with traffic
  • Search places

Non-functional

  • Fast routing
  • Fresh traffic

Scale

Planet-scale road graph

The approach

Pre-rendered map tiles served via CDN; the road network as a graph; routing uses precomputed contraction hierarchies (not raw Dijkstra) for millisecond queries; live traffic adjusts edge weights.

Key components

Tile CDN · road graph · routing engine (contraction hierarchies) · traffic feed

Numbers that matter

Senior deep-dive

Map tiles and routing are completely separate systems — tiles are static pre-rendered images served from CDN; routing is a live graph query that must finish in milliseconds using precomputed contraction hierarchies, not Dijkstra.

Contraction hierarchies (CH) are the routing breakthrough — by precomputing shortcut edges for high-importance nodes (highways, interchanges), CH reduces a cross-country query from billions of relaxations to thousands, enabling <100ms routing on a planet-scale graph.

Live traffic integration invalidates precomputed shortcuts — the clever trick is to keep the CH structure fixed and overlay traffic weights at query time as a cost multiplier, rather than recomputing shortcuts continuously.

Tile rendering: separate from routing entirely

Map tiles are pre-rendered offline by a rendering pipeline that reads geospatial data (roads, POIs, land cover) and produces PNG or vector tiles per zoom level per geographic tile. Tiles are immutable and versioned — a road edit produces a new tile version, not an in-place update. CDN edge caches serve 99%+ of tile traffic; the origin only handles cache misses and new tile versions. Vector tiles (used in the modern client) ship raw geometry; the client renders them locally, enabling smooth zooming and rotation.

Contraction hierarchies: the routing trick no one explains

CH works by iteratively contracting less-important nodes (local roads) and adding shortcut edges between their neighbors so the graph can be searched bidirectionally from source and destination using only high-importance nodes in the middle. The importance ordering (highway > arterial > local) is precomputed offline. At query time, the search explores upward in importance from source, downward from destination, and meets in the middle — touching only thousands of nodes on a graph with billions of edges.

Traffic integration: overlaying speeds on fixed shortcuts

Recomputing CH shortcuts every time traffic changes is infeasible. Instead, edge weights (travel times) are updated from live probe data, but the shortcut graph topology stays fixed. This approximation is slightly suboptimal when traffic changes which route is fastest, but the error is small and the latency saving is enormous. Customizable CH (CCH) is a research variant that allows metric changes without re-ordering; some production systems use a form of this.

ETA: a separate ML problem on top of routing

The route distance is the output of graph search; the ETA is a prediction, not a calculation. A model takes the route path, current traffic, historical speeds by time-of-day/day-of-week, and turn penalty models (left turns across traffic cost more than right) to predict arrival time. Historical accuracy feedback (did the user arrive when we predicted?) closes the loop. ETA for congested urban areas is substantially harder than highway routing because turn costs dominate.

Geo-indexing for search and POI lookup

POI search ("coffee near me") uses a geospatial index (S2 cells — Google's own hierarchical sphere decomposition) to bound the search to a geographic region, then applies text and attribute filters within that region. S2 cells cover the sphere with a family of nested quadrilateral cells, avoiding the distortion issues of geohash at poles. The challenge is ranking: proximity is only one signal; quality, popularity, and relevance to the query all feed a ranking model.

What breaks at scale

Mass rerouting events — a major accident triggers millions of clients to simultaneously request new routes, creating a spike that can overwhelm the routing fleet; geographic sharding of the routing graph helps but doesn't eliminate cross-shard queries for long routes. Tile invalidation cascades from a large-area data update (city boundary correction, new highway) can simultaneously expire millions of CDN-cached tiles, causing a cold-cache thundering herd on the tile origin. GPS probe data quality degrades in dense urban canyons, leading to traffic speed misestimates that produce confidently wrong ETAs.

In production

Google Maps uses Protocol Buffers for compact tile encoding and a proprietary vector tile format that lets the client render labels at any rotation without server round-trips. The routing backend runs on custom distributed graph processing infrastructure — not a general-purpose graph DB. The hardest operational problem is map freshness: roads change, businesses open/close, and the pipeline from OSM/partner data to live tiles involves validation, conflict resolution, and human review; a street closure that took a week to propagate was a real user trust crisis. ETA prediction is a separate ML model layered on top of routing, incorporating historical speeds, real-time traffic, and turn penalty data.

Common mistakes

Related System Design Library

Part of System Design Library on SystemLore — system design interview prep with 148 deep topics, interactive diagrams, and a practice game. Practice this one →