Google Maps
Render maps and compute fast routes over a planet-scale road graph.
Open the interactive version → diagrams, practice & moreRequirements
Functional
- Map tiles
- Routing (shortest/fastest path)
- ETA with traffic
- Search places
Non-functional
- Fast routing
- Fresh traffic
Scale
Planet-scale road graph
The approach
Pre-rendered map tiles served via CDN; the road network as a graph; routing uses precomputed contraction hierarchies (not raw Dijkstra) for millisecond queries; live traffic adjusts edge weights.
Key components
Tile CDN · road graph · routing engine (contraction hierarchies) · traffic feed
Numbers that matter
- Google Maps pre-renders map tiles at 21 zoom levels; a single city at all zoom levels produces millions of tiles, all cached at CDN edges worldwide.
- The road network graph for routing contains roughly 100 million nodes and 1 billion edges globally after preprocessing; contraction hierarchies reduce the effective search space by orders of magnitude.
- Contraction hierarchy routing can answer a cross-continent route in <100ms on preprocessed data; raw Dijkstra on the same graph would take minutes.
- Live traffic speeds are updated at roughly 1-5 minute intervals from aggregated probe data (GPS pings from Android devices); data from hundreds of millions of devices feeds the traffic model.
Senior deep-dive
Map tiles and routing are completely separate systems — tiles are static pre-rendered images served from CDN; routing is a live graph query that must finish in milliseconds using precomputed contraction hierarchies, not Dijkstra.
Contraction hierarchies (CH) are the routing breakthrough — by precomputing shortcut edges for high-importance nodes (highways, interchanges), CH reduces a cross-country query from billions of relaxations to thousands, enabling <100ms routing on a planet-scale graph.
Live traffic integration invalidates precomputed shortcuts — the clever trick is to keep the CH structure fixed and overlay traffic weights at query time as a cost multiplier, rather than recomputing shortcuts continuously.
Tile rendering: separate from routing entirely
Map tiles are pre-rendered offline by a rendering pipeline that reads geospatial data (roads, POIs, land cover) and produces PNG or vector tiles per zoom level per geographic tile. Tiles are immutable and versioned — a road edit produces a new tile version, not an in-place update. CDN edge caches serve 99%+ of tile traffic; the origin only handles cache misses and new tile versions. Vector tiles (used in the modern client) ship raw geometry; the client renders them locally, enabling smooth zooming and rotation.
Contraction hierarchies: the routing trick no one explains
CH works by iteratively contracting less-important nodes (local roads) and adding shortcut edges between their neighbors so the graph can be searched bidirectionally from source and destination using only high-importance nodes in the middle. The importance ordering (highway > arterial > local) is precomputed offline. At query time, the search explores upward in importance from source, downward from destination, and meets in the middle — touching only thousands of nodes on a graph with billions of edges.
Traffic integration: overlaying speeds on fixed shortcuts
Recomputing CH shortcuts every time traffic changes is infeasible. Instead, edge weights (travel times) are updated from live probe data, but the shortcut graph topology stays fixed. This approximation is slightly suboptimal when traffic changes which route is fastest, but the error is small and the latency saving is enormous. Customizable CH (CCH) is a research variant that allows metric changes without re-ordering; some production systems use a form of this.
ETA: a separate ML problem on top of routing
The route distance is the output of graph search; the ETA is a prediction, not a calculation. A model takes the route path, current traffic, historical speeds by time-of-day/day-of-week, and turn penalty models (left turns across traffic cost more than right) to predict arrival time. Historical accuracy feedback (did the user arrive when we predicted?) closes the loop. ETA for congested urban areas is substantially harder than highway routing because turn costs dominate.
Geo-indexing for search and POI lookup
POI search ("coffee near me") uses a geospatial index (S2 cells — Google's own hierarchical sphere decomposition) to bound the search to a geographic region, then applies text and attribute filters within that region. S2 cells cover the sphere with a family of nested quadrilateral cells, avoiding the distortion issues of geohash at poles. The challenge is ranking: proximity is only one signal; quality, popularity, and relevance to the query all feed a ranking model.
What breaks at scale
Mass rerouting events — a major accident triggers millions of clients to simultaneously request new routes, creating a spike that can overwhelm the routing fleet; geographic sharding of the routing graph helps but doesn't eliminate cross-shard queries for long routes. Tile invalidation cascades from a large-area data update (city boundary correction, new highway) can simultaneously expire millions of CDN-cached tiles, causing a cold-cache thundering herd on the tile origin. GPS probe data quality degrades in dense urban canyons, leading to traffic speed misestimates that produce confidently wrong ETAs.
In production
Google Maps uses Protocol Buffers for compact tile encoding and a proprietary vector tile format that lets the client render labels at any rotation without server round-trips. The routing backend runs on custom distributed graph processing infrastructure — not a general-purpose graph DB. The hardest operational problem is map freshness: roads change, businesses open/close, and the pipeline from OSM/partner data to live tiles involves validation, conflict resolution, and human review; a street closure that took a week to propagate was a real user trust crisis. ETA prediction is a separate ML model layered on top of routing, incorporating historical speeds, real-time traffic, and turn penalty data.
Common mistakes
- Plain Dijkstra at query time
- Rendering tiles on the fly
- Ignoring dynamic traffic weights