Tinder / Matching
Show nearby profiles and create matches on mutual likes.
Requirements
Functional
- Geo profile discovery
- Swipe like/pass
- Mutual-match detection
- Chat on match
Non-functional
- Fast recommendations
- Realtime matches
Scale
Billions of swipes/day
The approach
Geo-index profiles (like Uber); a recommendation service ranks candidates; swipes stored; a mutual like triggers a match (check the other side's like) → opens chat.
Key components
Geo index · recsys → swipe store → match detector → chat
Numbers that matter
- Tinder processes ~3 billion swipes per day globally across its user base.
- A geohash at precision 6 covers roughly 1.2 km × 0.6 km — a useful cell size for "nearby" candidate retrieval.
- The swipe decision store (like/pass per pair) must sustain ~35K writes/sec on average at that volume (3 billion / 86,400 s), with peak traffic running higher.
- A typical candidate pool for one user is ~500–2,000 profiles after geo + eligibility filtering, then ranked down to a deck of ~100.
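The write-rate figure follows directly from the daily swipe volume. A minimal sketch of that arithmetic is below; the `peak_factor` multiplier is a hypothetical knob for illustration, not a number from these notes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_writes_per_sec(swipes_per_day: float) -> float:
    """Average swipe writes per second for a given daily volume."""
    return swipes_per_day / SECONDS_PER_DAY


def peak_writes_per_sec(swipes_per_day: float, peak_factor: float) -> float:
    """Peak estimate; peak_factor is an assumed peak-to-average ratio."""
    return average_writes_per_sec(swipes_per_day) * peak_factor


if __name__ == "__main__":
    # 3 billion swipes/day is the figure quoted above.
    print(round(average_writes_per_sec(3e9)))
```

Capacity planning would normally size the swipe store for the peak estimate rather than the average.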
Senior deep-dive
The matching algorithm is a state check, not a search — when Alice likes Bob, you check if Bob already liked Alice; a match is two intersecting sets, not a query.
Geo-indexing dominates the read path: candidate generation is fundamentally "who is nearby and eligible," solved with a geohash or S2-cell spatial index refreshed as users move.
Swipe volume is enormous (billions/day) but most writes are fire-and-forget — the consistency requirement is only on the mutual-like detection, everything else can be eventually consistent.
Geo-index: the candidate generation layer
Users' last-known locations are stored in a Redis sorted set keyed by geohash or S2-cell score, updated on every app open (or background refresh). Candidate generation is a radius query — pull all user IDs within N km — then filter by age, gender preferences, and prior swipes. The geo index must tolerate users going offline for hours; a TTL on location entries prevents ghost profiles from appearing in decks.
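The cell-plus-TTL idea can be sketched in memory as follows. This is an illustration under assumptions, not the production design: `GeoIndex`, its method names, and the one-hour TTL are hypothetical, a plain dict stands in for Redis, and only the user's own cell is searched (a real radius query would also scan neighbouring cells and filter by exact distance).

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Standard geohash: interleave longitude/latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | bit
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


class GeoIndex:
    """In-memory stand-in for the geo index: cell -> {user_id: last_seen}."""

    def __init__(self, ttl_seconds: float, precision: int = 6):
        self.ttl = ttl_seconds
        self.precision = precision
        self.cells = {}      # geohash cell -> {user_id: last_seen timestamp}
        self.user_cell = {}  # user_id -> current cell

    def update(self, user_id: str, lat: float, lon: float, now: float) -> None:
        cell = geohash_encode(lat, lon, self.precision)
        old = self.user_cell.get(user_id)
        if old is not None and old != cell:
            self.cells[old].pop(user_id, None)  # user moved to another cell
        self.cells.setdefault(cell, {})[user_id] = now
        self.user_cell[user_id] = cell

    def nearby(self, lat: float, lon: float, now: float) -> list:
        """Users in the same cell whose location is still within the TTL."""
        members = self.cells.get(geohash_encode(lat, lon, self.precision), {})
        return sorted(u for u, seen in members.items() if now - seen <= self.ttl)


if __name__ == "__main__":
    idx = GeoIndex(ttl_seconds=3600)
    idx.update("alice", 57.64911, 10.40744, now=0)
    idx.update("bob", 57.6492, 10.4075, now=0)
    print(idx.nearby(57.64911, 10.40744, now=60))
```

Filtering by age, preferences, and prior swipes would run on the returned IDs; the TTL check is what keeps long-offline users out of the result.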
Swipe storage: wide-column, write-heavy
Every swipe (like, pass, superlike) is written to Cassandra with partition key swiper_id, clustering column swipee_id, and the decision as the value. Writes are cheap appends, and "did I already swipe this person?" is a single-partition point read. The alternative, one relational table with a composite primary key, becomes hard to operate at billions of rows under heavy concurrent writes unless it is sharded. A per-user Bloom filter pre-screens that lookup before the Cassandra read: a negative answer is definitive (never swiped), so the round trip is skipped, while a positive answer may be a false positive and still needs the read.
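A rough in-memory sketch of the row layout and the Bloom-filter pre-screen follows. Everything here is an assumption for illustration: `SwipeStore` and `BloomFilter` are hypothetical names, a nested dict stands in for the Cassandra table, and the filter size and hash count are arbitrary.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        self.size = size_bits  # must be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class SwipeStore:
    """Stand-in for the wide-row table: swiper_id -> {swipee_id: decision}."""

    def __init__(self):
        self.rows = {}
        self.filters = {}     # swiper_id -> BloomFilter of swipee_ids
        self.store_reads = 0  # counts lookups that reached the backing store

    def record(self, swiper: str, swipee: str, decision: str) -> None:
        self.rows.setdefault(swiper, {})[swipee] = decision
        self.filters.setdefault(swiper, BloomFilter()).add(swipee)

    def get(self, swiper: str, swipee: str):
        bloom = self.filters.get(swiper)
        if bloom is None or not bloom.might_contain(swipee):
            return None  # definitely never swiped: skip the store read
        self.store_reads += 1  # possible hit: confirm against the store
        return self.rows.get(swiper, {}).get(swipee)


if __name__ == "__main__":
    store = SwipeStore()
    store.record("alice", "bob", "like")
    print(store.get("alice", "bob"), store.get("alice", "carol"))
```

The `store_reads` counter exists only to make the saved round trips visible in this sketch.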
Match detection: the mutual-like check
When Alice likes Bob, the system reads Bob's swipe record for Alice. If Bob already liked Alice, a match is created and both get a push notification. This is the one place where consistency matters: if Alice and Bob like each other at nearly the same moment, the two check-then-write sequences can interleave, so both requests may see the other's like and create duplicate matches, or neither may see it and the match is missed. A compare-and-swap or a short-lived distributed lock on the canonical pair key (min(A,B), max(A,B)) closes the race without serializing all match creation. Matches land in a separate matches table and open a chat channel.
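The pair-scoped lock can be sketched with threads in a single process. This is a simplified illustration: `MatchDetector` and `pair_key` are hypothetical names, `threading.Lock` stands in for a distributed lock or compare-and-swap, and in-memory sets stand in for the swipe and matches tables.

```python
import threading


def pair_key(a: str, b: str) -> tuple:
    """Canonical key for an unordered pair: (min, max)."""
    return (a, b) if a <= b else (b, a)


class MatchDetector:
    def __init__(self):
        self.likes = set()    # (liker, liked)
        self.matches = set()  # canonical pair keys
        self._locks = {}      # pair key -> lock (never evicted in this sketch)
        self._locks_guard = threading.Lock()

    def _lock_for(self, key: tuple) -> threading.Lock:
        # Guard the lock table itself so two threads get the same lock object.
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def like(self, liker: str, liked: str) -> bool:
        """Record a like; return True only for the call that creates the match."""
        key = pair_key(liker, liked)
        with self._lock_for(key):  # serializes only this pair, not all matches
            self.likes.add((liker, liked))
            if (liked, liker) in self.likes and key not in self.matches:
                self.matches.add(key)
                return True
        return False


if __name__ == "__main__":
    detector = MatchDetector()
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(detector.like("alice", "bob"))),
        threading.Thread(target=lambda: results.append(detector.like("bob", "alice"))),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results.count(True), detector.matches)
```

Because the write and the reverse-like check happen under the same pair lock, whichever request runs second always sees the first one's like, so exactly one of them creates the match.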
Deck ranking: the Elo-style desirability system
Candidates aren't shown in random geo order. Each profile has an Elo-like score, updated by who likes or passes it and by the liker's own score (a like from a high-score user boosts you more). The deck served to a user sorts candidates by a combined ranking of score × predicted mutual-match probability × recency. This creates a rich-get-richer dynamic: high-score users see other high-score users first, driving faster matches but narrowing their effective pool.
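One way to picture the score update and the combined ranking is the sketch below, using the textbook Elo formula. Several things are assumptions for illustration: treating a like as a "win" for the viewed profile against the swiper, the K-factor of 32, the exponential recency decay with a 24-hour half-life, and all names (`update_on_swipe`, `Candidate`, `rank_deck`).

```python
from dataclasses import dataclass


def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard Elo expectation: chance that `rating` beats `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_on_swipe(profile_rating: float, swiper_rating: float,
                    liked: bool, k: float = 32.0) -> float:
    """New rating for the viewed profile; a like counts as a win (assumption)."""
    outcome = 1.0 if liked else 0.0
    return profile_rating + k * (outcome - expected_score(profile_rating, swiper_rating))


@dataclass
class Candidate:
    user_id: str
    rating: float
    p_mutual: float            # predicted mutual-match probability, 0..1
    hours_since_active: float


def rank_deck(candidates, deck_size: int = 100, half_life_hours: float = 24.0):
    """Sort by rating x mutual-match probability x recency, highest first."""
    def combined(c: Candidate) -> float:
        recency = 0.5 ** (c.hours_since_active / half_life_hours)
        return c.rating * c.p_mutual * recency
    return sorted(candidates, key=combined, reverse=True)[:deck_size]


if __name__ == "__main__":
    # A like from a higher-rated swiper moves the score more than one from a peer.
    print(update_on_swipe(1500, 1500, liked=True))
    print(update_on_swipe(1500, 1900, liked=True))
    deck = rank_deck([
        Candidate("a", 1600, 0.2, 0.0),
        Candidate("b", 1400, 0.5, 24.0),
        Candidate("c", 1800, 0.1, 0.0),
    ])
    print([c.user_id for c in deck])
```

Multiplying the three signals means a very high score cannot compensate for a near-zero match probability, which is one plausible reading of "combined ranking"; a weighted sum would behave differently.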
Push notifications: match + message delivery
Match events go to the notification service (APNs/FCM) immediately — this is the dopamine moment and latency matters. In-app chat messages go over WebSocket when the user is active; APNs/FCM as fallback when backgrounded. Per-user WS sessions are pinned to a gateway node; a message to an offline user writes to Cassandra and delivers via push. Chat history is stored in a Cassandra time-series (partition = match_id, clustering = message_timestamp).
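The delivery decision can be sketched as a small router. This is illustrative only: `ChatRouter` and its fields are hypothetical, lists stand in for the WebSocket gateway and the APNs/FCM push service, a dict keyed by match_id stands in for the Cassandra time-series, and persisting every message before delivery is an assumption (the text above only states that offline messages are written).

```python
from dataclasses import dataclass, field


@dataclass
class ChatRouter:
    online: dict = field(default_factory=dict)       # user_id -> gateway node
    history: dict = field(default_factory=dict)      # match_id -> [(ts, sender, text)]
    ws_outbox: list = field(default_factory=list)    # stand-in for WebSocket sends
    push_outbox: list = field(default_factory=list)  # stand-in for APNs/FCM sends

    def send(self, match_id: str, sender: str, recipient: str,
             text: str, timestamp: float) -> str:
        # Persist first so history is complete regardless of delivery path.
        messages = self.history.setdefault(match_id, [])
        messages.append((timestamp, sender, text))
        messages.sort(key=lambda m: m[0])  # mimic clustering by timestamp

        gateway = self.online.get(recipient)
        if gateway is not None:
            # Recipient has a live session pinned to this gateway node.
            self.ws_outbox.append((gateway, recipient, text))
            return "websocket"
        # Recipient is offline or backgrounded: fall back to push.
        self.push_outbox.append((recipient, text))
        return "push"


if __name__ == "__main__":
    router = ChatRouter()
    router.online["bob"] = "gw-1"
    print(router.send("m1", "alice", "bob", "hi", timestamp=10.0))
    del router.online["bob"]
    print(router.send("m1", "alice", "bob", "still there?", timestamp=20.0))
```

Keeping the history keyed by match_id mirrors the partition choice above: one conversation is read with a single partition scan in timestamp order.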
What breaks at scale
Location staleness is the primary UX failure mode: a user who moved cities yesterday still appears in the old city's geo index until they open the app (or their entry expires), flooding decks there with irrelevant profiles. Swipe-farming bots flood the like ledger with programmatic swipes, poisoning Elo scores and wasting real users' limited daily likes on fake profiles. The deepest scaling trap is celebrity profiles: a single very popular account can accumulate millions of incoming likes that are still waiting to be answered. Any structure keyed by the liked user then becomes a hot, very wide partition, and resolving those pending likes turns what should be a simple Cassandra read into a scan.
In production
Tinder uses Redis (sorted sets by geoscore) for live location indexing and Cassandra for the swipe ledger (wide rows keyed by user, columns = liked user IDs). The real challenge is the Superlike and algorithmic deck problem: the deck shown to a user isn't random. It is ranked by an Elo-like desirability score plus ML signals, and that ranking must update as profiles change, so the candidate pipeline is a mini recommendation system, not just a geo lookup. At scale, cold-starting a new user (no Elo score, no photo engagement history) without giving them a degraded experience has no elegant solution.
Common mistakes
- Scanning all profiles for nearby
- Missing the reverse-like check
- No dedup of already-seen profiles