Twitter / News Feed
Deliver each user a timeline of recent posts from people they follow — instantly, at 300M users.
Requirements
Functional
- Post
- Follow
- Home timeline
- Likes/retweets
- Media
Non-functional
- Timeline <200ms
- Eventual consistency OK
- Read-heavy
Scale
300M users, celebrity fan-out
The approach
Precompute timelines via fan-out-on-write into per-user cached lists for fast reads; for mega-accounts use fan-out-on-read (hybrid) to avoid 100M writes per tweet.
Key components
LB → app → timeline cache · post → queue → fan-out workers · media → object store + CDN · search
Numbers that matter
- Reads dominate, but write amplification is extreme: a tweet from a 100M-follower account would mean 100M timeline writes, which is exactly why mega-accounts switch to fan-out-on-read.
- Timeline budget <200ms — so home timelines are precomputed lists of tweet IDs in cache, never built by joining the follow graph at read time.
- Store IDs, not tweets, in timelines — a timeline is ~800 tweet IDs; hydrate bodies from a separate cache so one edit or delete doesn't rewrite millions of copies.
- Eventual consistency is fine — a tweet showing up a second late is acceptable, which is what makes async fan-out workers viable.
Senior deep-dive
The celebrity problem is the whole interview — there is no single fan-out strategy that works for everyone.
Fan-out-on-write (push each tweet into followers' cached timelines) makes reads instant but explodes on write for huge followings; fan-out-on-read (pull at request time) is cheap to write but slow to read.
The senior answer is a hybrid keyed on follower count — push for normal users, pull-and-merge for mega-accounts.
Fan-out-on-write: fast reads, brutal writes
On every post, push the tweet ID into each follower's cached timeline so reads are an O(1) cache fetch. Great for the median user — but a celebrity post triggers millions of writes, saturating the fan-out workers and delaying everyone. This is the default that breaks.
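A minimal in-memory sketch of the push model may make the cost visible. The names `followers`, `timelines`, `follow`, `post`, and `home_timeline` are hypothetical, and plain Python containers stand in for the cache; a real system would do the loop in async fan-out workers.

```python
from collections import defaultdict, deque

TIMELINE_CAP = 800  # per-user cap on stored tweet IDs (figure from the text)

# Hypothetical in-memory stand-ins for the follow graph and timeline cache.
followers = defaultdict(set)  # author_id -> set of follower ids
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_CAP))  # user_id -> newest-first IDs


def follow(follower_id, author_id):
    followers[author_id].add(follower_id)


def post(author_id, tweet_id):
    """Push the tweet ID to every follower; return the number of writes."""
    writes = 0
    for follower_id in followers[author_id]:
        # appendleft keeps newest first; maxlen silently drops the oldest ID.
        timelines[follower_id].appendleft(tweet_id)
        writes += 1
    return writes


def home_timeline(user_id, limit=20):
    """Read path: a single lookup plus a slice, no graph traversal."""
    return list(timelines[user_id])[:limit]


follow("alice", "bob")
follow("carol", "bob")
print(post("bob", 1))          # 2 writes, one per follower
post("bob", 2)
print(home_timeline("alice"))  # [2, 1]
```

The write count returned by `post` grows linearly with the follower set, which is the amplification the paragraph above describes.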
Fan-out-on-read: cheap writes, slow reads
Store nothing on write; at read time pull recent tweets from everyone you follow and merge. Trivial to write, but a read now fans out across hundreds of accounts — too slow for the common case. Useful precisely for the accounts that make fan-out-on-write explode.
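The pull model can be sketched as a k-way merge over per-author lists. This is an illustration under assumptions: `tweets_by_author` and `following` are hypothetical in-memory stores, and posts are assumed to arrive in increasing timestamp order so each author's list stays newest first.

```python
import heapq
from itertools import islice

# Hypothetical stores: nothing is written to any follower's timeline on post.
tweets_by_author = {}  # author_id -> [(timestamp, tweet_id)], newest first
following = {}         # user_id -> list of author ids


def post(author_id, timestamp, tweet_id):
    """One write, regardless of how many followers the author has."""
    tweets_by_author.setdefault(author_id, []).insert(0, (timestamp, tweet_id))


def home_timeline(user_id, limit=20):
    """Read path touches every followed author, then merges by recency."""
    feeds = [tweets_by_author.get(a, []) for a in following.get(user_id, [])]
    merged = heapq.merge(*feeds, reverse=True)  # inputs are already newest first
    return [tweet_id for _, tweet_id in islice(merged, limit)]


following["u"] = ["a", "b"]
post("a", 1, 101)
post("b", 2, 201)
post("a", 3, 102)
print(home_timeline("u"))  # [102, 201, 101]
```

The read cost scales with the number of accounts followed, which is why this path is reserved for the few authors where pushing is worse.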
The hybrid is the senior answer
Push for normal accounts; pull-and-merge for mega-accounts above a follower threshold. A user's timeline = their precomputed list merged with a fresh pull from the few celebrities they follow. It bounds both write amplification and read latency — name the threshold and the merge and you've answered the question.
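One way to sketch the hybrid is to branch on follower count at post time and merge at read time. The threshold value, the store names, and the `threshold` parameter are all assumptions made for illustration; timestamps are assumed to increase across posts.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 1_000_000  # hypothetical cutoff, not a figure from the text

followers = {}         # author_id -> set of follower ids
following = {}         # user_id -> set of author ids
pushed = {}            # user_id -> [(ts, tweet_id)], newest first (fan-out-on-write)
celebrity_tweets = {}  # author_id -> [(ts, tweet_id)], newest first (pulled at read)


def follow(user_id, author_id):
    followers.setdefault(author_id, set()).add(user_id)
    following.setdefault(user_id, set()).add(author_id)


def post(author_id, ts, tweet_id, threshold=CELEBRITY_THRESHOLD):
    """Return the number of follower-timeline writes this post caused."""
    fans = followers.get(author_id, set())
    if len(fans) >= threshold:
        # Mega-account: store once on the author's own list, no fan-out.
        celebrity_tweets.setdefault(author_id, []).insert(0, (ts, tweet_id))
        return 0
    for fan in fans:
        pushed.setdefault(fan, []).insert(0, (ts, tweet_id))
    return len(fans)


def home_timeline(user_id, limit=20):
    """Merge the precomputed list with a fresh pull from followed celebrities."""
    pulls = [celebrity_tweets[a] for a in following.get(user_id, ())
             if a in celebrity_tweets]
    merged = heapq.merge(pushed.get(user_id, []), *pulls, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]


follow("u", "norm")
follow("u", "star")
follow("v", "star")
print(post("norm", 1, 11, threshold=2))  # 1 (pushed to u)
print(post("star", 2, 22, threshold=2))  # 0 (pulled at read time instead)
print(home_timeline("u"))                # [22, 11]
```

Each tweet lives in exactly one of the two stores, so the merge cannot produce duplicates; the demo lowers the threshold to 2 only so the branch is visible with three users.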
Timelines store IDs; bodies live in cache
A timeline is a capped list of tweet IDs (~800), not full tweets. Hydrate bodies from a separate tweet cache so edits, deletes, and counts update in one place — and each user's timeline stays tiny and cheap to fan out.
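The ID-plus-hydration split might look like the sketch below. `tweet_cache`, the two timeline lists, and `hydrate` are hypothetical names, and a dict stands in for the tweet cache.

```python
# Hypothetical tweet cache: the only place a tweet body is stored.
tweet_cache = {
    101: {"author": "bob", "text": "first"},
    102: {"author": "bob", "text": "second"},
}

# Timelines hold IDs only, newest first.
alice_timeline = [102, 101]
carol_timeline = [102, 101]


def hydrate(tweet_ids):
    """Resolve IDs to bodies at read time, skipping tweets that were deleted."""
    return [tweet_cache[t] for t in tweet_ids if t in tweet_cache]


# An edit and a delete each touch one cache entry, not every timeline.
tweet_cache[102]["text"] = "second (edited)"
del tweet_cache[101]

print(hydrate(alice_timeline))  # [{'author': 'bob', 'text': 'second (edited)'}]
```

Both timelines still contain the stale ID 101; it is simply filtered out on read, so no per-follower cleanup is needed at delete time.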
Sharding and the graph
Tweets shard by ID (or author); the follow graph is its own wide-column store, denormalized both directions so "who follows X" is fast for fan-out. Media never lives in the tweet store — it is object storage + CDN, referenced by URL.
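As a rough sketch of the two ideas above: modulo routing stands in for whatever shard function a real deployment would use, and `FollowGraph` is a hypothetical in-memory stand-in for the wide-column store. The point is only that each follow is written in both directions.

```python
NUM_SHARDS = 16  # hypothetical shard count


def shard_for(tweet_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Route a tweet to a shard by ID (simple modulo, an assumption)."""
    return tweet_id % num_shards


class FollowGraph:
    """Follow edges denormalized in both directions."""

    def __init__(self):
        self._following = {}  # user_id -> set of authors the user follows
        self._followers = {}  # author_id -> set of users following the author

    def follow(self, user_id, author_id):
        # Two writes per edge so both lookups are a single key read.
        self._following.setdefault(user_id, set()).add(author_id)
        self._followers.setdefault(author_id, set()).add(user_id)

    def unfollow(self, user_id, author_id):
        self._following.get(user_id, set()).discard(author_id)
        self._followers.get(author_id, set()).discard(user_id)

    def followers_of(self, author_id):
        """Used by fan-out workers: who follows X."""
        return set(self._followers.get(author_id, ()))

    def following_of(self, user_id):
        """Used by the read path: whom does this user follow."""
        return set(self._following.get(user_id, ()))


graph = FollowGraph()
graph.follow("u", "a")
graph.follow("v", "a")
print(sorted(graph.followers_of("a")))  # ['u', 'v']
print(shard_for(35))                    # 3
```

The cost of the denormalization is that follow and unfollow each become two writes that must be kept in step.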
What breaks at scale
The pain points are celebrity write amplification, timeline cache memory (300M users × 800 IDs), and hot-key reads on viral tweets. The matching fixes: use hybrid fan-out for mega-accounts, cap timeline length, and cache hot tweets at the edge. Ranking (relevance vs. reverse-chronological) adds a scoring layer but doesn't change the fan-out spine.
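A back-of-envelope calculation puts rough numbers on two of these pressures. The 8-byte ID size and the fan-out throughput are assumptions chosen for illustration; the user count, timeline length, and follower count come from the text.

```python
USERS = 300_000_000
IDS_PER_TIMELINE = 800
BYTES_PER_ID = 8  # assumption: 64-bit tweet IDs, ignoring per-entry overhead

raw_bytes = USERS * IDS_PER_TIMELINE * BYTES_PER_ID
print(f"timeline cache, raw IDs only: {raw_bytes / 1e12:.2f} TB")  # 1.92 TB

CELEBRITY_FOLLOWERS = 100_000_000
FANOUT_WRITES_PER_SEC = 1_000_000  # hypothetical aggregate worker throughput

seconds = CELEBRITY_FOLLOWERS / FANOUT_WRITES_PER_SEC
print(f"one celebrity tweet, pure push: {seconds:.0f} s of fan-out")  # 100 s
```

Even with generous assumptions, the raw ID storage lands in the terabyte range before replication, and a single pushed celebrity tweet would occupy the whole fan-out fleet for over a minute.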
In production
Twitter's publicly described timeline architecture follows this hybrid: most users get fan-out-on-write into a Redis-backed timeline, while the highest-follower accounts are merged in at read time. Other large feeds such as Instagram and TikTok are commonly described as using a similar push/pull split. At its core this is a caching-and-fan-out problem; the social graph and ranking sit on top.
Common mistakes
- Pure fan-out-on-write for everyone
- Joining followers at read time with no cache
- Storing media blobs in the tweet store