System Design Library

Quora / Q&A

Questions, ranked answers, topics, and a personalized feed.


Requirements

Functional

Ask/answer
Upvote ranking
Topics/follow
Personalized feed
Search

Non-functional

Read-heavy
Relevant ranking

Scale

Billions of views

The approach

Questions/answers in a sharded store; answer ranking by quality signals (votes, author, dwell); topic graph for following; feed combines follows + recommended; full-text search.

Key components

Q&A store · ranking · topic graph · feed · search index

Numbers that matter

Quora serves ~300 million monthly visitors with read-to-write ratios estimated at 100:1 or higher — most users only read.
A popular question can accumulate thousands of answers — ranking them requires scoring each one, so O(n) scoring per page view without caching is prohibitive.
Topic follow graphs can reach millions of followers per popular topic — fan-out for a new top answer to "Machine Learning" is enormous.
Answer quality ML models typically use 20–50 features (vote velocity, author credentials, text quality signals) and must score in <10ms per answer.
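A rough back-of-envelope sketch of why uncached scoring is prohibitive, combining the "thousands of answers" and "<10ms per answer" figures above. The 5,000-answer count and the helper name are assumptions chosen for illustration, not measured values.

```python
# Back-of-envelope: cost of scoring every answer on every page view.
# ANSWERS_PER_QUESTION is an assumed value for a popular question.
ANSWERS_PER_QUESTION = 5_000
SCORE_LATENCY_MS = 10  # upper bound of the per-answer scoring budget


def serial_scoring_seconds(num_answers: int, latency_ms: float) -> float:
    """Time to score every answer one after another, in seconds."""
    return num_answers * latency_ms / 1000


uncached = serial_scoring_seconds(ANSWERS_PER_QUESTION, SCORE_LATENCY_MS)
print(f"uncached, serial scoring: {uncached:.0f} s per page view")
```

Even with heavy parallelism this is far outside an interactive latency budget, which is why scores are precomputed and cached rather than computed per view.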

Senior deep-dive

Answer quality ranking is the product — anyone can write an answer; the hard problem is surfacing the best one (by expertise, votes, credibility) and ordering them correctly per user.

The topic graph is the core data structure: questions are tagged with topics, users follow topics, and the feed is essentially a traversal of the topic-interest graph, filtered by follow state and answer quality.

Full-text search and the feed are read-dominated and can be aggressively cached. The write surface (new answers, votes) is small relative to reads, so the scaling effort goes into read paths and cache invalidation rather than write throughput.

Topic graph: the interest routing layer

Every question is tagged with one or more topics (e.g., "Machine Learning," "Python"). Users follow topics, not just other users. The feed is computed as: union of questions in followed topics + followed users' answers, ranked by quality × recency × personal relevance. The topic graph is stored as an adjacency list (user_id → [topic_id]) and read at feed-generation time. Topic hierarchy ("Machine Learning" is a subtopic of "Artificial Intelligence") lets the system recommend parent-topic content when subtopic content is sparse.
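A minimal sketch of the candidate-gathering step described above, assuming small in-memory dictionaries in place of the real stores. The names `follows`, `parent_of`, `questions_by_topic`, and the `min_per_topic` threshold are hypothetical; ranking by quality, recency, and relevance would happen after this step.

```python
# Hypothetical in-memory stand-ins for the topic graph and question store.
follows = {"u1": ["ml"]}            # adjacency list: user_id -> [topic_id]
parent_of = {"ml": "ai"}            # topic hierarchy: subtopic -> parent topic
questions_by_topic = {
    "ml": ["q_ml_1"],
    "ai": ["q_ai_1", "q_ai_2"],
}


def candidate_questions(user_id, min_per_topic=2):
    """Collect questions from followed topics, falling back to the parent
    topic when a followed topic has fewer than min_per_topic questions."""
    seen, out = set(), []
    for topic in follows.get(user_id, []):
        found = list(questions_by_topic.get(topic, []))
        if len(found) < min_per_topic and topic in parent_of:
            found += questions_by_topic.get(parent_of[topic], [])
        for question_id in found:
            if question_id not in seen:   # a question can carry several topics
                seen.add(question_id)
                out.append(question_id)
    return out


print(candidate_questions("u1"))
```

Here "ml" is sparse, so the parent topic "ai" contributes its questions after the subtopic's own.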

Answer ranking: multi-signal quality scoring

The displayed order of answers is not chronological — it's by a quality score computed from upvotes (with vote velocity decay), author expertise in the topic (derived from their follower count specifically within that topic), text length/quality signals, and engagement rate (upvote / view ratio). This score is precomputed and cached per answer and re-evaluated when significant new votes arrive (event-driven re-score, not continuous). The top answer monopolizes attention — getting rank-1 right matters more than getting rank-5 right.
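One way to picture the cached score with event-driven re-scoring is the sketch below. The weights, the one-year half-life, the age-based decay, and the 10-vote re-score threshold are all assumptions made for illustration; the source does not specify the actual formula.

```python
import math
from dataclasses import dataclass

HALF_LIFE_DAYS = 365.0     # assumed decay half-life
RESCORE_VOTE_DELTA = 10    # assumed: re-score after this many new votes


@dataclass
class Answer:
    upvotes: int
    views: int
    author_topic_followers: int   # proxy for topic expertise
    text_quality: float           # assumed to be normalized to 0..1
    age_days: float
    cached_score: float = 0.0
    votes_at_last_score: int = 0


def quality_score(a: Answer) -> float:
    """Blend the signals named above; the weights are illustrative only."""
    decay = 0.5 ** (a.age_days / HALF_LIFE_DAYS)
    votes = math.log1p(a.upvotes) * decay
    expertise = math.log1p(a.author_topic_followers)
    engagement = a.upvotes / a.views if a.views else 0.0
    return 0.5 * votes + 0.2 * expertise + 0.2 * a.text_quality + 0.1 * engagement


def on_vote(a: Answer) -> bool:
    """Record an upvote; recompute the cached score only when enough new
    votes have arrived since the last scoring. Returns True if re-scored."""
    a.upvotes += 1
    if a.upvotes - a.votes_at_last_score >= RESCORE_VOTE_DELTA:
        a.cached_score = quality_score(a)
        a.votes_at_last_score = a.upvotes
        return True
    return False
```

Page views read `cached_score` directly, so the scoring cost is paid per batch of votes rather than per read.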

Feed generation: fan-out on read vs. write

Quora's feed is generated with fan-out on read (pull model): at feed load, the system fetches the user's followed topics, queries for recent top-quality answers in each, merges and re-ranks, returns a page. This avoids the write amplification of pre-building a timeline for 300M users. The tradeoff: feed latency is higher (several backend calls vs. one timeline read) and more compute-intensive per request. Caching the merged feed for a few minutes per user softens this significantly — most users don't notice 2-minute staleness.
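The pull model with a short per-user cache can be sketched as follows. This is a simplified illustration, not Quora's implementation: `top_answers_by_topic` stands in for the per-topic backend queries, the 120-second TTL mirrors the "2-minute staleness" mentioned above, and the cache is a plain dictionary.

```python
import heapq
import time

FEED_TTL_SECONDS = 120
_feed_cache = {}   # user_id -> (expires_at, feed)


def build_feed(user_id, followed_topics, top_answers_by_topic, page_size=10, now=None):
    """Fan-out on read: merge top answers across followed topics at request
    time, reusing a recent merged feed for the same user when one exists."""
    now = time.time() if now is None else now
    cached = _feed_cache.get(user_id)
    if cached and cached[0] > now:
        return cached[1]

    best = {}   # answer_id -> best score seen across topics
    for topic in followed_topics:
        for score, answer_id in top_answers_by_topic.get(topic, []):
            if score > best.get(answer_id, float("-inf")):
                best[answer_id] = score

    ranked = heapq.nlargest(page_size, best.items(), key=lambda kv: kv[1])
    feed = [answer_id for answer_id, _ in ranked]
    _feed_cache[user_id] = (now + FEED_TTL_SECONDS, feed)
    return feed
```

Nothing is written per follower when an answer is posted; the cost moves to read time and is bounded by the cache TTL.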

Expert routing: seeding cold questions

A new question with no answers is invisible to the feed and search ranker. Quora's Ask-to-Answer (A2A) feature routes the question directly to topic experts via notification — the system selects candidates with high expertise scores in the question's topics who have historically answered similar questions. This is a notification targeting problem: too broad and experts are spammed, too narrow and the question dies. The algorithm uses a response rate signal (did this expert answer last time they were A2A'd?) to avoid re-notifying chronic non-responders.
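A hedged sketch of the targeting logic: rank candidates by expertise weighted by how often they answered past requests, and drop chronic non-responders. The smoothing prior, the `min_rate` cutoff, and the `max_notify` cap are assumptions for illustration, not values from the source.

```python
from dataclasses import dataclass


@dataclass
class Expert:
    user_id: str
    expertise: float     # expertise score in the question's topics
    a2a_sent: int        # past Ask-to-Answer requests sent to this user
    a2a_answered: int    # how many of those requests they answered


def response_rate(e: Expert, prior_sent: int = 2, prior_answered: int = 1) -> float:
    """Smoothed response rate so experts with no history start at 0.5."""
    return (e.a2a_answered + prior_answered) / (e.a2a_sent + prior_sent)


def pick_experts(candidates, max_notify=3, min_rate=0.2):
    """Cap the number of notifications and skip chronic non-responders."""
    eligible = [e for e in candidates if response_rate(e) >= min_rate]
    eligible.sort(key=lambda e: e.expertise * response_rate(e), reverse=True)
    return [e.user_id for e in eligible[:max_notify]]
```

`max_notify` guards against the "too broad" failure and `min_rate` against wasting slots on experts who never respond; both would need tuning against real response data.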

Search: Elasticsearch with answer-quality boosting

Question search is a standard inverted-index full-text search over question titles and body text, but boosted by answer quality scores — a question with a top-voted expert answer ranks higher than one with no answers for the same keyword. This requires denormalizing the best-answer score into the question document in Elasticsearch. Deduplication is a significant challenge: Quora has thousands of questions asking essentially the same thing ("What is blockchain?") — they use a question merge feature backed by an ML similarity model to canonicalize duplicates and redirect traffic to the canonical question.
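The boosting described above maps naturally onto Elasticsearch's `function_score` query with a `field_value_factor` function. The sketch below only builds the request body as a Python dictionary; the field names `title`, `body`, and `best_answer_score` are hypothetical and depend on how the question document is actually mapped.

```python
def build_question_query(text, score_field="best_answer_score"):
    """Full-text match on the question, boosted by the denormalized
    best-answer score stored on the question document."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {
                        "query": text,
                        "fields": ["title^2", "body"],  # weight title matches higher
                    }
                },
                "field_value_factor": {
                    "field": score_field,
                    "modifier": "log1p",  # dampen very large scores
                    "missing": 0,         # unanswered questions get no boost
                },
                # "sum" keeps unanswered questions searchable: a boost of 0
                # combined with "multiply" would zero out their text score.
                "boost_mode": "sum",
            }
        }
    }


print(build_question_query("what is blockchain"))
```

The choice of `sum` over `multiply` is a design assumption here; either way, the best-answer score must be rewritten on the question document whenever the top answer changes.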

What breaks at scale

Hot questions (a viral news event drives millions of simultaneous readers to one question) create a read hotspot — mitigated by CDN caching the question page (but this requires invalidation when a new top answer is posted). Vote manipulation (coordinated upvote brigades) corrupts the quality ranker — Quora uses vote velocity anomaly detection and account age/credibility weighting to discount suspicious vote bursts. The subtlest long-term failure is answer rot: high-ranked answers from 2015 about Python 2 now show above 2023 Python 3 answers because they accumulated votes over years — recency × quality weighting must be calibrated carefully or the best historical answer permanently outranks the best current answer.
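The vote-manipulation defenses above can be illustrated with two small helpers: a z-score check on hourly vote counts and a per-vote weight based on account age and credibility. The thresholds, the 30-day ramp, and the function names are assumptions for the sketch; a production detector would use richer signals.

```python
from statistics import mean, pstdev


def is_vote_burst(hourly_history, latest_hour, z_threshold=3.0):
    """Flag the latest hour if its vote count is far above the answer's
    own recent baseline. The sigma floor avoids division by near-zero."""
    mu = mean(hourly_history)
    sigma = max(pstdev(hourly_history), 1.0)
    return (latest_hour - mu) / sigma > z_threshold


def weighted_votes(votes, full_weight_age_days=30):
    """Sum votes, discounting new or low-credibility accounts.
    votes: iterable of (account_age_days, credibility in 0..1)."""
    total = 0.0
    for account_age_days, credibility in votes:
        age_weight = min(account_age_days / full_weight_age_days, 1.0)
        total += age_weight * credibility
    return total
```

A flagged burst would typically feed the discounted total into the ranker instead of the raw count, so a brigade of fresh accounts moves the score far less than the same number of established voters.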

In production

Quora uses MySQL (sharded) as the primary data store, Elasticsearch for search, and a mix of in-house ML models for ranking. Their A2A (Ask-to-Answer) ranking model is trained on user engagement signals (upvotes, shares, reads-to-completion) and author credential signals (topic expertise score based on follower count in that topic). The real challenge is cold-start on new questions: a question with zero answers and zero votes has no signal for the feed or search ranker, so Quora routes new questions to known experts in the topic via notifications. Seeding the first answer is a product-critical editorial problem disguised as an engineering one.

Common mistakes

Ranking by raw votes only
Per-request feed assembly with no caching of the merged feed
No search index over content

Part of the System Design Library on SystemLore.