Design a chat app
Real-time 1:1 and group messaging with ordering and offline delivery, scaling to millions of live connections.
The problem
Deliver 1:1 and group messages in real time, keep them ordered within each conversation, hold them for recipients who are offline, and do all of this across millions of live connections.
The idea
Persistent connections for realtime, a queue for reliable ordered delivery, a store for history.
How it works
Clients hold WebSocket connections to a horizontally scaled WebSocket (WS) tier behind a Layer 4 load balancer. Because connections are stateful, a routing registry in Redis maps each user to the WS node holding their socket, so a message for user B reaches the node that owns B's connection; if B has no live connection, the message is persisted for offline push. Ordering is kept with a per-conversation sequence number, not a global clock, because only messages within the same conversation need a consistent order. History lands in a wide-column store partitioned by conversation_id; presence and routing state live in a fast cache.
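The routing decision can be sketched in a few lines. This is a minimal illustration, not a production design: plain dictionaries stand in for the Redis registry and the offline store, and the names `Router`, `connect`, `disconnect`, and `route` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy router. Dicts stand in for the Redis registry and the offline store."""

    # user_id -> id of the WS node currently holding that user's socket
    registry: dict = field(default_factory=dict)
    # user_id -> messages waiting for offline push
    offline: dict = field(default_factory=dict)

    def connect(self, user_id: str, node_id: str) -> None:
        # Recorded when a WS node accepts the user's connection.
        self.registry[user_id] = node_id

    def disconnect(self, user_id: str) -> None:
        # Removed when the socket closes; a missing user is not an error.
        self.registry.pop(user_id, None)

    def route(self, recipient: str, message: str) -> str:
        node = self.registry.get(recipient)
        if node is None:
            # No live connection: persist so offline push can pick it up.
            self.offline.setdefault(recipient, []).append(message)
            return "stored-offline"
        # A real system would forward the message to that node here.
        return f"deliver-via:{node}"


router = Router()
router.connect("bob", "ws-node-7")
print(router.route("bob", "hi"))            # deliver-via:ws-node-7
router.disconnect("bob")
print(router.route("bob", "still there?"))  # stored-offline
```

In a real deployment the registry entry would likely carry an expiry or heartbeat so that a crashed node does not leave stale mappings behind; that detail is omitted here.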
The tradeoff
WebSockets are stateful, so the hard problems are connection routing (which node holds which user) and graceful reconnection (resuming from the last seen sequence number). Group chat is a fan-out: one message to a 1000-member group becomes roughly 1000 deliveries. That is fine inline for small groups, but huge groups need dedicated fan-out workers. Presence (online/typing) is deceptively expensive: a naive broadcast costs O(contacts) per status change, so throttle and batch it.
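One way to picture the inline-versus-workers split is a planner that decides how a message is fanned out. The sketch below is illustrative only: the cutoff `INLINE_LIMIT`, the chunk size `BATCH_SIZE`, and the function `plan_fanout` are assumptions, and the numbers would need tuning against real measurements.

```python
from typing import Iterator

INLINE_LIMIT = 100  # assumed cutoff: at or below this, deliver inline
BATCH_SIZE = 250    # assumed number of recipients per worker task


def batches(members: list, size: int) -> Iterator[list]:
    """Yield consecutive chunks of at most `size` members."""
    for start in range(0, len(members), size):
        yield members[start:start + size]


def plan_fanout(sender: str, members: list) -> dict:
    """Decide how to deliver one group message to every member but the sender."""
    recipients = [m for m in members if m != sender]
    if len(recipients) <= INLINE_LIMIT:
        # Small group: the request path can deliver directly.
        return {"mode": "inline", "tasks": [recipients]}
    # Large group: split into tasks that fan-out workers consume from a queue.
    return {"mode": "workers", "tasks": list(batches(recipients, BATCH_SIZE))}


group = [f"user-{i}" for i in range(1000)]
plan = plan_fanout("user-0", group)
print(plan["mode"], [len(t) for t in plan["tasks"]])
# workers [250, 250, 250, 249]
```

Chunking keeps each worker task bounded, so one very large group cannot monopolize a worker or stall deliveries for everyone else.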
In the wild
WhatsApp, Slack, and Discord all broadly follow this shape.
Interview deep dive
Flow
- Client opens a WebSocket to a node; registry records user → node.
- Sender's message gets a per-conversation sequence number.
- Router delivers to the recipient's node, or stores the message for offline push.
- Client reconnects and resumes from its last seen sequence.
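The sequencing and resume steps in this flow can be sketched as follows. It is a minimal in-memory model under stated assumptions: `ConversationLog`, `append`, and `since` are hypothetical names, and a real system would assign sequence numbers atomically in the history store rather than in process memory.

```python
from collections import defaultdict


class ConversationLog:
    """Per-conversation sequence numbers plus replay for reconnecting clients."""

    def __init__(self) -> None:
        self._next_seq = defaultdict(int)   # conversation_id -> last issued seq
        self._messages = defaultdict(list)  # conversation_id -> [(seq, body)]

    def append(self, conversation_id: str, body: str) -> int:
        # Each conversation counts independently; no global clock is involved.
        self._next_seq[conversation_id] += 1
        seq = self._next_seq[conversation_id]
        self._messages[conversation_id].append((seq, body))
        return seq

    def since(self, conversation_id: str, last_seen: int) -> list:
        # What a client asks for on reconnect: everything after its last seq.
        history = self._messages.get(conversation_id, [])
        return [(seq, body) for seq, body in history if seq > last_seen]


log = ConversationLog()
log.append("c1", "hello")
log.append("c1", "are you there?")
log.append("c2", "unrelated chat")
log.append("c1", "ping")

# The client saw seq 1 in c1 before its connection dropped.
print(log.since("c1", last_seen=1))
# [(2, 'are you there?'), (3, 'ping')]
```

Because the client only needs to remember one integer per conversation, resume stays cheap, and a gap in received sequence numbers tells the client it has missed something.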
Watch for
- Stateful connections need a registry + reconnect/resume story.
- Big-group fan-out (1 msg → N deliveries) needs workers, not inline delivery.
- Presence broadcast scales with contacts — throttle and batch it.
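Throttling and batching presence might look like the sketch below, which coalesces updates and releases at most one batch per interval. The class `PresenceBatcher`, its methods, and the 5-second default are assumptions chosen for illustration; the caller passes in the current time so the behavior is easy to follow.

```python
class PresenceBatcher:
    """Coalesce presence changes and emit at most one batch per interval."""

    def __init__(self, flush_interval: float = 5.0) -> None:
        self.flush_interval = flush_interval  # assumed value, in seconds
        self._pending = {}                    # user_id -> latest status
        self._last_flush = 0.0

    def update(self, user_id: str, status: str) -> None:
        # Only the latest status per user matters; earlier ones are overwritten.
        self._pending[user_id] = status

    def flush(self, now: float) -> dict:
        """Return the batch to broadcast, or {} if throttled or nothing changed."""
        too_soon = now - self._last_flush < self.flush_interval
        if too_soon or not self._pending:
            return {}
        batch, self._pending = self._pending, {}
        self._last_flush = now
        return batch


batcher = PresenceBatcher(flush_interval=5.0)
batcher.update("alice", "online")
batcher.update("alice", "typing")  # replaces "online" before anything is sent
batcher.update("bob", "online")

print(batcher.flush(now=1.0))  # {}
print(batcher.flush(now=5.0))  # {'alice': 'typing', 'bob': 'online'}
```

Three updates collapse into one broadcast carrying two entries, which is the point: the cost now scales with the number of flushes rather than with every individual status change.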
Interviewer trap
Lead with connection routing and per-conversation ordering, not just "use WebSockets".