Design a chat app

Real-time 1:1 and group messaging with ordering and offline delivery, scaling to millions of live connections.

The problem

Deliver 1:1 and group messages in real time, keep them in order within each conversation, and hold them for recipients who are offline, while the service keeps millions of connections open at once.

The idea

Persistent connections for realtime, a queue for reliable ordered delivery, a store for history.

How it works

Clients hold WebSocket connections to a horizontally scaled WebSocket (WS) tier behind a layer-4 (L4) load balancer. Because connections are stateful, a routing registry in Redis maps each user to the WS node that holds their socket, so a message for user B reaches the node owning B's connection; if B has no live connection, the message is persisted for offline push. Ordering is kept with a per-conversation sequence number, not a global clock, because clocks on different servers drift and only the order within a conversation matters. History lands in a wide-column store partitioned by conversation_id; presence and routing live in a fast cache.
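A minimal sketch of the routing and sequencing described above, assuming in-memory dicts in place of Redis and the message store. The names ChatRouter, connect, disconnect, send, node_inbox and offline are hypothetical and chosen for illustration; a real deployment would push to the owning node over the network rather than append to a list.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ChatRouter:
    # user_id -> node_id; stands in for the Redis routing registry.
    registry: dict = field(default_factory=dict)
    # conversation_id -> last sequence number issued in that conversation.
    last_seq: dict = field(default_factory=lambda: defaultdict(int))
    # node_id -> messages handed to that node (stands in for a network hop).
    node_inbox: dict = field(default_factory=lambda: defaultdict(list))
    # user_id -> messages persisted for offline push.
    offline: dict = field(default_factory=lambda: defaultdict(list))

    def connect(self, user_id, node_id):
        """Record which WS node now holds this user's socket."""
        self.registry[user_id] = node_id

    def disconnect(self, user_id):
        """Forget the mapping when the socket closes."""
        self.registry.pop(user_id, None)

    def send(self, conversation_id, sender, recipient, body):
        # Ordering comes from a per-conversation counter, not a wall clock.
        self.last_seq[conversation_id] += 1
        msg = {
            "conversation_id": conversation_id,
            "seq": self.last_seq[conversation_id],
            "from": sender,
            "to": recipient,
            "body": body,
        }
        node = self.registry.get(recipient)
        if node is None:
            # No live connection: keep the message for offline push.
            self.offline[recipient].append(msg)
        else:
            # Route to the node that owns the recipient's connection.
            self.node_inbox[node].append(msg)
        return msg
```

Each conversation has its own counter, so two conversations never contend for one global sequence, and a recipient only needs to compare numbers within a single conversation.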

The tradeoff

WebSockets are stateful, so the hard problems are connection routing (which node holds which user's socket) and graceful reconnection (resume from the last seen sequence). Group chat is a fan-out: a message to a 1000-member group is 1000 deliveries. That is fine inline for small groups, but huge groups need fan-out workers. Presence (online/typing) is deceptively expensive: a naive broadcast costs O(contacts) per status change, so throttle and batch it.
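The split between inline fan-out and fan-out workers could look roughly like the sketch below. The threshold of 100 members, the batch size of 500, and the names fan_out, run_worker, deliver and work_queue are all assumptions for illustration; a deque stands in for a real message queue.

```python
from collections import deque

INLINE_FANOUT_LIMIT = 100  # assumed cutoff; tune from real measurements
BATCH_SIZE = 500           # assumed number of recipients per worker job


def fan_out(message, member_ids, deliver, work_queue,
            inline_limit=INLINE_FANOUT_LIMIT, batch_size=BATCH_SIZE):
    """Deliver inline for small groups, enqueue batches for large ones."""
    # The sender does not need a copy of their own message.
    recipients = [m for m in member_ids if m != message["from"]]
    if len(recipients) <= inline_limit:
        for user_id in recipients:
            deliver(user_id, message)
        return "inline"
    # Large group: cut the recipient list into jobs for background workers.
    for start in range(0, len(recipients), batch_size):
        work_queue.append((message, recipients[start:start + batch_size]))
    return "queued"


def run_worker(work_queue, deliver):
    """Drain queued jobs; in production several workers would share this."""
    while work_queue:
        message, batch = work_queue.popleft()
        for user_id in batch:
            deliver(user_id, message)
```

Batching keeps the sender's request fast regardless of group size: the request path only enqueues a handful of jobs, and the per-member deliveries happen off that path.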

In the wild

WhatsApp, Slack, and Discord all follow this broad shape.

Interview deep dive

Flow

  1. Client opens a WebSocket to a node; registry records user → node.
  2. Sender's message gets a per-conversation sequence number.
  3. Router delivers to the recipient's node, or stores for offline push.
  4. Client reconnects and resumes from its last seen sequence.
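Step 4 can be sketched as a lookup against the conversation's stored history. This is a minimal illustration, assuming an in-memory list in place of the wide-column store; ConversationLog, append and since are hypothetical names.

```python
from bisect import bisect_right


class ConversationLog:
    """Append-only message log for one conversation, ordered by sequence."""

    def __init__(self):
        self._seqs = []      # sequence numbers, strictly increasing
        self._messages = []  # messages, parallel to self._seqs

    def append(self, body):
        """Assign the next sequence number and store the message."""
        seq = self._seqs[-1] + 1 if self._seqs else 1
        self._seqs.append(seq)
        self._messages.append({"seq": seq, "body": body})
        return seq

    def since(self, last_seen_seq):
        """Return every message with seq greater than last_seen_seq."""
        # Binary search for the first entry past the client's cursor.
        start = bisect_right(self._seqs, last_seen_seq)
        return self._messages[start:]
```

On reconnect the client sends the highest sequence number it has seen for each conversation, and the server replies with since(last_seen_seq). A client that has seen nothing sends 0 and receives the full history held in the log.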

Watch for

Interviewer trap

Lead with connection routing and per-conversation ordering, not just "use WebSockets".
