Stock Price Feed
Push live price ticks to millions of subscribed clients with minimal latency.
Requirements
Functional
- Ingest market data
- Subscribe to symbols
- Push updates
- Throttle/conflate
Non-functional
- Very low latency
- Massive fan-out
Scale
Millions of subscribers
The approach
Ingest the firehose; pub/sub per symbol; WS gateways fan out to subscribers; conflation (send only the latest price per interval) prevents overwhelming clients on hot symbols.
Key components
Market feed → per-symbol pub/sub → WS gateways → clients
Numbers that matter
- The NYSE/Nasdaq consolidated tape generates ~1 million quotes and trades per second at peak; a full-depth order book feed (ITCH protocol) runs at ~5 Gbps of raw data.
- A WebSocket gateway node can hold ~50,000–100,000 concurrent connections comfortably on a modern Linux box (tuned file descriptors, SO_REUSEPORT); beyond that, add nodes.
- Market data vendors conflate to 250ms–1s intervals for retail consumers; institutional direct feeds operate at sub-100μs latency using kernel-bypass networking (DPDK, RDMA).
- Redis Pub/Sub can sustain ~1 million messages/second on a single node for small payloads; at stock-feed scale with thousands of symbols, Kafka partitioned by symbol is more operationally robust.
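Taken at face value, the figures above support a quick back-of-envelope sizing. The sketch below assumes a round 1 million subscribers and the 250ms conflation interval; the function names are illustrative, and the results ignore headroom for failover and reconnect spikes.

```python
import math

def gateway_nodes(subscribers: int, conns_per_node: int) -> int:
    """Minimum number of gateway nodes needed to hold every connection (no headroom)."""
    return math.ceil(subscribers / conns_per_node)

def max_outbound_rate(subscribers_on_symbol: int, interval_s: float) -> float:
    """Upper bound on messages/second sent for one symbol once conflation is applied."""
    return subscribers_on_symbol / interval_s

# Assumed workload: 1,000,000 subscribers, 50k-100k connections per node.
nodes_best_case = gateway_nodes(1_000_000, 100_000)   # 10 nodes
nodes_worst_case = gateway_nodes(1_000_000, 50_000)   # 20 nodes

# If every subscriber watched the same hot symbol at a 250ms interval,
# the fleet would emit at most 4,000,000 messages/second for that symbol.
hot_symbol_rate = max_outbound_rate(1_000_000, 0.25)
```

In practice the node count would be padded so that losing a node, or a reconnect storm, does not push the remaining nodes past their connection limit.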
Senior deep-dive
Conflation is the core design decision — you cannot push every tick to every subscriber; you push the latest price per symbol per delivery interval.
Fan-out at the symbol level, not the subscriber level: one pub/sub topic per ticker means a subscriber to AAPL only receives AAPL ticks, and a hot symbol's volume doesn't drown cold ones.
WebSocket connection count is the ops constraint: a million subscribers on one feed require connection-aware load balancing (sticky routing per symbol shard) and WebSocket gateway horizontal scaling.
Ingest: normalizing heterogeneous exchange feeds
Raw exchange feeds (ITCH, OPRA, FIX) use exchange-specific wire formats: most are binary (ITCH, for example), while classic FIX is tag=value text. A normalizer layer decodes them, emits canonical trade/quote events, and publishes to a Kafka topic partitioned by symbol. This layer must be single-writer per symbol to preserve tick ordering: multiple normalizers publishing the same exchange feed race each other, so consumers can see quote sequence numbers interleaved or out of order.
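A minimal sketch of the normalizer's output side, under stated assumptions: the decoded input is a plain dict with hypothetical keys (`sym`, `type`, `px`, `qty`), prices arrive as fixed-point integers with four implied decimals, and the normalizer stamps its own per-symbol sequence number. None of this is a real exchange schema; it only illustrates the canonical event and the single-writer rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tick:
    """Canonical event emitted downstream, independent of the source protocol."""
    symbol: str
    kind: str      # "trade" or "quote"
    price: float
    size: int
    seq: int       # per-symbol sequence number assigned by this normalizer
    exchange: str

class Normalizer:
    """Intended to be the only writer for the symbols it owns, so seq is gap-free."""

    def __init__(self, exchange: str) -> None:
        self.exchange = exchange
        self._next_seq: dict[str, int] = {}

    def normalize(self, raw: dict) -> Tick:
        symbol = raw["sym"].upper()
        seq = self._next_seq.get(symbol, 0) + 1
        self._next_seq[symbol] = seq
        return Tick(
            symbol=symbol,
            kind=raw["type"],
            price=raw["px"] / 10_000,  # assumption: 4 implied decimal places
            size=raw["qty"],
            seq=seq,
            exchange=self.exchange,
        )

def partition_key(tick: Tick) -> bytes:
    """Key to publish with, so all ticks for one symbol land on one partition."""
    return tick.symbol.encode("utf-8")

normalizer = Normalizer("EXCH_A")
first = normalizer.normalize({"sym": "aapl", "type": "trade", "px": 1895000, "qty": 100})
second = normalizer.normalize({"sym": "AAPL", "type": "quote", "px": 1895100, "qty": 300})
```

If two `Normalizer` instances handled the same symbol, each would keep its own counter and the sequence numbers would collide, which is the race the single-writer rule avoids.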
Pub/sub topology: one topic per symbol or grouped
A topic per symbol is clean but creates thousands of Kafka topics and partitions, which strains broker metadata. The production pattern is symbol-hash partitioning: a fixed number of partitions (e.g., 64) where partition = hash(symbol) % 64. Subscribers filter by symbol within the partition. Hot symbols (AAPL, SPY) dominate their partition, so monitor lag and throughput per symbol, not just per partition.
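The partition = hash(symbol) % 64 rule can be sketched as follows. The hash must be stable across processes, so this example uses CRC32 from the standard library as a stand-in; Python's built-in `hash()` is randomized per process for strings, and a real Kafka producer would apply its own partitioner to the message key. The helper names are illustrative.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 64  # fixed partition count from the text

def partition_for(symbol: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a symbol to a partition with a hash that is stable across processes."""
    return zlib.crc32(symbol.encode("utf-8")) % num_partitions

def ticks_per_partition(ticks_by_symbol: dict[str, int]) -> dict[int, int]:
    """Aggregate per-symbol tick counts into per-partition totals to spot hot partitions."""
    totals: dict[int, int] = defaultdict(int)
    for symbol, count in ticks_by_symbol.items():
        totals[partition_for(symbol)] += count
    return dict(totals)

# Hypothetical one-second sample: two hot symbols and one quiet one.
sample = {"AAPL": 900, "SPY": 2500, "ILLQ": 1}
load = ticks_per_partition(sample)
```

Keeping the per-symbol counts alongside the per-partition totals shows whether a lagging partition is slow overall or is being dominated by a single hot symbol.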
Conflation: don't send every tick
SPY can tick thousands of times per second. Sending every update to retail WebSocket clients is unnecessary and saturates connections. The gateway conflates instead: it keeps the latest value and a "dirty" flag per symbol per subscriber, and at each delivery interval (e.g., 250ms) it flushes the latest value for every dirty symbol and clears the flags. A subscriber therefore sees at most 4 updates/second on any symbol regardless of tick rate, which is sufficient for a retail UI. For a symbol ticking 1,000 times per second, that is a 250x reduction in messages.
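The dirty-flag mechanism might look like the sketch below. It assumes something external (a timer firing every 250ms, say) calls `flush()`; the class and method names are illustrative rather than taken from any particular gateway.

```python
class ConflatingSubscriber:
    """Keeps only the latest price per symbol between flushes for one subscriber."""

    def __init__(self) -> None:
        self._latest: dict[str, float] = {}
        self._dirty: set[str] = set()   # symbols updated since the last flush

    def on_tick(self, symbol: str, price: float) -> None:
        # Overwrite rather than queue: intermediate prices are discarded on purpose.
        self._latest[symbol] = price
        self._dirty.add(symbol)

    def flush(self) -> dict[str, float]:
        """Return the latest price of every dirty symbol and clear the flags."""
        updates = {symbol: self._latest[symbol] for symbol in self._dirty}
        self._dirty.clear()
        return updates

sub = ConflatingSubscriber()
for i in range(1000):                  # a burst of 1,000 ticks within one interval
    sub.on_tick("SPY", 500.0 + i * 0.01)
sub.on_tick("AAPL", 189.5)

batch = sub.flush()    # one message per dirty symbol, carrying the last price seen
idle = sub.flush()     # nothing changed since the previous flush
```

Because `on_tick` overwrites in place, memory per subscriber is bounded by the number of subscribed symbols, not by the tick rate.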
Snapshot on connect and reconnect
A newly connected client must receive current prices before ticks start arriving. The snapshot store (Redis hash: symbol → {price, timestamp, volume}) is updated by a separate consumer on every tick. On subscribe, the gateway does a Redis HMGET for all requested symbols and pushes the snapshot, then starts the live subscription. Without the snapshot, a client subscribing to an illiquid symbol could wait minutes for the first tick.
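The snapshot-then-subscribe order can be sketched with an in-memory stand-in for the Redis hash. `SnapshotStore`, `Gateway`, and the message shapes are hypothetical; a real gateway would read from Redis and write to WebSocket connections instead of a list.

```python
class SnapshotStore:
    """In-memory stand-in for the Redis hash: symbol -> {price, timestamp, volume}."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def update(self, symbol: str, price: float, timestamp: int, volume: int) -> None:
        self._data[symbol] = {"price": price, "timestamp": timestamp, "volume": volume}

    def get_many(self, symbols: list[str]) -> dict[str, dict]:
        # Symbols that have never ticked are simply absent from the result.
        return {s: self._data[s] for s in symbols if s in self._data}

class Gateway:
    def __init__(self, store: SnapshotStore) -> None:
        self.store = store
        self.subscriptions: dict[str, set[str]] = {}
        self.outbox: dict[str, list[dict]] = {}   # stands in for the WebSocket send path

    def subscribe(self, client_id: str, symbols: list[str]) -> None:
        # 1) Push the current prices first, 2) then register for live ticks.
        snapshot = self.store.get_many(symbols)
        self.outbox.setdefault(client_id, []).append({"type": "snapshot", "prices": snapshot})
        self.subscriptions.setdefault(client_id, set()).update(symbols)

    def on_tick(self, symbol: str, price: float) -> None:
        for client_id, symbols in self.subscriptions.items():
            if symbol in symbols:
                self.outbox[client_id].append({"type": "tick", "symbol": symbol, "price": price})

store = SnapshotStore()
store.update("ILLQ", 3.20, timestamp=1, volume=10)   # last trade happened long ago
gateway = Gateway(store)
gateway.subscribe("client-1", ["ILLQ", "NEWCO"])
gateway.on_tick("ILLQ", 3.25)
```

The client sees a price for ILLQ immediately, even if the next live tick is minutes away. A production version would also need to handle a tick that lands between the snapshot read and the subscription, for example by comparing timestamps or sequence numbers.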
Connection management and backpressure
A slow WebSocket client (overloaded browser tab, poor mobile network) accumulates unread messages in the gateway's send buffer. Per-client send buffer caps drop oldest messages when full — for a price feed, the consumer losing old prices is fine; they need only the latest. Track per-connection drop rates as a signal; persistent drops mean the delivery interval needs to widen for that client or the connection should be shed.
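A drop-oldest send buffer with a drop-rate signal can be sketched with a bounded deque. The capacity and the `SendBuffer` name are illustrative assumptions; the point is that a full buffer evicts the oldest message and records that it did so.

```python
from collections import deque

class SendBuffer:
    """Per-client bounded buffer that drops the oldest message when full."""

    def __init__(self, capacity: int) -> None:
        self._queue: deque = deque(maxlen=capacity)
        self.enqueued = 0
        self.dropped = 0

    def push(self, message) -> None:
        if len(self._queue) == self._queue.maxlen:
            self.dropped += 1          # append below will evict the oldest entry
        self._queue.append(message)
        self.enqueued += 1

    def drain(self, max_messages: int) -> list:
        """Take up to max_messages in order, as the socket becomes writable."""
        out = []
        while self._queue and len(out) < max_messages:
            out.append(self._queue.popleft())
        return out

    def drop_rate(self) -> float:
        return self.dropped / self.enqueued if self.enqueued else 0.0

buf = SendBuffer(capacity=3)
for price in [101, 102, 103, 104, 105]:   # the client is not reading during this burst
    buf.push(price)
delivered = buf.drain(10)                  # only the three newest prices remain
```

A gateway could sample `drop_rate()` periodically and widen the delivery interval, or close the connection, once it stays above a chosen threshold.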
What breaks at scale
Market open at 9:30am ET is the hardest operational moment: subscribers reconnect en masse (for example, while recovering from an overnight outage), ticks spike 10x above baseline, and the snapshot store takes a thundering-herd read. Stagger reconnection with jitter on the client side and pre-warm the snapshot store before open. The second failure mode is out-of-order ticks: ticks arriving late from an exchange normalizer that lost and regained connectivity must be validated against the last seen sequence number and dropped if stale, not reordered. Re-inserting an old tick after newer ones have been delivered shows clients a price movement that never happened.
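Both mitigations are small enough to sketch. The first function is a client-side exponential backoff with full jitter; the base and cap values are assumptions. The second is a gate that drops any tick whose sequence number is not newer than the last one accepted for that symbol.

```python
import random

def reconnect_delay(attempt: int, base: float = 0.5, cap: float = 30.0, rng=random) -> float:
    """Seconds to wait before reconnect attempt N: uniform in [0, min(cap, base * 2**N)]."""
    return rng.uniform(0.0, min(cap, base * (2 ** attempt)))

class SequenceGate:
    """Accepts a tick only if its sequence number advances the per-symbol high-water mark."""

    def __init__(self) -> None:
        self._last_seq: dict[str, int] = {}

    def accept(self, symbol: str, seq: int) -> bool:
        if seq <= self._last_seq.get(symbol, -1):
            return False               # stale or duplicate: drop, never reorder
        self._last_seq[symbol] = seq
        return True

gate = SequenceGate()
results = [
    gate.accept("AAPL", 5),   # first tick seen for AAPL
    gate.accept("AAPL", 7),   # newer, accepted (6 was missed)
    gate.accept("AAPL", 6),   # arrives late after a reconnect, dropped
    gate.accept("MSFT", 1),   # sequence numbers are tracked per symbol
]
```

Spreading reconnects over a random window turns one spike on the snapshot store into a ramp, and the gate keeps a late tick from overwriting a newer price.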
In production
Bloomberg's B-PIPE and Refinitiv Elektron are the production benchmarks: they deliver tens of millions of updates/day to institutional clients via multicast within a co-location zone and unicast WebSocket to remote clients. Robinhood's retail feed uses Kafka topics per symbol group feeding a fleet of WebSocket gateways; conflation happens at the gateway, which holds a per-symbol "latest price" cell and flushes it to subscribers on a timer. The real challenge is snapshot on connect: a new subscriber needs the current price for all subscribed symbols instantly, before the next tick arrives. This requires a snapshot store (Redis hash of symbol → last price) separate from the live pub/sub stream. Without it, subscribers start blind and may see no price at all for illiquid symbols that rarely tick.
Common mistakes
- Sending every tick to every client
- Computing each update per client instead of once per symbol and fanning it out
- Letting TCP send buffers bloat because slow clients receive unconflated ticks