Flash Sale / Limited Inventory
Sell N limited items to a massive simultaneous crowd without overselling.
Requirements
Functional
- Decrement stock atomically
- Queue/waiting room
- Fairness
- Checkout
Non-functional
- No oversell
- Survive 100×+ spike
Scale
Millions hitting one SKU at once
The approach
Pre-load stock into an atomic counter (Redis DECR) as the source of truth for "got one"; a waiting room/queue throttles entry; confirmed reservations flow to the DB for checkout; reject fast when sold out.
Key components
Waiting room → atomic stock counter (Redis) → order queue → DB
Numbers that matter
- Redis DECR on a single key handles ~100k–500k ops/sec on a single instance — enough for most flash sales; beyond that, you shard counters by item variant.
- P99 checkout latency target is ~500ms–1s for the happy path; the waiting room must enforce this by metering entry to match that throughput.
- A short reservation TTL of 5–10 minutes is a common choice (holds of this order are typical of ticketing and large retail checkouts such as Ticketmaster and Amazon): long enough to complete checkout, short enough to recycle abandoned holds quickly.
- Overselling (confirming an item that does not exist) costs roughly 10–100× more to resolve than underselling (telling someone it's sold out when 1 remains), so tune the counter conservatively.
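The link between checkout latency and admission rate can be sketched with Little's law (concurrency = throughput × latency). The slot count and crowd size below are hypothetical inputs chosen for illustration, not figures from a real sale.

```python
def admission_rate(concurrent_slots: int, latency_s: float) -> float:
    """Sustainable admissions per second, from Little's law: L = lambda * W."""
    return concurrent_slots / latency_s


def drain_time_s(crowd: int, rate_per_s: float) -> float:
    """Seconds needed to admit the whole crowd at a fixed admission rate."""
    return crowd / rate_per_s


# Hypothetical: checkout serves 500 requests concurrently at 0.5 s each.
rate = admission_rate(concurrent_slots=500, latency_s=0.5)  # 1000.0 per second
wait = drain_time_s(crowd=1_000_000, rate_per_s=rate)       # 1000.0 seconds
```

Under these assumptions the last person in a million-user queue waits about 17 minutes, which is why the waiting room page itself has to be cheap to serve.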
Senior deep-dive
The atomic counter in Redis is the inventory source of truth — the DB is for confirmed orders, not for answering 'is stock available'; mixing the two is how you oversell.
A virtual waiting room (queue/token bucket) is the architectural keystone: without it, a 100k-user spike hits your checkout stack directly and nothing survives — the queue is not a UX nicety, it's load shedding.
Reject fast at the edge: a sold-out flag in a CDN edge cache or API gateway turns a database-hammering avalanche into a sub-millisecond 'sold out' response for the long tail of too-late requests.
Atomic counter: the single right answer for inventory
SELECT + UPDATE under a lock is how textbooks describe it; on a single hot row it's also the path to lock contention, timeouts, and deadlocks at 10k RPS. Redis DECR (or DECRBY) is atomic by design: Redis executes commands one at a time, so all operations on the key are serialized. A bare DECR can still push the counter below zero, so wrap the DECR + threshold check in a Lua script to make them a single atomic operation: decrement, check whether the result is ≥ 0, and return success or roll back the decrement. This guarantees no oversell without any locking on the application side.
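A minimal in-memory sketch of the decrement, check, and rollback logic. This is not Redis code: the `StockCounter` class and `run_sale` helper are hypothetical, and a `threading.Lock` stands in for the one-command-at-a-time execution that a Redis Lua script would provide.

```python
import threading


class StockCounter:
    """In-memory stand-in for a Redis counter guarded by a Lua script."""

    def __init__(self, stock: int) -> None:
        self._stock = stock
        # The lock models Redis running one script at a time.
        self._lock = threading.Lock()

    def try_reserve(self, qty: int = 1) -> bool:
        with self._lock:
            self._stock -= qty          # the DECRBY step
            if self._stock >= 0:
                return True             # reservation granted
            self._stock += qty          # roll back: never stay below zero
            return False

    def release(self, qty: int = 1) -> None:
        """Compensating action for a failed or abandoned checkout."""
        with self._lock:
            self._stock += qty

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._stock


def run_sale(stock: int, buyers: int) -> tuple[int, int]:
    """Fire `buyers` concurrent attempts; return (units sold, units left)."""
    counter = StockCounter(stock)
    wins: list[int] = []

    def buyer() -> None:
        if counter.try_reserve():
            wins.append(1)  # list.append is thread-safe in CPython

    threads = [threading.Thread(target=buyer) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(wins), counter.remaining


sold, left = run_sale(stock=10, buyers=200)
```

With 200 buyers racing for 10 units, exactly 10 reservations succeed and the counter ends at zero, regardless of thread scheduling.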
Virtual waiting room: the real load-shedding mechanism
The waiting room is primarily about protecting backend capacity; fairness is a secondary benefit. Issue a signed, time-bounded waiting-room token to every arriving user. A rate-limiter admits N tokens per second to the checkout flow, where N is chosen to match the RPS your checkout stack can handle at target latency. The waiting room itself must be lightweight (static HTML + a polling endpoint): it must stay up even when everything behind it is at capacity. Token bucket or leaky bucket at the admission point is the correct primitive.
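A sketch of the two pieces described above: an HMAC-signed, time-bounded token and a token-bucket admission check. The `SECRET` value, the token layout, and all function names are assumptions made for illustration. Time is passed in explicitly so the behaviour is deterministic.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical key; a real one would come from a secret store


def issue_token(user_id: str, expires_at: int) -> str:
    """Return 'user_id:expires_at:signature' signed with HMAC-SHA256."""
    payload = f"{user_id}:{expires_at}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"


def verify_token(token: str, now: int) -> bool:
    """Accept only tokens with a valid signature that have not expired."""
    try:
        user_id, expires_at, sig = token.rsplit(":", 2)
    except ValueError:
        return False
    payload = f"{user_id}:{expires_at}"
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    return now < int(expires_at)


class TokenBucket:
    """Admits at most `rate_per_s` users per second, with bursts up to `burst`."""

    def __init__(self, rate_per_s: float, burst: int, now: float = 0.0) -> None:
        self.rate = rate_per_s
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = now

    def try_admit(self, now: float) -> bool:
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_s=2, burst=2)
token = issue_token("user-1", expires_at=100)
admitted = verify_token(token, now=50) and bucket.try_admit(now=0.0)
```

A user who fails `try_admit` keeps polling with the same token; only the admission point needs to know the current rate.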
Sold-out propagation: kill the tail load immediately
When the counter hits zero, publish a sold-out event to a fast propagation channel (Redis pub/sub or an edge cache purge). API gateway and CDN edge rules cache the sold-out state with a short TTL (5–30 seconds) and return a pre-baked 'sold out' response without hitting origin. This single optimization collapses the residual load from millions of hopeful retries from a backend problem into a CDN-layer concern. Without it, sold-out traffic can be worse than in-stock traffic.
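The effect of a short-TTL sold-out cache can be sketched as follows. `SoldOutGate` and the `fetch_sold_out` callable are hypothetical stand-ins for an edge rule and an origin lookup; time is passed in explicitly to keep the example deterministic.

```python
from typing import Callable, Optional


class SoldOutGate:
    """Edge-side cache of the sold-out flag; origin is asked only when stale."""

    def __init__(self, fetch_sold_out: Callable[[], bool], ttl_s: float) -> None:
        self._fetch = fetch_sold_out      # hypothetical origin lookup
        self._ttl = ttl_s
        self._cached: Optional[bool] = None
        self._expires = 0.0
        self.origin_calls = 0             # counts how often origin was hit

    def is_sold_out(self, now: float) -> bool:
        if self._cached is None or now >= self._expires:
            self._cached = self._fetch()
            self._expires = now + self._ttl
            self.origin_calls += 1
        return self._cached


gate = SoldOutGate(fetch_sold_out=lambda: True, ttl_s=10.0)
answers = [gate.is_sold_out(now=t) for t in (0.0, 1.0, 5.0, 9.9)]
```

Four requests inside the TTL window cost one origin call. A stale "in stock" answer is harmless in this design because the atomic counter still rejects the reservation; only the sold-out answer needs to be fast.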
Reservation expiry: TTL is the correctness mechanism
A user who abandons checkout must release their reservation. A short TTL on the reservation record (stored in Redis or a fast DB) plus a background sweeper that increments the counter on expiry is cleaner than relying on explicit cancellation. The failure mode: the sweeper falls behind under load, leaving reserved-but-expired inventory unavailable. Solution: run multiple sweeper workers and keep reservations in a sorted set (ZSET scored by expiry timestamp), so fetching the m expired reservations is an O(log n + m) range query. Each worker must claim a reservation atomically (for example, only the worker whose ZREM actually removes the member increments the counter); otherwise two sweepers can release the same unit twice.
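A minimal in-memory sketch of TTL-based release. The `Reservations` class is hypothetical, and a plain dict stands in for the sorted set, so the expiry scan here is linear rather than a ZSET range query. The part worth noting is the claim step: stock is returned only by whoever actually removes the reservation.

```python
class Reservations:
    """Holds with expiry; a dict stands in for a ZSET scored by expiry time."""

    def __init__(self, stock: int) -> None:
        self.stock = stock
        self._expiry: dict[str, float] = {}  # reservation id -> expiry timestamp

    def reserve(self, rid: str, now: float, ttl_s: float) -> bool:
        if self.stock <= 0:
            return False
        self.stock -= 1
        self._expiry[rid] = now + ttl_s
        return True

    def confirm(self, rid: str) -> bool:
        """Checkout completed: drop the hold without returning stock."""
        return self._expiry.pop(rid, None) is not None

    def sweep(self, now: float) -> int:
        """Release expired holds; returns how many units went back to stock."""
        expired = [rid for rid, t in self._expiry.items() if t <= now]
        released = 0
        for rid in expired:
            # Claim step: only a successful removal may return the unit,
            # mirroring a check on the removal count in a real store.
            if self._expiry.pop(rid, None) is not None:
                self.stock += 1
                released += 1
        return released


r = Reservations(stock=2)
r.reserve("a", now=0, ttl_s=300)
r.reserve("b", now=10, ttl_s=300)
r.confirm("a")                  # "a" paid, so its unit is gone for good
released = r.sweep(now=310)     # "b" expired at 310 and is released
```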
Consistency between Redis counter and DB orders
Redis is the fast gate; the DB is the authoritative record of confirmed orders. After a successful DECR (reservation granted), the user proceeds to checkout; on payment confirmation, write the order to the DB. If payment fails, INCR the counter to release the reservation; this is the compensating action. The danger: a crash between payment success and the order write leaves an orphaned payment, a paid user with no order. The fix: write the order in a pending state first, then process payment and mark it confirmed, or use an idempotency key to detect and recover the order on retry.
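A sketch of the recovery path, assuming the order is recorded before payment and the payment call is idempotent on the same key. `place_order`, `FakePayments`, and the `crash_after_charge` flag are hypothetical names used only to simulate the failure.

```python
class FakePayments:
    """Stand-in payment provider that is idempotent per key (an assumption)."""

    def __init__(self) -> None:
        self.charged: set[str] = set()

    def charge(self, key: str) -> bool:
        self.charged.add(key)  # a repeat with the same key does not charge twice
        return True


def place_order(orders: dict, key: str, charge, crash_after_charge: bool = False) -> str:
    """Record the order as pending, charge, then confirm; safe to retry."""
    if orders.get(key) == "confirmed":
        return "confirmed"               # retry of an already completed order
    orders[key] = "pending"              # durable record exists before money moves
    paid = charge(key)
    if crash_after_charge:
        raise RuntimeError("simulated crash between payment and order update")
    orders[key] = "confirmed" if paid else "failed"
    return orders[key]


orders: dict[str, str] = {}
payments = FakePayments()

try:
    place_order(orders, "order-1", payments.charge, crash_after_charge=True)
except RuntimeError:
    pass                                 # the pending row survives the crash

status_after_crash = orders["order-1"]
status_after_retry = place_order(orders, "order-1", payments.charge)
```

After the simulated crash the order is still visible as pending, so a retry (or a reconciliation job) with the same key can finish it instead of losing track of the payment. On a "failed" result the caller would release the reservation with the compensating increment.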
What breaks at scale
Counter sharding becomes necessary when a single Redis key for a hot SKU saturates one CPU core's throughput (~500k ops/sec). Shard the counter into N keys (counter:item:0 through counter:item:N-1) placed on different Redis instances, split the stock across them, use consistent hashing to route requests, and sum shards only for display purposes. As long as each shard refuses to go below zero, sharding cannot oversell: if shard 0 has 1 unit left and shard 1 has 1 unit left, two users decrementing different shards both succeed, and that is 2 sales on 2 remaining units. The real edge case is the opposite one: near zero, a request routed to an empty shard is told 'sold out' while other shards still hold stock. Fall back to the remaining shards before rejecting, or consolidate the last units into a single key once the total runs low.
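A minimal sketch of a sharded counter, assuming each shard refuses to go below zero and a request that lands on an empty shard probes the others before rejecting. `ShardedCounter` is hypothetical, a list of integers stands in for N Redis keys, and CRC32 modulo N is used as a simple stand-in for the routing function.

```python
import zlib


class ShardedCounter:
    """Stock split across N shards; a list stands in for N Redis keys."""

    def __init__(self, total: int, shards: int) -> None:
        base, extra = divmod(total, shards)
        # Spread the stock as evenly as possible across shards.
        self.shards = [base + (1 if i < extra else 0) for i in range(shards)]

    def try_reserve(self, user_id: str) -> bool:
        n = len(self.shards)
        start = zlib.crc32(user_id.encode()) % n   # stable routing per user
        for step in range(n):
            i = (start + step) % n
            # In a real store this would be one atomic decrement-and-check per shard.
            if self.shards[i] > 0:
                self.shards[i] -= 1
                return True
        return False                                # every shard is empty

    def remaining(self) -> int:
        """Sum of shards; for display only, not for the sell decision."""
        return sum(self.shards)


counter = ShardedCounter(total=5, shards=4)
results = [counter.try_reserve(f"user-{i}") for i in range(8)]
```

With the fallback, a request succeeds whenever any shard still has stock, so the first five buyers get a unit and the last three are rejected. The cost is up to N round trips per request near zero, which is acceptable because that phase is short.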
In production
Flash sales such as Amazon Lightning Deals and Nike SNKRS drops are the canonical workloads for this design: a pre-loaded Redis counter with a Lua decrement-and-check script as the reservation gate, backed by a queue that smooths the thundering herd into a metered checkout flow. Virtual waiting rooms of the kind Ticketmaster runs issue waiting-room tokens via a separate service and admit users in batches timed to checkout capacity. The real engineering challenge is the transition from 'in stock' to 'sold out': that boundary is a high-contention moment where thousands of requests are simultaneously decrementing the counter toward zero. Lua scripts ensure atomicity, but every key a script touches must live on a single Redis shard (in Redis Cluster, the same hash slot), which makes key placement critical for multi-SKU sales.
Common mistakes
- Read-stock-then-write (race → oversell)
- Letting the full crowd hit the DB
- No fast "sold out" short-circuit
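The first mistake can be illustrated with a deterministic worst-case interleaving, in which every buyer reads the stock before any buyer writes. `racy_sale` is a hypothetical helper that replays that schedule; no real threads are involved.

```python
def racy_sale(stock: int, buyers: int) -> tuple[int, int]:
    """Replay read-then-write where all reads happen before any write."""
    seen = [stock for _ in range(buyers)]   # step 1: every buyer reads the same value
    sold = 0
    for value in seen:                      # step 2: each writes from its stale read
        if value > 0:
            stock = value - 1
            sold += 1
    return sold, stock


sold_units, stock_left = racy_sale(stock=1, buyers=3)
```

Three buyers each see 1 unit, each writes 0, and all three are told they bought it: 3 sales against 1 unit of stock, while the stored value looks perfectly healthy at zero.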