System Design Library

Inventory Management

Track stock across warehouses with no oversell under high concurrency.


Requirements

Functional

Stock levels per SKU/location
Reserve/release
Replenishment
Multi-warehouse

Non-functional

No oversell (consistent)
High throughput

Scale

Millions of SKUs, high order rate

The approach

Authoritative stock counts with atomic decrement on reserve (DB row lock or atomic counter); time-boxed reservations released on timeout; sharded by SKU/warehouse; an event log feeds analytics/replenishment.

Key components

Order → reservation service (atomic) → stock store · event log

Numbers that matter

A single Redis DECR is roughly 0.1–1 ms round-trip within a datacenter, fast enough to sit on the purchase hot path without a queue.
Reservation TTLs of 5–15 minutes are common; anything longer keeps abandoned holds out of sellable stock for too long during flash sales.
A retailer at Amazon's scale is often modeled as ~1 billion SKU-warehouse pairs; sharding by (warehouse_id, sku_id) keeps every write for a given pair on a single shard, so each reserve is a single-shard operation.
Batch reconciliation jobs that cross-check event logs vs counters run every 5–15 minutes to catch drift from crashes or split-brain.

Senior deep-dive

Atomic conditional decrement is the core correctness primitive; every other consistency trick is built on top of it.

Time-boxed reservations are mandatory: without a TTL, reservations abandoned by crashed clients hold stock indefinitely, so sellable inventory looks sold out (phantom depletion) until someone cleans it up by hand.

Sharding by (warehouse, SKU) isolates hot SKUs but turns cross-warehouse availability queries into scatter-gather; pre-aggregate or accept staleness.

Atomic decrement, not SELECT + UPDATE

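The classic bug is `SELECT qty FROM stock WHERE sku=X` followed by `UPDATE stock SET qty=qty-1 WHERE sku=X`: two requests can both read the last unit before either writes, so both succeed and you oversell. The correct primitive is a single atomic conditional decrement. In SQL that is `UPDATE stock SET qty=qty-1 WHERE sku=X AND qty>=1` (then check the affected-row count), optionally backed by a CHECK constraint (`qty >= 0`) so that a decrement that would reach -1 is rejected. A plain Redis DECR is atomic but will happily go negative, so either run the check and the decrement inside a Lua script or treat a negative result as a failure and compensate with INCR. Optimistic locking (CAS on a version number) also works, but it adds a read-then-write round trip and retries under contention, so it suits lower throughput.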
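A minimal sketch of the conditional-decrement pattern, using Python's built-in `sqlite3` as a stand-in for the real stock store. The table layout and the `try_reserve` helper are assumptions for illustration, not part of the original design.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    """Create an in-memory stock table guarded by a non-negative CHECK."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock ("
        " sku TEXT PRIMARY KEY,"
        " qty INTEGER NOT NULL CHECK (qty >= 0))"
    )
    return conn

def try_reserve(conn: sqlite3.Connection, sku: str, n: int = 1) -> bool:
    """Decrement stock by n in ONE statement; False means not enough stock."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            # The qty >= ? predicate makes check-and-decrement a single step,
            # so there is no window between reading and writing.
            "UPDATE stock SET qty = qty - ? WHERE sku = ? AND qty >= ?",
            (n, sku, n),
        )
    return cur.rowcount == 1

conn = make_db()
conn.execute("INSERT INTO stock VALUES ('sku-1', 2)")
conn.commit()

# Three attempts against two units: the third must be refused.
results = [try_reserve(conn, "sku-1") for _ in range(3)]
remaining = conn.execute(
    "SELECT qty FROM stock WHERE sku = 'sku-1'"
).fetchone()[0]
```

The caller learns the outcome from the affected-row count alone, so no separate read is needed before the write.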

Reservation TTL: the escape valve for crashed clients

A bare atomic decrement without a reservation record means a user who crashes mid-checkout holds inventory forever. Time-boxed reservations (a record with `expires_at`) let a sweeper job restore quantity after the TTL. The tricky part: the sweeper must delete the reservation and `INCR` the counter as one atomic unit. Otherwise a checkout that confirms the same reservation at the same moment, or a second sweeper run, can leave the units both sold and restored, or restored twice. A Lua script or DB transaction is mandatory here.
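One way to sketch the reserve-then-sweep flow is with a single DB transaction per operation, again using `sqlite3` as a stand-in. The schema, the function names, and the explicit `now` parameter (passed in so the example is deterministic) are all illustrative assumptions.

```python
import sqlite3

def setup() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stock (
            sku TEXT PRIMARY KEY,
            qty INTEGER NOT NULL CHECK (qty >= 0));
        CREATE TABLE reservations (
            id TEXT PRIMARY KEY, sku TEXT NOT NULL,
            qty INTEGER NOT NULL, expires_at REAL NOT NULL);
        """
    )
    return conn

def reserve(conn, res_id: str, sku: str, qty: int,
            now: float, ttl_s: float = 600.0) -> bool:
    """Decrement stock and record the hold in the same transaction."""
    with conn:
        cur = conn.execute(
            "UPDATE stock SET qty = qty - ? WHERE sku = ? AND qty >= ?",
            (qty, sku, qty),
        )
        if cur.rowcount != 1:
            return False  # nothing was changed, nothing to record
        conn.execute(
            "INSERT INTO reservations VALUES (?, ?, ?, ?)",
            (res_id, sku, qty, now + ttl_s),
        )
    return True

def sweep_expired(conn, now: float) -> int:
    """Release expired holds; delete and restore commit together."""
    released = 0
    with conn:
        rows = conn.execute(
            "SELECT id, sku, qty FROM reservations WHERE expires_at <= ?",
            (now,),
        ).fetchall()
        for res_id, sku, qty in rows:
            # Delete first and restore only if the row was still there, so a
            # hold that was already confirmed or swept is never restored twice.
            cur = conn.execute(
                "DELETE FROM reservations WHERE id = ?", (res_id,)
            )
            if cur.rowcount == 1:
                conn.execute(
                    "UPDATE stock SET qty = qty + ? WHERE sku = ?", (qty, sku)
                )
                released += 1
    return released

conn = setup()
conn.execute("INSERT INTO stock VALUES ('sku-1', 5)")
conn.commit()
reserve(conn, "r1", "sku-1", 2, now=0.0)    # expires at t=600
reserve(conn, "r2", "sku-1", 1, now=500.0)  # expires at t=1100
first_sweep = sweep_expired(conn, now=700.0)
second_sweep = sweep_expired(conn, now=700.0)
qty_after = conn.execute(
    "SELECT qty FROM stock WHERE sku = 'sku-1'"
).fetchone()[0]
```

Only `r1` has expired at t=700, so the first sweep restores its two units and the second sweep finds nothing left to release.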

Sharding strategy for hot SKUs

Sharding by `(warehouse_id % N, sku_id % M)` distributes writes but creates cross-shard scatter-gather for queries like "show me total available stock across all warehouses." Counter aggregation tables pre-roll up per-SKU totals asynchronously — they're eventually consistent by design. For flash sales on a single viral SKU, shard the counter itself (N sub-counters summed on read) to spread the write load across Redis nodes.
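The sub-counter idea can be sketched in memory as follows. Each list slot stands in for a separate Redis key on a different node, and the per-slot lock stands in for that key's atomicity; the `ShardedCounter` class is a hypothetical illustration, not a client for any real store.

```python
import random
import threading

class ShardedCounter:
    """One logical stock count split across several sub-counters."""

    def __init__(self, total: int, shards: int) -> None:
        # Spread the initial stock as evenly as possible across sub-counters.
        base, extra = divmod(total, shards)
        self._counts = [base + (1 if i < extra else 0) for i in range(shards)]
        self._locks = [threading.Lock() for _ in range(shards)]

    def try_decrement(self) -> bool:
        """Take one unit from a random sub-counter, falling back to others."""
        n = len(self._counts)
        start = random.randrange(n)  # random start spreads the write load
        for offset in range(n):
            i = (start + offset) % n
            with self._locks[i]:
                if self._counts[i] > 0:
                    self._counts[i] -= 1
                    return True
        return False  # every sub-counter was empty when we looked

    def available(self) -> int:
        """Sum on read; may be slightly stale under concurrent writes."""
        return sum(self._counts)

counter = ShardedCounter(total=10, shards=4)
successes = sum(counter.try_decrement() for _ in range(15))
left = counter.available()
```

The fallback loop is the price of sharding: a buyer whose first sub-counter is empty has to probe the others before being told the item is sold out.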

Multi-warehouse saga: the hard cross-shard case

A cart containing items from multiple fulfillment centers requires reserving from every warehouse or rolling back all of them. Distributed two-phase commit (2PC) works but blocks on coordinator failure. In practice, a Saga with compensating releases is preferred: reserve warehouse A, then B, then C; if C fails, issue a compensating `INCR` to A and B. The window between compensate and confirm is where double-sell bugs hide, so put an idempotency key on every reservation op so that retries and replays are applied at most once.
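A rough in-memory sketch of the saga with compensating releases and idempotency keys. The `Warehouse` class, the key format, and `reserve_cart` are hypothetical names chosen for illustration; a real system would persist the keys and call remote shards.

```python
from dataclasses import dataclass, field

class OutOfStock(Exception):
    pass

@dataclass
class Warehouse:
    name: str
    stock: dict[str, int]
    # idempotency key -> quantity currently held under that key
    applied: dict[str, int] = field(default_factory=dict)

    def reserve(self, key: str, sku: str, qty: int) -> None:
        if key in self.applied:
            return  # replayed request: already applied, do nothing
        if self.stock.get(sku, 0) < qty:
            raise OutOfStock(f"{self.name}: {sku}")
        self.stock[sku] -= qty
        self.applied[key] = qty

    def release(self, key: str, sku: str) -> None:
        qty = self.applied.pop(key, None)
        if qty is not None:  # releasing twice is a no-op
            self.stock[sku] += qty

def reserve_cart(order_id: str, lines) -> bool:
    """lines is a list of (warehouse, sku, qty); all succeed or none hold."""
    done = []
    for wh, sku, qty in lines:
        key = f"{order_id}:{wh.name}:{sku}"
        try:
            wh.reserve(key, sku, qty)
        except OutOfStock:
            # Compensate in reverse order for every step that succeeded.
            for prev_wh, prev_key, prev_sku in reversed(done):
                prev_wh.release(prev_key, prev_sku)
            return False
        done.append((wh, key, sku))
    return True

a = Warehouse("A", {"x": 1})
b = Warehouse("B", {"y": 1})
c = Warehouse("C", {"z": 0})

failed = reserve_cart("o1", [(a, "x", 1), (b, "y", 1), (c, "z", 1)])
after_rollback = (a.stock["x"], b.stock["y"])

ok = reserve_cart("o2", [(a, "x", 1), (b, "y", 1)])
replayed = reserve_cart("o2", [(a, "x", 1), (b, "y", 1)])  # same keys
after_replay = (a.stock["x"], b.stock["y"])
```

This sketch forgets a key once it is released, so a late replay of a rolled-back reserve would hold stock again; a production version would likely keep a terminal state per key to rule that out.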

Event log as the reconciliation source of truth

The live counter is the fast path; the append-only event log (every reservation, confirmation, cancel) is the audit path. A background reconciliation job replays the log and compares the result to the counter; a discrepancy means a bug, not a business rule. The same log provides the kind of audit trail that compliance reviews such as SOX look for, and it is how you recover after a Redis failover that lost the last few seconds of writes.
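A small sketch of the replay-and-compare step. The event shape `(kind, sku, qty)` and the `restock` event are assumptions added for illustration; the source only names reservation, confirmation, and cancel events.

```python
from collections import defaultdict

# Effect of each event kind on the available-stock counter (assumed model):
# a reserve already took the units out, so confirming changes nothing.
DELTAS = {"restock": +1, "reserve": -1, "cancel": +1, "confirm": 0}

def replay(events) -> dict:
    """Rebuild the expected counter per SKU from the append-only log."""
    expected = defaultdict(int)
    for kind, sku, qty in events:
        expected[sku] += DELTAS[kind] * qty
    return dict(expected)

def find_drift(events, counters: dict) -> dict:
    """Return {sku: (expected, actual)} for every SKU that disagrees."""
    expected = replay(events)
    drift = {}
    for sku in set(expected) | set(counters):
        want, have = expected.get(sku, 0), counters.get(sku, 0)
        if want != have:
            drift[sku] = (want, have)
    return drift

events = [
    ("restock", "sku-1", 10),
    ("reserve", "sku-1", 3),
    ("cancel", "sku-1", 1),
    ("confirm", "sku-1", 2),
    ("restock", "sku-2", 4),
]
live_counters = {"sku-1": 7, "sku-2": 4}  # sku-1 lost one unit somewhere
drift = find_drift(events, live_counters)
```

Here the log says `sku-1` should have 8 units available while the live counter says 7, which is exactly the kind of discrepancy the reconciliation job should flag for investigation.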

What breaks at scale

Hot-SKU thundering herd is the first failure mode: 10k concurrent requests for a single SKU all land on one Redis key, and because Redis executes commands one at a time on a single node, pipelining does not help; requests queue up and the connection pool saturates. Counter sharding (split into N sub-keys, sum on read) is the fix but complicates consistency, because a buyer can hit an empty sub-key while others still hold stock. The second failure is reservation-leak accumulation: if the TTL sweeper falls behind, which is common during traffic spikes, expired reservations pile up and inventory appears artificially depleted until the sweeper catches up. A dedicated sweeper with its own rate limit and backpressure is non-negotiable.

In production

Amazon is commonly described as combining DynamoDB conditional writes (optimistic lock via a version attribute) with pre-allocated reservation records to prevent oversell; the reservation itself is the lock. Shopify is commonly described as routing flash-sale SKUs through a queue plus a Redis atomic counter to serialize demand, then confirming asynchronously. The real challenge is multi-warehouse allocation: when a customer's cart spans three fulfillment centers, you need distributed reservation across three shards with a rollback saga if any shard runs out, and compensating the partial hold within the TTL window is where most bugs live.

Common mistakes

Read-then-write stock (oversell race)
Reservations without timeout (phantom stock)
Global lock on all inventory

Part of System Design Library on SystemLore.