Academy · Reliability & Observability

Rate limiting & load shedding

A buggy client, a scraper, or a traffic spike can exhaust your capacity and take everyone down.

Open the interactive version → diagrams, practice & more

The problem

A buggy client, a scraper, or a traffic spike can exhaust your capacity and take everyone down.

The idea

Cap how much any client can consume, and shed excess load gracefully.

How it works

Cap per-client consumption and shed excess gracefully. Token bucket allows bursts up to the bucket size then steadies to the refill rate; leaky bucket smooths to a constant rate; fixed-window counters are cheap but allow a 2× burst across the window boundary; sliding-window fixes that at more cost. Distributed limiters share counts via an atomic Redis INCR/Lua script (accurate, adds a hop) or approximate with per-node local buckets synced periodically (fast, slightly leaky). Under overload, shed low-priority work and fail fast.

The tradeoff

Limits protect capacity but can reject legitimate bursts — token bucket's burst allowance is the usual compromise. Centralized counting is exact but adds latency and a Redis dependency on the hot path; local approximation is fast but lets clients slightly exceed. A key choice: fail-open (pass traffic if the limiter is down — protects UX, risks overload) vs fail-closed (reject — protects the backend). Return 429 + Retry-After so clients back off.

In the wild

Every public API (Stripe, GitHub, Twitter) rate-limits and returns 429s.

Interview deep dive

Flow

  1. Pick the algorithm (token bucket for bursts, sliding window for smoothness).
  2. Key the limit by IP/user/API-key at the gateway.
  3. Share counts via atomic Redis, or approximate locally.
  4. On overload, shed low-priority load and return 429 + Retry-After.

Watch for

Interviewer trap

Name the algorithm AND the distributed-counting choice, plus fail-open vs fail-closed.

Related Academy

Part of Academy on SystemLore — system design interview prep with 148 deep topics, interactive diagrams, and a practice game. Practice this one →