What is Rate Limiting?

Definition

Rate limiting caps how many requests a client can make within a given time window, protecting a service from overload or abuse.

How it works

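Rate limiting is typically implemented with a token bucket (which allows short bursts) or a sliding-window counter. The counter state usually lives in a shared store such as Redis so that the limit holds across many servers. When a client exceeds its limit, the service returns HTTP 429 (Too Many Requests) with a Retry-After header that tells the client how long to wait before trying again. This protects against scrapers, runaway clients and thundering herds. The limit is usually enforced at the gateway or load balancer.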
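
As a rough illustration of the token-bucket approach, here is a minimal single-process sketch in Python. It keeps state in memory rather than in a shared store such as Redis, so it would not hold a limit across several servers. The names TokenBucket, try_acquire and handle_request, and the capacity and refill_rate parameters, are assumptions made for this example and do not come from any particular library.

```python
import math
import time


class TokenBucket:
    """In-memory token bucket for one client (illustrative sketch only)."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = float(capacity)        # maximum burst size, in tokens
        self.refill_rate = float(refill_rate)  # tokens added per second
        self.clock = clock                     # injectable so time can be faked
        self.tokens = self.capacity            # start full so an initial burst is allowed
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now

    def try_acquire(self):
        """Return (allowed, retry_after_seconds)."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        # Seconds until one whole token is available, rounded up to an integer.
        wait = (1.0 - self.tokens) / self.refill_rate
        return False, max(1, math.ceil(wait))


def handle_request(bucket):
    """Hypothetical handler: return (status_code, headers) for one request."""
    allowed, retry_after = bucket.try_acquire()
    if allowed:
        return 200, {}
    return 429, {"Retry-After": str(retry_after)}
```

With capacity=2 and refill_rate=1, a client can send two requests back to back, is rejected with 429 on the third, and is allowed again once a second has passed. A multi-server deployment would need the refill and decrement steps to run atomically in the shared store, which this sketch does not attempt.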
