Glossary

Throughput

How much work a system completes per unit time — requests/sec, messages/sec, bytes/sec.

Definition

Throughput is the amount of work a system completes per unit of time, measured in units such as requests/sec, messages/sec, or bytes/sec.

How it works

Throughput is distinct from latency, which is the time a single request takes: a batched, pipelined system can have high throughput and high latency at the same time. You raise throughput with parallelism, batching, and asynchronous processing. Little's Law ties the two together: in steady state, average concurrency ≈ throughput × average latency.
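To make the batching trade-off concrete, here is a minimal sketch of a toy cost model. The model is an assumption for illustration, not a measurement: each batch pays a fixed overhead plus a per-item cost, and every item in a batch completes when the whole batch does. The function name `batch_stats` and the numbers passed to it are hypothetical.

```python
def batch_stats(batch_size: int, overhead_s: float, per_item_s: float) -> tuple[float, float]:
    """Return (throughput in items/sec, latency in seconds) for a single worker.

    Assumed toy model: processing a batch costs a fixed overhead plus a
    per-item cost, and each item's latency is the full batch time.
    Time spent waiting for a batch to fill is ignored.
    """
    batch_time_s = overhead_s + batch_size * per_item_s
    throughput = batch_size / batch_time_s
    latency = batch_time_s
    return throughput, latency


if __name__ == "__main__":
    # Hypothetical costs: 10 ms fixed overhead per batch, 1 ms per item.
    for size in (1, 10, 100):
        tput, lat = batch_stats(size, overhead_s=0.010, per_item_s=0.001)
        print(f"batch={size:>3}  throughput={tput:7.1f} items/s  latency={lat * 1000:6.1f} ms")
```

Under these assumed costs, larger batches amortize the fixed overhead, so throughput climbs toward the 1,000 items/sec ceiling set by the per-item cost, while the latency of each item grows with the batch size.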

Common questions

What is Throughput used for in system design?

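Throughput is a capacity measure: it describes how much load a component can sustain, and parallelism, batching, and asynchronous processing are the levers for raising it when it falls short. Because a batched, pipelined system can have high throughput and high latency at once, the two are tracked as separate targets. Little's Law (concurrency ≈ throughput × latency) then relates them, giving the number of in-flight requests needed to sustain a target throughput at a given latency.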
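Little's Law can be turned into a quick sizing calculation. The sketch below assumes a system in steady state and uses average latency. The function names and the example numbers are hypothetical, chosen only to show the arithmetic.

```python
import math


def required_concurrency(target_throughput_per_s: float, avg_latency_s: float) -> int:
    """In-flight requests needed to sustain a target throughput.

    Little's Law: concurrency = throughput * latency (steady state, averages).
    Rounded up because concurrency slots are whole units.
    """
    return math.ceil(target_throughput_per_s * avg_latency_s)


def max_throughput(concurrency: int, avg_latency_s: float) -> float:
    """Throughput ceiling implied by a fixed concurrency limit."""
    return concurrency / avg_latency_s


if __name__ == "__main__":
    # Hypothetical target: 400 requests/sec at 250 ms average latency.
    print(required_concurrency(400, 0.25))  # 100 in-flight requests
    # Hypothetical limit: 100 workers, each busy 250 ms per request.
    print(max_throughput(100, 0.25))  # 400.0 requests/sec
```

Read the other way, the same relation shows that with concurrency fixed, any increase in average latency lowers the throughput the system can sustain.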

Part of Glossary on SystemLore.