The request/response model
The problem
Everything online is one machine asking another for something. If you don't understand that round trip, nothing else makes sense.
The idea
A client sends a request over a network; a server does work and sends a response back. The whole internet is this, repeated at scale.
How it works
A request crosses several layers, each costing time: DNS resolution, a TCP handshake (1 RTT), a TLS handshake (1 RTT with TLS 1.3, 2 with TLS 1.2), then the HTTP request itself. The server routes the request to handler code that may fan out to a database or other services. Those hops are often serial, so their latencies add up. HTTP keep-alive and connection pooling amortize setup cost across requests; HTTP/2 multiplexes many requests over one connection; HTTP/3 runs over QUIC, which removes the TCP-level head-of-line blocking that stalls every HTTP/2 stream when a packet is lost.
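To make the layer costs concrete, here is a minimal latency budget in Python. It is a sketch under stated assumptions: the `Link` type, both function names, and every number (50 ms RTT, 20 ms DNS lookup, 30 ms of server work) are illustrative, not measurements.

```python
from dataclasses import dataclass


@dataclass
class Link:
    rtt_ms: float         # network round-trip time between client and server
    dns_ms: float         # cost of an uncached DNS lookup (assumed value)
    tls_round_trips: int  # 1 for a TLS 1.3 handshake, 2 for TLS 1.2


def cold_request_ms(link: Link, server_ms: float) -> float:
    """First request on a new connection: DNS + TCP + TLS + HTTP exchange."""
    tcp_ms = link.rtt_ms                          # TCP handshake: 1 RTT
    tls_ms = link.tls_round_trips * link.rtt_ms   # TLS handshake: 1-2 RTTs
    http_ms = link.rtt_ms + server_ms             # request out, response back
    return link.dns_ms + tcp_ms + tls_ms + http_ms


def warm_request_ms(link: Link, server_ms: float) -> float:
    """Later request on a kept-alive connection: only the HTTP exchange."""
    return link.rtt_ms + server_ms


# Illustrative numbers only.
link = Link(rtt_ms=50.0, dns_ms=20.0, tls_round_trips=1)
cold = cold_request_ms(link, server_ms=30.0)  # 20 + 50 + 50 + 50 + 30 = 200
warm = warm_request_ms(link, server_ms=30.0)  # 50 + 30 = 80
print(f"cold: {cold} ms, warm: {warm} ms")
```

With these assumed numbers, connection setup accounts for more than half of the cold request, which is the cost that keep-alive and pooling spread across many requests.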
The tradeoff
Every round trip has a floor set by the speed of light: about 1 ms of round-trip time per 100 km of fiber, and no optimization removes it. The lever you control is the number of serial round trips: collapse N sequential calls into one batched call or a parallel fan-out, and reuse warm connections so you pay the handshake once, not per request.
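The difference between sequential calls and a parallel fan-out can be sketched with `asyncio`. The calls are simulated with `asyncio.sleep`; `fake_call` and the 50 ms delays are stand-ins for real network requests, not part of any particular system.

```python
import asyncio
import time


async def fake_call(delay_s: float) -> float:
    """Stand-in for one network round trip that takes delay_s seconds."""
    await asyncio.sleep(delay_s)
    return delay_s


async def serial(delays: list[float]) -> float:
    """Issue the calls one after another; elapsed time is roughly the sum."""
    start = time.perf_counter()
    for d in delays:
        await fake_call(d)
    return time.perf_counter() - start


async def parallel(delays: list[float]) -> float:
    """Issue independent calls together; elapsed time is roughly the max."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_call(d) for d in delays))
    return time.perf_counter() - start


DELAYS = [0.05, 0.05, 0.05]  # three simulated 50 ms calls
serial_s = asyncio.run(serial(DELAYS))
parallel_s = asyncio.run(parallel(DELAYS))
print(f"serial: {serial_s * 1000:.0f} ms, parallel: {parallel_s * 1000:.0f} ms")
```

The parallel version only applies to calls that do not depend on each other's results; dependent calls stay serial unless the server offers a batched endpoint.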
In the wild
A browser loading this page made dozens of these round trips (HTML, CSS, JS, images), reusing connections and fetching in parallel.
Interview deep dive
Flow
- Resolve DNS, open a TCP connection, complete the TLS handshake.
- Send the HTTP request; the server routes it to handler code.
- The handler may call a database or other services; serial calls add their latencies.
- The response returns over the same connection; keep-alive holds it open so later requests skip the handshake.
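The effect of keep-alive in the last step can be observed with a small local experiment. This sketch uses only the Python standard library and assumes a throwaway server on 127.0.0.1; `CountingHandler` and `count_connections` are hypothetical names. The server counts accepted TCP connections, and the client either reuses one connection or opens a new one per request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

connections_opened = 0   # incremented once per accepted TCP connection
lock = threading.Lock()


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def setup(self):
        # setup() runs once per connection, not once per request.
        global connections_opened
        with lock:
            connections_opened += 1
        super().setup()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(n_requests: int, reuse: bool) -> int:
    """Send n_requests GETs and return how many TCP connections were opened."""
    global connections_opened
    connections_opened = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    conn = None
    try:
        for _ in range(n_requests):
            if conn is None:
                conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body so the socket can be reused
            if not reuse:
                conn.close()
                conn = None
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return connections_opened


reused = count_connections(3, reuse=True)
fresh = count_connections(3, reuse=False)
print(f"with keep-alive: {reused} connection(s); without: {fresh}")
```

Over a real network each of those extra connections would also pay the TCP and TLS handshakes, which is the cost a connection pool avoids.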
Watch for
- Serial dependent calls add latency; parallelize independent ones.
- A cold connection pays DNS+TCP+TLS before any useful work.
- Tail latency (p99) is set by the slowest hop, not the average one; a fan-out waits for its slowest call.
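The tail-latency point can be illustrated with a little arithmetic. Assuming each backend call is independently slow 1% of the time (an assumption chosen for illustration; real calls are often correlated), the chance that a request touching several backends hits at least one slow call grows quickly with the fan-out. `p_any_slow` is a hypothetical helper name.

```python
def p_any_slow(fan_out: int, p_slow: float = 0.01) -> float:
    """Probability that at least one of fan_out independent calls is slow."""
    return 1.0 - (1.0 - p_slow) ** fan_out


for n in (1, 10, 100):
    print(f"fan-out {n:>3}: {p_any_slow(n):.3f}")
# fan-out   1: 0.010
# fan-out  10: 0.096
# fan-out 100: 0.634
```

Under that assumption, a backend's p99 becomes the common case for a request that waits on 100 such calls.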
Interviewer trap
Count the serial round trips in your design and name which you can batch or parallelize.
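One way to do that count is to write the design down as a dependency graph and measure its longest chain. The sketch below uses a hypothetical set of calls (`auth`, `profile`, `orders`, `recommendations`) and assumes the graph has no cycles.

```python
from functools import lru_cache

# Hypothetical design: each call lists the calls whose results it needs first.
DEPENDS_ON = {
    "auth": [],
    "profile": ["auth"],
    "orders": ["auth"],
    "recommendations": ["profile", "orders"],
}


def serial_round_trips(deps: dict[str, list[str]]) -> int:
    """Length of the longest dependency chain, i.e. the unavoidable serial hops."""

    @lru_cache(maxsize=None)
    def depth(call: str) -> int:
        # A call with no dependencies costs one round trip on its own.
        return 1 + max((depth(d) for d in deps[call]), default=0)

    return max(depth(call) for call in deps)


naive = len(DEPENDS_ON)                    # 4 if every call runs back to back
critical = serial_round_trips(DEPENDS_ON)  # 3: auth -> profile/orders -> recommendations
print(f"naive: {naive} round trips, critical path: {critical}")
```

Here `profile` and `orders` can run in parallel because both depend only on `auth`; shortening the chain further would take batching or restructuring the dependencies.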