Design video streaming (YouTube)
The problem
Upload, store, transcode, and stream video to millions globally on every device and network.
The idea
Async transcoding pipeline + object storage + CDN delivery; metadata kept separate from blobs.
How it works
A resumable, chunked upload lands in object storage and enqueues a transcode job. Workers build an adaptive-bitrate (ABR) ladder: the same video at many bitrates/resolutions, segmented into ~2–10s chunks (HLS/DASH) so the player can switch bitrate mid-stream as bandwidth changes. Chunks serve from a CDN with an ISP-embedded edge tier; the origin serves only cache misses, so it stays out of the hot path. Metadata (title, owner, view count) lives in a DB + cache, separate from the blobs. View counts aggregate asynchronously through a queue rather than as a synchronous per-view increment.
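To make the ABR ladder concrete, here is a minimal sketch of what a transcode worker might emit for HLS: a master playlist listing each rendition, plus the segment count for a given duration. The ladder values, the 6s segment length, and the `<name>/index.m3u8` paths are illustrative assumptions, not values from any real service.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int
    height: int
    bitrate_bps: int


# Hypothetical ladder; real services tune these per title and codec.
LADDER = [
    Rendition("240p", 426, 240, 400_000),
    Rendition("480p", 854, 480, 1_200_000),
    Rendition("720p", 1280, 720, 2_800_000),
    Rendition("1080p", 1920, 1080, 5_000_000),
]


def segment_count(duration_s: float, segment_s: float = 6.0) -> int:
    """Number of chunks per rendition; the last chunk may be shorter."""
    return math.ceil(duration_s / segment_s)


def master_playlist(ladder) -> str:
    """Build an HLS master playlist with one variant stream per rendition."""
    lines = ["#EXTM3U"]
    for r in sorted(ladder, key=lambda r: r.bitrate_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bitrate_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        # Assumed layout: each rendition has its own media playlist.
        lines.append(f"{r.name}/index.m3u8")
    return "\n".join(lines) + "\n"
```

The player fetches the master playlist once, then picks a rendition's media playlist chunk by chunk. A 10-minute video at 6s segments yields 100 chunks per rendition, so 400 objects for this four-rung ladder.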
The tradeoff
Transcoding is heavy and async, so a video isn't instantly watchable — you trade immediacy for delivery efficiency. *Storage and egress dominate cost* (one source becomes many renditions × many chunks), which is why hot content is pushed to ISP edges and cold content tiers to cheaper storage. The metadata/blob split lets each scale independently — small hot rows vs huge cold objects.
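A back-of-envelope sketch of why bytes dominate cost. All inputs here (ladder bitrates, duration, view count) are hypothetical and only show the shape of the arithmetic; real renditions also carry audio and container overhead.

```python
def rendition_bytes(bitrate_bps: int, duration_s: int) -> int:
    """Approximate size of one rendition: bits per second x seconds / 8."""
    return bitrate_bps * duration_s // 8


def storage_bytes(ladder_bps, duration_s: int) -> int:
    """One source becomes many renditions, so storage is the sum over the ladder."""
    return sum(rendition_bytes(b, duration_s) for b in ladder_bps)


def egress_bytes(views: int, avg_bitrate_bps: int, avg_watch_s: int) -> int:
    """Every view re-sends the bytes, so egress scales with view count."""
    return views * rendition_bytes(avg_bitrate_bps, avg_watch_s)


# Hypothetical 10-minute video with a four-rung ladder.
ladder = [400_000, 1_200_000, 2_800_000, 5_000_000]
stored = storage_bytes(ladder, 600)               # 705_000_000 bytes (~705 MB)
served = egress_bytes(1_000_000, 2_800_000, 600)  # 210_000_000_000_000 bytes (~210 TB)
```

Under these assumptions, a million full views at 720p move roughly 300,000 times more bytes than the video occupies at rest, which is the case for serving hot content from ISP edges. Storage is paid once per video whether or not anyone watches it, which is the case for tiering cold content.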
In the wild
YouTube, Netflix (which caches video inside ISP networks).
Interview deep dive
Flow
- Resumable chunked upload lands in object storage.
- Transcode workers build an ABR ladder of segmented renditions.
- Chunks serve from CDN → ISP edge; origin stays cold.
- Player adapts bitrate per chunk to current bandwidth.
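The last step of the flow can be sketched as a throughput-based rule: before each chunk, pick the highest rendition that fits within a safety margin of measured bandwidth. This is a deliberately simplified assumption; production players typically also weigh buffer level and smooth their throughput estimates. The `safety` factor of 0.8 is illustrative.

```python
def pick_bitrate(ladder_bps, throughput_bps: float, safety: float = 0.8) -> int:
    """Return the highest ladder bitrate that fits within safety * throughput.

    Falls back to the lowest rung when even that exceeds the budget,
    so playback degrades instead of stalling outright.
    """
    budget = throughput_bps * safety
    fitting = [b for b in sorted(ladder_bps) if b <= budget]
    return fitting[-1] if fitting else min(ladder_bps)


ladder = [400_000, 1_200_000, 2_800_000, 5_000_000]
pick_bitrate(ladder, 4_000_000)   # 2_800_000: the 5 Mbps rung exceeds the 3.2 Mbps budget
pick_bitrate(ladder, 300_000)     # 400_000: nothing fits, so use the lowest rung
```

Because every rendition is cut at the same chunk boundaries, the player can switch rungs between chunks without a visible seek.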
Watch for
- Storage + egress dominate cost — tier cold content, edge hot content.
- Transcoding is async — video isn't watchable on upload.
- Synchronous per-view DB increments don't scale — aggregate via a queue.
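A minimal sketch of queue-based view aggregation, assuming an in-memory `deque` as a stand-in for a durable message queue and a `Counter` as a stand-in for the metadata DB. `ViewAggregator` and its method names are hypothetical.

```python
from collections import Counter, deque


class ViewAggregator:
    def __init__(self):
        self.queue = deque()   # stand-in for a durable message queue
        self.db = Counter()    # stand-in for the metadata DB's view_count column
        self.db_writes = 0     # counts DB writes to show the saving

    def record_view(self, video_id: str) -> None:
        """Hot path: enqueue only, never touch the DB."""
        self.queue.append(video_id)

    def flush(self, max_events: int = 10_000) -> int:
        """Background consumer: drain a batch and write one increment per video."""
        batch = Counter()
        for _ in range(min(max_events, len(self.queue))):
            batch[self.queue.popleft()] += 1
        for video_id, n in batch.items():
            self.db[video_id] += n
            self.db_writes += 1
        return len(batch)


agg = ViewAggregator()
for _ in range(1000):
    agg.record_view("viral")
for _ in range(5):
    agg.record_view("niche")
agg.flush()  # 1005 views collapse into 2 DB writes
```

The cost is that displayed counts lag by up to one flush interval, which is usually acceptable for a view counter.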
Interviewer trap
Name ABR chunking explicitly, and state that egress and storage, not compute, are the dominant cost.