System Design Library

Image Hosting (Imgur)

Upload, store, and serve images globally with multiple sizes.

Requirements

Functional

Upload image
Serve by URL
Thumbnails/sizes
Optional albums

Non-functional

Fast global delivery
Durable

Scale

Billions of images

The approach

Store originals in object storage; generate resized renditions async; serve everything via CDN; metadata in a small DB.

Key components

Upload → object store + queue → resize workers → CDN · metadata DB

Numbers that matter

A typical image hosting platform serves a read:write ratio of ~1,000:1 or higher — a viral image uploaded once gets fetched millions of times; the architecture is almost entirely a read optimization problem.
A modern CDN edge (Cloudflare, Fastly) serves cached images at <10 ms P50 latency from edge PoPs; a CDN cache miss (origin fetch + cache fill) adds ~100–500 ms depending on origin region.
Generating 5 image variants (thumbnail, small, medium, large, original) via ImageMagick or libvips takes ~200–2,000 ms depending on image size and complexity — far too slow for a synchronous upload response.
Imgur's peak load (around 2013–2016 viral days) reached ~430 Gbps outbound traffic — entirely CDN-served; the origin infrastructure handled only a tiny fraction (cache misses + uploads).

Senior deep-dive

Object storage + CDN is 100% of the serving architecture: the application servers never touch image bytes after upload — originals land in S3, resized variants are generated async, and every read hits a CDN edge.

Async transcoding is non-negotiable: generating 5 sizes synchronously on upload adds seconds to the user-facing request — push to a queue, generate in background workers, and serve originals until variants are ready (or a placeholder).

Deduplication saves meaningful storage: content-addressed storage (SHA-256 of the file) catches re-uploads of the same image — common for memes, viral images, and bot uploads — at essentially zero read cost.

Upload path: synchronous vs async transcoding

On upload, only two things should happen synchronously: validate the file (magic bytes, size limit, content policy) and store the original to object storage. Everything else — virus scan, NSFW classification, generating 5 resized variants, extracting EXIF, computing a perceptual hash — goes to a queue-backed worker fleet. The client gets a URL to the original immediately. Variants appear as workers complete them; the CDN serves the original until a variant is ready. This keeps P99 upload latency under 2 seconds regardless of image complexity. The failure mode is a transcoding worker dying mid-job — make jobs idempotent (regenerating variants is safe) and use a visibility timeout to re-enqueue.
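A minimal sketch of that split, assuming an in-memory dict in place of object storage and `queue.Queue` in place of a durable job queue. The names `handle_upload`, `resize_job`, the key layout, and the 20 MB limit are illustrative assumptions, not taken from any real service.

```python
import queue
import uuid

MAX_BYTES = 20 * 1024 * 1024  # assumed size limit, chosen for illustration

# Leading "magic bytes" for a few common image formats.
MAGIC = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}

object_store = {}     # stand-in for object storage: key -> bytes
jobs = queue.Queue()  # stand-in for a durable, queue-backed job system


def sniff_content_type(data: bytes):
    """Return the content type implied by the leading bytes, or None."""
    for prefix, ctype in MAGIC.items():
        if data.startswith(prefix):
            return ctype
    return None


def handle_upload(data: bytes) -> str:
    """Synchronous part only: validate, store the original, enqueue the rest."""
    if len(data) > MAX_BYTES:
        raise ValueError("file too large")
    if sniff_content_type(data) is None:
        raise ValueError("unsupported or corrupt image")
    image_id = uuid.uuid4().hex
    object_store[f"originals/{image_id}"] = data
    # Everything slow is deferred to workers.
    for task in ("scan", "classify", "resize", "exif", "phash"):
        jobs.put({"image_id": image_id, "task": task})
    return image_id


def resize_job(image_id: str, sizes=("thumb", "small", "medium", "large")):
    """Idempotent worker step: output keys depend only on the inputs,
    so a retry after a crash overwrites the same keys."""
    original = object_store[f"originals/{image_id}"]
    for size in sizes:
        # A real worker would call an image library here; the bytes are
        # copied unchanged as a placeholder.
        object_store[f"variants/{image_id}/{size}"] = original
```

Because `resize_job` writes to deterministic keys, re-enqueueing it after a visibility timeout cannot produce duplicate or conflicting variants.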

Content addressing: dedup before storing

Before writing a new upload to object storage, compute SHA-256 of the raw file bytes and check if that hash already exists. If so, return the existing image URL immediately — no storage write, no transcoding job. This is content-addressed storage (like git blobs). For image hosting platforms, deduplication rates of 10–40% are common for meme-heavy content where the same image is re-uploaded constantly. The implementation is a `hash → storage_path` table in a fast KV store (Redis or DynamoDB). Caution: two different users uploading the same image share the same object — a delete by one user must not remove the object if another user still references it (reference counting or soft-delete with GC).
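The dedup check and the reference-counting caution can be sketched together. This is an in-memory illustration only: `blobs` and `refcount` are hypothetical stand-ins for object storage and the `hash → storage_path` KV table.

```python
import hashlib

blobs = {}     # content hash -> bytes (stand-in for object storage)
refcount = {}  # content hash -> number of images referencing the blob


def store(data: bytes) -> str:
    """Store bytes under their SHA-256; identical uploads share one blob."""
    digest = hashlib.sha256(data).hexdigest()
    if digest not in blobs:
        blobs[digest] = data  # first time this content is seen
    refcount[digest] = refcount.get(digest, 0) + 1
    return digest


def delete(digest: str) -> bool:
    """Drop one reference. The blob is removed only when no references
    remain. Returns True if the blob itself was deleted."""
    refcount[digest] -= 1
    if refcount[digest] == 0:
        del refcount[digest]
        del blobs[digest]
        return True
    return False
```

In a real deployment the existence check and the write would need to be atomic (for example a conditional put in the KV store), since two concurrent uploads of the same image could otherwise race.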

CDN strategy: push vs pull, and cache keys

Pull CDN (origin-pull): the CDN fetches from origin on the first request per edge PoP and caches locally. Simple to configure, self-populating. Push CDN: you proactively push objects to all edge nodes on upload. Push is only worth it for guaranteed-popular content known at upload time; pull is correct for everything else (you don't know which images go viral). Cache key design matters: `https://i.imgur.com/abc123.jpg` caches as one object, while appending `?v=1` busts the cache by creating a second, separate cache entry for the same bytes. Use immutable paths with content hashes for versioning and avoid query parameters for cache-busting: CDNs differ on whether the query string is part of the cache key, and where it is, every distinct parameter value stores another copy of the object.
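One way to build immutable, content-hashed paths is sketched below. The `/{size}/{hash}.{ext}` layout and the 16-character truncation are assumptions made for illustration, not a scheme taken from Imgur or any CDN.

```python
import hashlib


def immutable_path(data: bytes, ext: str, size: str = "orig") -> str:
    """Derive a URL path from the content itself, so changed bytes always
    get a new path and an existing path never changes meaning."""
    digest = hashlib.sha256(data).hexdigest()[:16]  # truncated for brevity
    return f"/{size}/{digest}.{ext}"
```

Since the path changes whenever the bytes change, no query-string version parameter is needed.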

Format conversion: the WebP/AVIF migration challenge

Serving images in WebP (25–35% smaller than JPEG) or AVIF (roughly 50% smaller) reduces bandwidth significantly at CDN scale. The implementation: transcode to multiple formats during the async worker step, then serve the best format the client advertises in its Accept header (`image/avif, image/webp`). The CDN must either vary the cache key by Accept header (`Vary: Accept`) or serve all clients the same format. Most CDNs support varying, but raw Accept strings differ from browser to browser, so normalize the header to a small set of values (for example avif, webp, jpeg) before it enters the cache key; otherwise near-identical requests fragment the cache. Even when normalized, cache storage per image multiplies by the number of formats. At Imgur's scale (~1 trillion cached objects historically), format multiplication requires careful CDN storage capacity planning.
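Format negotiation can be sketched as a function that collapses a raw Accept header into one of three buckets, which then serves as the format component of the cache key. The function name `pick_format` and the JPEG fallback are assumptions; real CDNs expose this through their own configuration languages.

```python
def pick_format(accept_header: str) -> str:
    """Map a raw Accept header to 'avif', 'webp', or 'jpeg'."""
    offered = set()
    for part in accept_header.split(","):
        media_type, _, params = part.strip().partition(";")
        q = 1.0  # default quality when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        if q > 0:
            offered.add(media_type.strip().lower())
    # Prefer the smallest format the client explicitly accepts.
    if "image/avif" in offered:
        return "avif"
    if "image/webp" in offered:
        return "webp"
    return "jpeg"
```

Bucketing this way keeps the number of cached copies per image equal to the number of formats, regardless of how many distinct Accept strings browsers send.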

Metadata store: small but critical

Each image needs a metadata record: `image_id`, `user_id`, `storage_path`, `content_hash`, `size`, `width`, `height`, `content_type`, `upload_timestamp`, `is_deleted`. This metadata is accessed on every image page load (to construct the CDN URL and display metadata), so it must be fast and highly available. For Imgur's scale (~50 billion images), a sharded MySQL or DynamoDB table works well — reads dominate (every page view), writes are once per upload. The metadata store does NOT store image bytes — it's tiny (100–200 bytes per image) compared to the image corpus.
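The record and the size claim can be made concrete with a small sketch. The field types are assumptions (the text lists only field names), and the footprint helper just multiplies the figures quoted above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ImageMeta:
    image_id: str
    user_id: int           # type assumed for illustration
    storage_path: str
    content_hash: str      # e.g. hex SHA-256 of the original bytes
    size: int              # bytes
    width: int
    height: int
    content_type: str
    upload_timestamp: datetime
    is_deleted: bool = False


def metadata_footprint_tb(images: int, bytes_per_record: int) -> float:
    """Raw metadata size in terabytes (1 TB = 1e12 bytes), before indexes
    and replication."""
    return images * bytes_per_record / 1e12
```

At 50 billion images and 100–200 bytes per record, this works out to roughly 5–10 TB of raw metadata, which is small enough to shard comfortably.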

What breaks at scale

CDN cache miss storms on viral upload: a new image uploaded at 3am that hits Reddit's front page at 8am can start receiving 500K requests/second before the CDN has cached it, and every one of those requests goes to origin simultaneously. Mitigation: request collapsing (the CDN holds all concurrent misses for the same URL and makes only one origin request) and pre-warming (detect viral content in the transcoding pipeline and proactively push to edge nodes).

Storage explosion from abandoned variants: when a format is deprecated or a size tier removed, millions of old variant files accumulate in S3 and keep costing money. Lifecycle rules that delete variants older than N days (while keeping originals forever) are essential.

DMCA/takedown latency: with CDN TTLs of 24 hours, a takedown removes the object from origin, but edges keep serving the cached copy for up to a day. The tradeoff is between surrogate key purges and short TTLs for user content (at CDN bandwidth cost).
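Request collapsing, the first mitigation for cache miss storms, is usually a CDN feature rather than application code, but the idea can be sketched in a few lines. `SingleFlight` and its `waiters` counter are hypothetical names used only for this illustration.

```python
import threading


class SingleFlight:
    """Collapse concurrent fetches for the same key into one origin call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> shared state for the in-progress fetch

    def fetch(self, key, origin_fetch):
        with self._lock:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = {"done": threading.Event(), "value": None,
                         "error": None, "waiters": 0}
                self._inflight[key] = entry
            else:
                entry["waiters"] += 1
        if leader:
            # Only the first caller goes to origin.
            try:
                entry["value"] = origin_fetch(key)
            except Exception as exc:
                entry["error"] = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                entry["done"].set()
        else:
            # Everyone else waits for the leader's result.
            entry["done"].wait()
        if entry["error"] is not None:
            raise entry["error"]
        return entry["value"]
```

If the origin call fails, every waiting caller sees the same error, so a retry policy would sit above this layer.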

In production

Imgur, Cloudinary, and imgix all converge on the same architecture: S3 for originals, async transcoding workers (often FFmpeg/libvips containers), and a CDN in front of everything. Cloudinary's differentiator is on-the-fly transformations via URL parameters (resize, crop, format conversion via the CDN URL itself), backed by a cache: the first request generates the variant, subsequent requests hit cache. The real engineering challenge is CDN cache invalidation for updated/deleted images: a deleted image may continue to be served from CDN edges for hours (up to the configured TTL), which is a real legal/DMCA problem. Production systems use surrogate keys (cache tags) to purge a specific image from all edge nodes simultaneously. Immutable versioned URLs remove the need for invalidation on updates (a new version gets a new URL), but they do not help with deletes: a cached copy stays servable until it is purged or its TTL expires.
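To illustrate how surrogate keys make a purge cheap, here is a toy single-node cache that indexes entries by tag. `TaggedCache`, the tag format `img-<id>`, and the URL layout are all hypothetical; real CDNs expose tagging and purging through their own headers and APIs.

```python
class TaggedCache:
    """Toy cache that indexes entries by surrogate key (cache tag)."""

    def __init__(self):
        self._objects = {}  # url -> cached body
        self._by_tag = {}   # tag -> set of urls carrying that tag

    def put(self, url, body, tags):
        self._objects[url] = body
        for tag in tags:
            self._by_tag.setdefault(tag, set()).add(url)

    def get(self, url):
        return self._objects.get(url)

    def purge_tag(self, tag):
        """Evict every object carrying the tag; return how many were removed."""
        urls = self._by_tag.pop(tag, set())
        removed = 0
        for url in urls:
            if self._objects.pop(url, None) is not None:
                removed += 1
        return removed
```

Tagging every variant of an image with the same key means a single purge call removes the thumbnail, the large rendition, and every format at once, without enumerating URLs.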

Common mistakes

Resizing on every request
Serving from origin
No abuse scanning
