Realtime Analytics Dashboard
Live dashboards over high-volume event streams (clicks, metrics).
Open the interactive version → diagrams, practice & moreRequirements
Functional
- Ingest events
- Windowed aggregations
- Live dashboards
- Drill-down
Non-functional
- Seconds-fresh
- High throughput
Scale
Millions of events/sec
The approach
Events → Kafka → stream processor (Flink) computes windowed aggregates → OLAP/columnar store (Druid/ClickHouse) for fast slice-and-dice queries; dashboards query the OLAP store.
Key components
Producers → Kafka → stream processor → OLAP store → dashboard
Numbers that matter
- Apache Flink can sustain 1–10 million events/second per TaskManager with well-tuned parallelism and back-pressure; ClickHouse ingests at 500K–1M rows/sec per node in bulk insert mode.
- ClickHouse columnar compression achieves 5–10x compression ratios on typical event streams, reducing a 1 TB/day raw log to ~100–200 GB on disk.
- End-to-end latency (event → visible on dashboard) is typically 5–30 seconds for stream-processed pipelines; dropping to <2 seconds requires careful Kafka partition sizing and in-memory materialization.
- Druid's real-time indexing nodes hold the last ~1 hour of data in memory before flushing to deep storage — querying older data incurs a 2–5x latency penalty as segments are loaded from S3.
Senior deep-dive
The query-latency contract forces you to choose your storage layer before anything else — raw Kafka offsets for seconds-fresh but hard to query, or a pre-aggregated OLAP store for interactive dashboards but with seconds-to-minutes of lag.
Flink windowed aggregations are the bridge — they consume the stream and materialize results into ClickHouse/Druid so dashboards don't hit raw logs.
Cardinality is the silent cost driver: high-cardinality dimensions (user_id in group-bys) explode storage and slow queries — pre-aggregate at ingest and cap raw retention.
Pipeline architecture: Kafka → Flink → OLAP
Kafka is the durable buffer that decouples ingest rate from processing rate and enables consumer replay. Flink reads partitions, applies tumbling or sliding windows (e.g., 1-minute clickthrough rates), and writes aggregated rows to ClickHouse. The key design choice is aggregation granularity at write time — finer granularity preserves query flexibility but inflates storage and query fan-out.
Watermarks and late data
Events arrive out of order — network jitter, mobile clients, CDN edge buffering. Event-time processing requires a watermark that advances as the stream progresses, signaling "all events before T have arrived." Flink's allowed lateness parameter holds windows open for a configurable extra period; events after that are either dropped or routed to a side output for reconciliation. Setting this too tight misses legitimate late events; too loose delays dashboard refresh.
Exactly-once semantics and idempotent sinks
Flink checkpoints operator state to durable storage (S3/HDFS) at intervals; on restart it replays from the last checkpoint. The sink must be idempotent (ClickHouse deduplication by insert ID, or upsert by primary key) or support two-phase commit so retried writes don't double-count. This is the most under-tested failure scenario — test by killing Flink mid-window and verifying counts match a batch recount.
OLAP layer: ClickHouse vs Druid vs Pinot
ClickHouse wins for ad-hoc SQL and simple deployments — great compression, vectorized execution, but upserts are expensive (ReplacingMergeTree is eventually consistent). Druid shines for time-series roll-ups with pre-aggregated datasources but adds operational complexity (6+ node types). Pinot handles upserts + real-time ingestion better but requires more schema discipline upfront. Choose based on whether you need exact counts or approximate are acceptable — HLL in Druid/Pinot can answer "unique users" in milliseconds at the cost of ~2% error.
Dashboards and query patterns
Dashboards are high-fan-out readers: a single page load might fire 10–20 queries simultaneously. Cache aggressively at the dashboard layer (Grafana query cache, materialized views) and ensure the OLAP store has columnar pruning — queries that filter by time range should skip entire partitions. Pre-materializing the most common groupings at ingest time (country + device + hour) dramatically cuts per-query compute but requires knowing the query patterns in advance.
What breaks at scale
Cardinality explosions are the silent killer: adding user_id as a dimension to a Druid datasource can 100x storage and 10x query latency overnight. The fix is strict schema governance — dimensions must be low-cardinality; metrics carry the numbers; high-cardinality analysis belongs in a separate raw log store. The second failure mode is Kafka consumer lag drift: if Flink falls behind, dashboards silently show stale data with no obvious error — instrument `consumer_lag` as a first-class alert.
In production
LinkedIn's Pinot and Uber's AresDB were both built because Druid wasn't fast enough at their specific query patterns (Pinot: real-time OLAP with upserts; AresDB: GPU-accelerated time-series joins). Cloudflare runs ClickHouse for its analytics dashboard and publishes that it handles ~4 million inserts/second at peak across a cluster. The real engineering challenge is late-arriving events: a mobile client with intermittent connectivity may send events 5 minutes late — Flink's watermarking + allowed lateness handles this, but you must decide how long to hold a window open before emitting, trading freshness against completeness. Dashboards that look live but are secretly stale by 10+ minutes due to a pipeline backlog are the most common production failure mode.
Common mistakes
- Aggregating in a row-store OLTP DB
- No pre-aggregation (slow dashboards)
- Ignoring late/out-of-order events