System Design Library

Issue Tracker (Jira)

Track issues with custom workflows, fields, search and permissions.


Requirements

Functional

CRUD issues
Custom fields/workflows
Search/filter (JQL)
Permissions
Notifications

Non-functional

Flexible schema
Fast search

Scale

Large orgs, millions of issues

The approach

Flexible entity model (issues + custom fields, often EAV or JSON columns); a search index (Elasticsearch) for fast filtering; a workflow engine drives state transitions; per-project permissions.

Key components

Issue store (flexible schema) · search index · workflow engine · notifications

Numbers that matter

A large Jira instance can have 10M+ issues across thousands of projects with 100+ custom field types each.
Elasticsearch index size for a large Jira Cloud tenant can reach hundreds of GB when all issue fields, comments, and history are indexed.
JQL full-text search on Jira Cloud must return results in <500ms for interactive use — slower than that and engineers start exporting to spreadsheets.
A single issue can have thousands of comments and complete field-change audit history spanning years — changelog storage dominates disk for large tenants.

Senior deep-dive

The flexible entity model is what makes Jira hard — every project can have different issue types, custom fields, and workflows; you can't schema-design your way out of this without EAV or JSON columns, both of which make indexed search expensive.

Search is the dominant read workload: most teams interact with Jira primarily through JQL queries, and Elasticsearch is what actually serves those — the relational DB is the authoritative store, the search index is the query plane.

Workflow engine correctness matters more than latency: a state transition that violates a workflow rule (moving an issue from Open to Done without a required approver) is a business logic error that erodes trust in the tool faster than any slowness.

Entity model: issues, custom fields, and the EAV trap

Jira issues have a fixed set of system fields (summary, assignee, priority, status) plus an unbounded set of custom fields per project. The naive approach, one column per field, fails as soon as projects define their own fields: every new field is a schema change on a shared table, and most columns are NULL for most issues. EAV (entity-attribute-value: one row per field value) gives flexibility but turns simple reads into multi-way joins. JSONB columns (Postgres) are the common modern alternative: store all custom field values as a JSON document on the issue row. A GIN index on that column serves containment and key-existence lookups, so simple equality filters on custom fields are fast; range queries need a separate expression index (typically B-tree) on each key you want to range over, so they still require careful index design.
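The difference between the two read paths can be sketched in a few lines of Python. This is only an in-memory illustration, not how Jira stores data: the rows, field names (story_points, team, customer) and helper functions are all hypothetical, and the linear scans stand in for what a database would do with or without an index.

```python
# Hypothetical EAV rows: (issue_id, attribute, value), one row per custom field value.
EAV_ROWS = [
    (1, "story_points", 5),
    (1, "team", "payments"),
    (2, "story_points", 8),
    (2, "team", "search"),
    (2, "customer", "acme"),
]


def load_issue_from_eav(issue_id, rows):
    """Rebuild one issue's custom fields by collecting every row that belongs to it."""
    return {attr: value for row_id, attr, value in rows if row_id == issue_id}


# The same data in document form: one row per issue, custom fields in one JSON-like blob.
ISSUES = {
    1: {"summary": "Fix checkout", "status": "Open",
        "custom": {"story_points": 5, "team": "payments"}},
    2: {"summary": "Slow search", "status": "Open",
        "custom": {"story_points": 8, "team": "search", "customer": "acme"}},
}


def find_by_custom_field(issues, key, value):
    """Equality filter on one custom field (the query shape a containment index serves)."""
    return sorted(issue_id for issue_id, issue in issues.items()
                  if issue["custom"].get(key) == value)


def find_by_custom_range(issues, key, low, high):
    """Range filter on one custom field; issues without the key are skipped."""
    matches = []
    for issue_id, issue in issues.items():
        value = issue["custom"].get(key)
        if value is not None and low <= value <= high:
            matches.append(issue_id)
    return sorted(matches)


if __name__ == "__main__":
    print(load_issue_from_eav(2, EAV_ROWS))
    print(find_by_custom_field(ISSUES, "team", "search"))
    print(find_by_custom_range(ISSUES, "story_points", 6, 10))
```

In the document form the whole issue comes back in one read, while the EAV form has to gather and pivot rows first; in SQL that pivot is where the joins come from.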

Search: Elasticsearch as the query plane

The relational DB is authoritative for writes; Elasticsearch serves all JQL queries. On every issue write, the updated document is pushed to Elasticsearch, either by a background indexer or by a synchronous dual-write. JQL (Jira Query Language) is a DSL that translates to Elasticsearch query DSL: field filters become term/range queries, text search becomes full-text match. With a background indexer the search index can lag the DB by seconds, so stale search results right after an edit are expected and tolerated; a synchronous dual-write narrows that window but can leave the two stores out of sync if one of the writes fails. Re-indexing (full refresh from DB → ES) is a scheduled background job that can take hours on large tenants.
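As a rough sketch of that translation step, the function below turns a list of already-parsed, AND-ed clauses into an Elasticsearch bool query. It is not Jira's actual translator: the (field, operator, value) clause format and the function names are assumptions made for illustration, and a real JQL parser also handles OR, NOT, functions and ordering. The term, range, match and bool constructs themselves are standard Elasticsearch query DSL.

```python
# Maps comparison operators to the parameter names of an Elasticsearch range query.
RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}


def clause_to_es(field, op, value):
    """Translate one parsed clause into a single Elasticsearch query node."""
    if op == "=":
        return {"term": {field: value}}      # exact match on a keyword field
    if op == "~":
        return {"match": {field: value}}     # analyzed full-text match
    if op in RANGE_OPS:
        return {"range": {field: {RANGE_OPS[op]: value}}}
    raise ValueError(f"unsupported operator: {op!r}")


def to_es_query(clauses):
    """Combine AND-ed clauses into one bool query.

    Exact and range clauses go in the non-scoring `filter` context; text
    clauses go in `must` so they contribute to relevance ranking.
    """
    must, filters = [], []
    for field, op, value in clauses:
        node = clause_to_es(field, op, value)
        (must if op == "~" else filters).append(node)
    return {"query": {"bool": {"must": must, "filter": filters}}}


if __name__ == "__main__":
    # Roughly: status = Open AND created >= 2024-01-01 AND summary ~ timeout
    parsed = [
        ("status", "=", "Open"),
        ("created", ">=", "2024-01-01"),
        ("summary", "~", "timeout"),
    ]
    print(to_es_query(parsed))
```

Putting field filters in the filter context is a deliberate choice here: those clauses do not affect scoring, and Elasticsearch can cache them.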

Workflow engine: the state machine with validators

Each project's workflow is a directed graph of statuses and transitions with conditions (who can trigger this), validators (what must be true), and post-functions (what happens after). The engine enforces this graph on every status change: check conditions → run validators → update status → run post-functions. The checks and the status update must commit as one transaction, so a failed condition or validator leaves the issue untouched. Post-functions are typically best-effort: if one fails (e.g., can't notify Slack), the status update still stands, but a post-function must never run for a transition that did not commit. Optimistic locking on issue.version ensures that when two concurrent transitions start from the same state, only one succeeds and the other has to re-read and retry.
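A minimal sketch of that pipeline, assuming an in-memory dict in place of the database, might look like the following. The class and function names are hypothetical; in a real system the version check and status update would be a single conditional UPDATE inside a transaction rather than two Python statements.

```python
from dataclasses import dataclass, field


class TransitionError(Exception):
    """The transition is not allowed for this issue or user."""


class StaleVersionError(Exception):
    """The caller acted on an outdated version of the issue."""


@dataclass
class Issue:
    key: str
    status: str
    version: int = 0
    fields: dict = field(default_factory=dict)


@dataclass
class Transition:
    source: str
    target: str
    conditions: list = field(default_factory=list)      # callables (issue, user) -> bool
    validators: list = field(default_factory=list)      # callables (issue) -> bool
    post_functions: list = field(default_factory=list)  # callables (issue) -> None


def apply_transition(store, key, expected_version, transition, user):
    """Run conditions, validators, the status update, then post-functions.

    Returns the list of exceptions raised by post-functions. Those failures
    are reported but never undo the committed status change.
    """
    issue = store[key]
    # Optimistic lock: refuse to act on a version the caller did not read.
    if issue.version != expected_version:
        raise StaleVersionError(f"{key} is at version {issue.version}")
    if issue.status != transition.source:
        raise TransitionError(f"{key} is not in status {transition.source}")
    if not all(cond(issue, user) for cond in transition.conditions):
        raise TransitionError(f"{user} may not perform this transition")
    if not all(check(issue) for check in transition.validators):
        raise TransitionError("a validator rejected the transition")

    # "Commit": nothing above has mutated the issue, so a failure leaves it untouched.
    issue.status = transition.target
    issue.version += 1

    failures = []
    for post_function in transition.post_functions:
        try:
            post_function(issue)
        except Exception as exc:  # best-effort: collect and carry on
            failures.append(exc)
    return failures
```

With a transition from Open to Done whose validator requires an approver field, a call without the approver raises TransitionError and leaves the issue Open, and a second caller still holding the old version number gets StaleVersionError instead of silently overwriting the first change.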

Permissions: the query-time filter nightmare

Jira has a complex permission model: project-level roles, issue-level security schemes, component-level access. Every search result must be filtered by what the calling user can see. Enforcing this in Elasticsearch means one of two things. Denormalizing permissions into each document makes the filter part of the query, but every issue document must be updated when a permission scheme changes, which is expensive. Post-filtering (fetch more than needed, filter in the app, return the top k) keeps documents stable but makes pagination inconsistent. Jira has historically used a mix: a broad index query plus a Java-side permission check on the results, which means page sizes are unpredictable.
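The pagination problem with post-filtering is easy to reproduce. In the sketch below, ranked_ids stands in for the ordered hits the search index would return, and can_view is a hypothetical permission check; neither reflects Jira's real interfaces. The first function shows why fixed windows give short pages, the second shows one way to compensate by over-fetching and handing back a cursor into the unfiltered ranking.

```python
def naive_page(ranked_ids, can_view, page_size, page_number):
    """Fetch one fixed window of hits, then filter. Pages come back short."""
    start = page_number * page_size
    window = ranked_ids[start:start + page_size]
    return [issue_id for issue_id in window if can_view(issue_id)]


def filtered_page(ranked_ids, can_view, page_size, cursor=0, overfetch=3):
    """Keep fetching batches until the page is full or the hits run out.

    Returns (visible_ids, next_cursor). The cursor is a position in the
    unfiltered ranking, so the next call resumes exactly where this one
    stopped; it is None when there are no more hits.
    """
    visible = []
    position = cursor
    while len(visible) < page_size and position < len(ranked_ids):
        # Each slice stands in for one round trip to the search index.
        batch = ranked_ids[position:position + page_size * overfetch]
        for issue_id in batch:
            position += 1
            if can_view(issue_id):
                visible.append(issue_id)
                if len(visible) == page_size:
                    break
    next_cursor = position if position < len(ranked_ids) else None
    return visible, next_cursor


if __name__ == "__main__":
    hits = list(range(1, 11))

    def only_even(issue_id):
        # Pretend the caller can only see even-numbered issues.
        return issue_id % 2 == 0

    print(naive_page(hits, only_even, page_size=4, page_number=0))
    print(filtered_page(hits, only_even, page_size=2))
```

The cost of the second approach is an unpredictable number of index round trips per page when the caller can see only a small fraction of the matching issues.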

Audit log and changelog: append-only history

Every field change is written to a changelog table (issue_id, field, old_value, new_value, changed_by, changed_at) — this is append-only and grows without bound. On a large active issue this table can have thousands of rows. Pagination is mandatory when fetching history; storing old/new values as text works for strings but breaks for user references if display names change (store user_id, resolve on display). The changelog is not indexed for search — it's accessed by issue ID only.
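Here is a small sketch of such a changelog, assuming an in-memory list in place of the table. The seq column, the USER_FIELDS set and the render helper are hypothetical additions for illustration: seq gives a stable cursor for pagination, and user-valued fields store ids that are resolved to names only at display time.

```python
from dataclasses import dataclass

# Hypothetical set of fields whose old/new values are user ids, not free text.
USER_FIELDS = {"assignee", "reporter"}


@dataclass(frozen=True)
class ChangeEntry:
    seq: int          # monotonically increasing; used as the pagination cursor
    issue_id: int
    field: str
    old_value: str
    new_value: str
    changed_by: int   # user id, never a display name
    changed_at: str


class Changelog:
    """Append-only history; entries are never updated or deleted."""

    def __init__(self):
        self._entries = []

    def append(self, issue_id, field, old_value, new_value, changed_by, changed_at):
        entry = ChangeEntry(len(self._entries) + 1, issue_id, field,
                            old_value, new_value, changed_by, changed_at)
        self._entries.append(entry)
        return entry

    def page(self, issue_id, after_seq=0, limit=50):
        """Return up to `limit` entries for one issue with seq greater than `after_seq`."""
        rows = [e for e in self._entries
                if e.issue_id == issue_id and e.seq > after_seq]
        return rows[:limit]


def render(entry, users):
    """Format an entry for display, resolving user ids against current names."""
    def show(value):
        if entry.field in USER_FIELDS:
            return users.get(int(value), "unknown user") if value else "nobody"
        return value

    actor = users.get(entry.changed_by, "unknown user")
    return f"{actor} changed {entry.field}: {show(entry.old_value)} -> {show(entry.new_value)}"
```

A client pages through history by passing the seq of the last entry it received as after_seq. Because names are looked up in the users mapping at render time, renaming a user changes how old entries display without rewriting any stored row.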

What breaks at scale

Elasticsearch index lag is the most common support ticket: an issue was just updated but doesn't appear in JQL results for 3–10 seconds. Schema migration on JSONB custom fields is painful — adding a new field type to millions of existing issues requires a background migration that takes hours and blocks indexing. The nastiest failure is permission denormalization drift: when a project's permission scheme changes, all issue documents in Elasticsearch must be re-indexed with updated permission metadata — at 10M issues that's a multi-hour reindex during which permission changes are inconsistent in search results.

In production

Jira Cloud (Atlassian) runs on a multi-tenant architecture where each workspace (site) maps to a logical partition in sharded Postgres for structured data and a per-tenant Elasticsearch index for search. Custom fields are stored as JSON columns (JSONB in Postgres) rather than EAV tables after Atlassian's internal migration circa 2018–2020; JSONB gives indexability on specific keys without the join explosion of EAV. The real challenge is workflow engine consistency at scale: a workflow transition must check conditions and run validators atomically with the state update, then fire post-functions (e.g., auto-assign, send notification) only once that update has committed. At high concurrency (CI pipelines mass-transitioning issues), optimistic locking on the issue version number prevents lost updates without serializing all writes.

Common mistakes

Rigid fixed schema
SQL LIKE instead of a search index
Hardcoded workflows

Part of System Design Library on SystemLore.