Google Calendar
Events, recurrence, invitations, reminders and free/busy across time zones.
Requirements
Functional
- CRUD events
- Recurring events
- Invites/RSVP
- Reminders
- Free/busy & scheduling
Non-functional
- Correct across time zones/DST
- Reliable reminders
Scale
Billions of events
The approach
Store recurrence as a rule (RRULE) + exceptions, not expanded rows; expand on read within a window. Reminders go through a scheduled-notification system. Times stored in UTC + original time zone.
Key components
App → event DB · reminder scheduler (timer wheel/queue) · notification service
Numbers that matter
- A daily recurring event expanded for 1 year creates 365 rows. A weekly event over 10 years is only ~520 rows, but an 'every weekday' rule over 20 years is ~5,200 rows. Expansion on write is fine for short series and prohibitive for perpetual ones.
- Google Calendar stores recurrences as RRULE strings (RFC 5545, typically <200 bytes) and expands a maximum window of ~2 years ahead for reminder scheduling.
- Free/busy queries across an organization of 1,000 users to find a meeting slot must merge up to 1,000 expanded calendar streams — responses must arrive in <500ms, which requires precomputed free/busy materialized views.
- Reminder fanout latency SLA is typically ±30 seconds of the target time — achieving this at scale requires a distributed timer wheel with ~1-minute bucket granularity across potentially billions of upcoming reminder events.
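The row counts in the first bullet above are plain multiplication. A minimal sketch, assuming 52 weeks and 260 weekdays per year (the same approximations the bullet uses); the function name and the rate table are illustrative, not part of any real calendar API:

```python
# Approximate occurrences per year for a few rule shapes (assumed values).
PER_YEAR = {"daily": 365, "weekly": 52, "weekdays": 5 * 52}


def rows_if_expanded(freq: str, years: int) -> int:
    """Rows written if every occurrence were materialized up front."""
    return PER_YEAR[freq] * years


# The same 'every weekday' series stored as a rule is one short string.
WEEKDAY_RULE = "FREQ=WEEKLY;BYDAY=MO,TU,WE,TH,FR"

print(rows_if_expanded("daily", 1))      # 365
print(rows_if_expanded("weekly", 10))    # 520
print(rows_if_expanded("weekdays", 20))  # 5200
print(len(WEEKDAY_RULE.encode("utf-8")) < 200)  # True
```

The rule string stays the same size no matter how long the series runs, which is the point of storing the rule rather than the rows.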
Senior deep-dive
Recurrence storage is the crux — storing expanded rows for every occurrence is a write bomb; store the RRULE and expand on read within a query window.
All times must be stored in UTC plus the original IANA time zone name (not just offset); without the zone name, DST transitions silently shift events by one hour. Reminders and free/busy are the two hardest read paths — reminders require a durable scheduler, free/busy requires merging expanded recurrences across possibly thousands of calendars.
RRULE + exceptions: the right data model
Store recurrence as a master event row with an RRULE field plus a separate exceptions table keyed by (master_id, original_occurrence_time). An exception row holds the modified fields (or a 'deleted' flag). On read, expand the RRULE within the query window then overlay exceptions — deleted occurrences are removed, modified ones are replaced. Never store both a master and fully-expanded children; you'll have consistency nightmares on edits.
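The expand-then-overlay read path described above can be sketched as follows. This is a simplified illustration, not a full RFC 5545 implementation: a fixed `interval` stands in for a parsed RRULE, datetimes are naive and treated as UTC, and the `Master` and `Override` types are hypothetical names for the master row and the exception row.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Override:
    # Hypothetical exception row, keyed by the original occurrence time.
    deleted: bool = False
    new_start: Optional[datetime] = None
    new_title: Optional[str] = None


@dataclass
class Master:
    # Hypothetical master row; `interval` stands in for a parsed RRULE
    # (7 days for FREQ=WEEKLY, 1 day for FREQ=DAILY).
    title: str
    first_start: datetime
    interval: timedelta
    exceptions: dict = field(default_factory=dict)  # original start -> Override


def expand(master, window_start, window_end):
    """Return sorted (start, title) pairs that fall in [window_start, window_end)."""
    out = []
    n = 0
    if window_start > master.first_start:
        # Ceiling division: jump to the first occurrence inside the window
        # instead of walking the series from its beginning.
        n = -(-(window_start - master.first_start) // master.interval)
    start = master.first_start + n * master.interval
    while start < window_end:
        if start not in master.exceptions:  # untouched occurrence
            out.append((start, master.title))
        start += master.interval
    # Overlay exceptions: deleted ones vanish, modified ones replace the
    # original. Filtering on the effective start also catches occurrences
    # that were moved into the window from outside it.
    for original, ov in master.exceptions.items():
        if ov.deleted:
            continue
        effective = ov.new_start or original
        if window_start <= effective < window_end:
            out.append((effective, ov.new_title or master.title))
    return sorted(out)
```

For a weekly series starting Monday 2024-01-01 09:00 with the 8 January occurrence deleted and the 15 January occurrence moved to 16 January 10:00, expanding the window 1 to 29 January yields three occurrences: 1 January, 16 January and 22 January.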
Time zone is not an offset
Storing only a UTC offset (e.g. +05:30) is wrong: offsets change with DST and with government decisions (countries change their time zone rules). Always store the IANA time zone name (e.g. 'America/New_York') alongside the UTC timestamp, and use the IANA zone to localize on read. This is especially critical for recurring events: a 9am Monday recurring meeting must stay at 9am local time through DST transitions, so each occurrence is resolved from local wall time in that zone and its UTC time shifts by one hour twice a year.
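The DST behaviour above can be checked with Python's standard `zoneinfo` module. A minimal sketch, assuming the system has an IANA tz database available; `occurrence_utc` is a hypothetical helper name:

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def occurrence_utc(local_date, wall_time, zone_name):
    """Resolve one occurrence: local wall time in an IANA zone -> UTC instant."""
    local = datetime.combine(local_date, wall_time, tzinfo=ZoneInfo(zone_name))
    return local.astimezone(timezone.utc)


# Two consecutive Mondays either side of the 2024 US spring-forward (10 March).
before = occurrence_utc(date(2024, 3, 4), time(9, 0), "America/New_York")
after = occurrence_utc(date(2024, 3, 11), time(9, 0), "America/New_York")
print(before.hour, after.hour)  # 14 13: same 9am wall time, different UTC hour

# The bug: freezing the offset seen at creation time (-05:00).
frozen = timezone(timedelta(hours=-5))
wrong = datetime.combine(date(2024, 3, 11), time(9, 0), tzinfo=frozen)
# 14:00 UTC is 10am in New York on 11 March, one hour late.
print(wrong.astimezone(ZoneInfo("America/New_York")).hour)  # 10
```

The zone name lets the library apply whichever rules are in force on each occurrence date; the frozen offset cannot.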
Reminder scheduling at scale
Reminders require a durable scheduled-notification system (not just a cron job). The standard approach: when an event is created or updated, write reminder triggers into a sorted set keyed by fire_time (Redis ZADD or a purpose-built timer store). A scanner process polls the head of the sorted set, dequeues due reminders, and fires them via the notification service. Perpetual recurring events require scheduling only the next N occurrences and re-scheduling on completion.
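The sorted-set-plus-scanner pattern above can be sketched as follows. This is an in-memory illustration only: a `heapq` stands in for the durable store (a Redis sorted set or purpose-built timer store in the text), and `ReminderStore` and `schedule_next` are hypothetical names.

```python
import heapq
from datetime import datetime, timedelta


class ReminderStore:
    """In-memory stand-in for a durable sorted set keyed by fire_time."""

    def __init__(self):
        self._heap = []

    def schedule(self, fire_time, reminder_id):
        heapq.heappush(self._heap, (fire_time, reminder_id))

    def pop_due(self, now):
        """Dequeue every reminder whose fire_time is at or before `now`."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap))
        return due

    def __len__(self):
        return len(self._heap)


def schedule_next(store, event_id, next_start, interval, lead, n):
    """Queue reminders for the next n occurrences only.

    Returns the start of the first occurrence NOT yet scheduled, so the
    caller can persist it and re-schedule from there later.
    """
    for i in range(n):
        start = next_start + i * interval
        store.schedule(start - lead, f"{event_id}@{start.isoformat()}")
    return next_start + n * interval


store = ReminderStore()
cursor = schedule_next(store, "evt1", datetime(2024, 1, 1, 9),
                       timedelta(days=1), timedelta(minutes=10), n=3)
fired = store.pop_due(now=datetime(2024, 1, 2, 8, 50))
print(len(fired), len(store), cursor)  # 2 1 2024-01-04 09:00:00
```

Only the head of the structure is inspected on each scan, so the cost of a scan depends on how many reminders are due, not on how many are queued. A perpetual series never holds more than `n` entries at a time.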
Free/busy: the O(N×k) problem
Finding a free slot for a meeting of N attendees requires expanding each attendee's calendar within the query window: O(N × k), where k is the number of occurrences per attendee in the window. At 50 attendees × 50 events each, that's 2,500 expansions per query. Precompute a free/busy materialized view per user (a bitfield or interval list over the next 60 days), updated on every event write. A free/busy query then becomes N bitwise merges (AND over free bits, or equivalently OR over busy bits) instead of N RRULE expansions, which is orders of magnitude faster.
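A minimal sketch of the bitfield merge, assuming each user's view is a Python integer in which bit i is set when slot i is busy. The slot size, horizon and function names are illustrative. Busy bits are combined with OR here, which is the same operation as ANDing free bits.

```python
def busy_mask(intervals):
    """Build a busy bitfield from half-open (start_slot, end_slot) intervals."""
    mask = 0
    for start, end in intervals:
        mask |= ((1 << (end - start)) - 1) << start
    return mask


def first_free(masks, slots_needed, horizon):
    """Return the first slot index where everyone is free, or None."""
    combined = 0
    for m in masks:          # one bitwise OR per attendee, no RRULE expansion
        combined |= m
    run = (1 << slots_needed) - 1
    for s in range(horizon - slots_needed + 1):
        if (combined & (run << s)) == 0:
            return s
    return None


alice = busy_mask([(0, 2), (4, 6)])   # busy in slots 0-1 and 4-5
bob = busy_mask([(1, 3)])             # busy in slots 1-2
print(first_free([alice, bob], 1, horizon=8))  # 3
print(first_free([alice, bob], 2, horizon=8))  # 6
print(first_free([alice, bob], 3, horizon=8))  # None
```

With 30-minute slots, 60 days is 2,880 bits per user, so merging even a thousand users is a thousand small integer operations.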
Invitation and RSVP state machine
An invitation is a separate entity from the event itself — the organizer's event is the source of truth; each attendee has an attendee record with status (needsAction / accepted / declined / tentative). On 'edit this and following', the organizer's series splits; attendee records for future occurrences must be re-created or migrated to the new series ID. Forgetting this is the #1 bug in calendar implementations — acceptances get orphaned.
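The attendee record and its migration on a series split can be sketched as below. The `Attendee` type and both functions are hypothetical. Whether a split should preserve earlier responses or reset them to needsAction is a product decision, so it is a parameter here rather than a fixed rule.

```python
from dataclasses import dataclass, replace

STATUSES = {"needsAction", "accepted", "declined", "tentative"}


@dataclass(frozen=True)
class Attendee:
    series_id: str
    email: str
    status: str = "needsAction"


def rsvp(attendee, status):
    """Return a copy of the attendee record with a validated new status."""
    if status not in STATUSES:
        raise ValueError(f"unknown RSVP status: {status}")
    return replace(attendee, status=status)


def migrate_on_split(attendees, old_series_id, new_series_id, keep_status):
    """Create attendee records for the new series made by 'edit this and following'.

    Records for the old series are left untouched; the caller stores the
    returned records alongside them so no acceptance is orphaned.
    """
    return [
        replace(a, series_id=new_series_id,
                status=a.status if keep_status else "needsAction")
        for a in attendees
        if a.series_id == old_series_id
    ]


team = [Attendee("s1", "a@example.com"), Attendee("s1", "b@example.com")]
team[0] = rsvp(team[0], "accepted")
new_records = migrate_on_split(team, "s1", "s2", keep_status=True)
print([(a.series_id, a.status) for a in new_records])
# [('s2', 'accepted'), ('s2', 'needsAction')]
```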
What breaks at scale
The 'edit all future' operation is a distributed write bomb — it must split the master series, create a new master, migrate all future exception rows, re-trigger reminders for all affected attendees, and send update notifications. At a large organization this can touch thousands of attendee records in a single user action. The fix: make series splits async via a job queue with an 'edit in progress' lock on the series, and stream update notifications rather than sending them all synchronously in the user's request.
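The core of the split, as it might run inside the async job, can be sketched as follows. This is a single-process illustration under stated assumptions: the `Series` type is hypothetical, `locked` is a plain flag standing in for a real distributed lock, and reminder re-triggering and notifications are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Series:
    series_id: str
    first_start: datetime
    until: Optional[datetime] = None   # exclusive end; None means perpetual
    exceptions: dict = field(default_factory=dict)  # original start -> override
    locked: bool = False               # stand-in for an 'edit in progress' lock


def split_series(old, split_at, new_id):
    """Split `old` at `split_at`; return the new series covering split_at onward."""
    if old.locked:
        raise RuntimeError("series edit already in progress")
    old.locked = True
    try:
        new = Series(new_id, first_start=split_at, until=old.until)
        # Migrate exception rows that belong to the future half.
        for original in [t for t in old.exceptions if t >= split_at]:
            new.exceptions[original] = old.exceptions.pop(original)
        old.until = split_at           # the old master now ends at the split
        return new
    finally:
        old.locked = False


old = Series("s1", datetime(2024, 1, 1, 9))
old.exceptions[datetime(2024, 1, 8, 9)] = "deleted"
old.exceptions[datetime(2024, 1, 22, 9)] = "moved"
new = split_series(old, datetime(2024, 1, 15, 9), "s2")
print(sorted(old.exceptions), sorted(new.exceptions), old.until)
```

In a real system these steps would need to be applied atomically, or be idempotent so the job can be retried safely.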
In production
Google Calendar stores recurrence rules using RFC 5545 RRULE and keeps exceptions (moved/deleted occurrences) as separate override rows keyed by (series_id, original_start_time). The hardest engineering problem is the 'expand on read' performance at free/busy query time: Google's internal implementation materializes a rolling free/busy index that is updated incrementally as events are created or modified, rather than expanding RRULE on every query. Microsoft Exchange/Office 365 uses a similar approach with a pre-expanded occurrence cache that is invalidated on series edit. The real challenge nobody talks about: editing 'this and all following' occurrences requires splitting a recurrence series into two, which is a multi-row transaction with tricky ID semantics.
Common mistakes
- Materializing infinite recurring instances
- Storing only UTC (DST/zone bugs)
- Polling every event or calendar for due reminders instead of scheduling triggers in a time-ordered store