Design a Job Scheduler

via LeetCode

Problem

Design a distributed job scheduler: run one-off jobs at a given time and recurring (cron-like) jobs, reliably, across a worker fleet.

Functional requirements

submit(job, runAt | cronExpr), cancel, job status/history; per-job retry policy with backoff; priorities; misfire handling (what happens to runs missed during downtime).
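
Misfire handling is easier to discuss with a concrete rule in hand. The following is a minimal sketch, not a reference implementation: it assumes a fixed-interval job rather than a full cron expression, and the names `MisfirePolicy`, `Decision`, `resolve_misfire`, and the `grace` threshold are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class MisfirePolicy(Enum):
    FIRE_ONCE_NOW = "fire_once_now"  # collapse all missed runs into one immediate run
    SKIP = "skip"                    # drop missed runs and wait for the next slot


@dataclass
class Decision:
    run_now: bool     # should the scheduler fire immediately?
    next_run_at: int  # next slot after this decision (epoch seconds)


def resolve_misfire(next_run_at: int, interval: int, now: int,
                    policy: MisfirePolicy, grace: int = 60) -> Decision:
    """Decide what to do with a fixed-interval job that may have missed runs."""
    if next_run_at > now:
        # Not due yet, so nothing was missed.
        return Decision(run_now=False, next_run_at=next_run_at)
    late_by = now - next_run_at
    # Most recent slot on the original grid that is <= now.
    latest_due = next_run_at + (late_by // interval) * interval
    following = latest_due + interval
    # A slot that is at most `grace` seconds old still counts as on time.
    on_time = (now - latest_due) <= grace
    if on_time or policy is MisfirePolicy.FIRE_ONCE_NOW:
        return Decision(run_now=True, next_run_at=following)
    return Decision(run_now=False, next_run_at=following)
```

With a 300-second interval, a slot due at t=1000, and the scheduler coming back at t=2000, both policies move next_run_at to 2200; they differ only in whether a single catch-up run fires immediately.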

Non-functional requirements

Scale to discuss: millions of scheduled jobs, thousands of executions/sec at peak; a job must run even if the node that scheduled it dies; no duplicate concurrent runs of the same job (or make duplicates safe).
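
One way to "make duplicates safe" is to key each run by job id plus scheduled time and record the result the first time it completes. The sketch below is illustrative only: `IdempotentRunner` is a hypothetical name, and the in-memory dict stands in for a durable store (for example a table with a unique constraint on the key), which a real system would need so that the check and the record are atomic.

```python
from typing import Callable, Dict


class IdempotentRunner:
    """Runs a handler at most once per idempotency key (in-memory stand-in)."""

    def __init__(self) -> None:
        # Maps idempotency key -> result of the first successful run.
        self._results: Dict[str, object] = {}

    def run(self, job_id: str, scheduled_for: int,
            handler: Callable[[], object]) -> object:
        # Assumed key shape: one key per (job, scheduled slot).
        key = f"{job_id}:{scheduled_for}"
        if key in self._results:
            # Duplicate delivery: return the recorded result, skip the side effect.
            return self._results[key]
        result = handler()
        self._results[key] = result
        return result
```

A second delivery of the same (job, slot) pair returns the stored result without calling the handler again; a different slot for the same job runs normally.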

Key components

Job store: jobs(id, schedule, payload, state, next_run_at, owner, attempt) — an indexed next_run_at is the core query surface.
Trigger layer: partitioned pollers scanning WHERE next_run_at <= now AND state='scheduled' with row-claiming (SELECT … FOR UPDATE SKIP LOCKED / CAS on state) — the claim is what prevents double-firing; timing wheel/delay queue in memory for fine granularity.
Dispatch: claimed jobs → execution queue → worker pool; workers heartbeat + lease per running job (lease expiry ⇒ requeue — handles dead workers); recurring jobs compute next_run_at on completion (or fixed-rate at claim).
Coordinator/partitioning: shard jobs by hash(job_id) across scheduler instances; leader election or partition assignment (ZK/etcd-style) so each shard has exactly one active poller.
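
The claim and lease steps above can be sketched with a compare-and-set on state. This is a rough illustration using Python's built-in sqlite3 so it runs anywhere; SQLite has no SKIP LOCKED, so only the CAS variant is shown. The `lease_until` column, the function names, the candidate batch size of 10, and the 30-second lease are assumptions, not part of the original schema.

```python
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE jobs (
    id          TEXT PRIMARY KEY,
    state       TEXT NOT NULL,            -- 'scheduled' or 'running'
    next_run_at INTEGER NOT NULL,         -- epoch seconds
    owner       TEXT,                     -- worker currently holding the lease
    lease_until INTEGER,                  -- assumed column: lease expiry time
    attempt     INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX idx_jobs_due ON jobs(state, next_run_at);
"""


def claim_one(db: sqlite3.Connection, worker: str, now: int,
              lease_s: int = 30) -> Optional[str]:
    """Claim one due job via compare-and-set on state; return its id or None."""
    candidates = db.execute(
        "SELECT id FROM jobs WHERE state = 'scheduled' AND next_run_at <= ? "
        "ORDER BY next_run_at LIMIT 10", (now,)).fetchall()
    for (job_id,) in candidates:
        # The WHERE clause re-checks state, so only one claimer can win the row.
        cur = db.execute(
            "UPDATE jobs SET state = 'running', owner = ?, lease_until = ?, "
            "attempt = attempt + 1 WHERE id = ? AND state = 'scheduled'",
            (worker, now + lease_s, job_id))
        db.commit()
        if cur.rowcount == 1:
            return job_id
    return None


def requeue_expired(db: sqlite3.Connection, now: int) -> int:
    """Return jobs whose lease has lapsed to 'scheduled'; report how many."""
    cur = db.execute(
        "UPDATE jobs SET state = 'scheduled', owner = NULL, lease_until = NULL "
        "WHERE state = 'running' AND lease_until < ?", (now,))
    db.commit()
    return cur.rowcount
```

A worker that stops heartbeating never extends lease_until, so the next requeue_expired sweep makes its job claimable again. Heartbeat renewal itself is left out of the sketch.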

Deep dives / trade-offs

Exactly-once is a lie at the execution level: offer at-least-once + idempotent handlers (idempotency keys), or at-most-once with acceptable loss — articulate the choice.
Retry storms and poison jobs: exponential backoff + DLQ; per-queue rate limits.
Clock skew across nodes, misfire policies (fire-immediately vs skip), and hot-shard mitigation for popular run times (e.g., midnight cron herds — add jitter).
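
As a rough sketch of the backoff and DLQ point, the function below returns a randomized delay for each retry and signals when to dead-letter. It uses "full jitter" (a uniform draw between 0 and the exponential ceiling), which is one common choice rather than the only one. The name `next_retry_delay` and every default value are assumptions made for illustration.

```python
import random
from typing import Optional


def next_retry_delay(attempt: int, base_s: float = 1.0, cap_s: float = 300.0,
                     max_attempts: int = 5,
                     rng: Optional[random.Random] = None) -> Optional[float]:
    """Seconds to wait before retry `attempt` (1-based), or None to dead-letter."""
    if attempt > max_attempts:
        # Retries exhausted: the caller should move the job to the DLQ.
        return None
    if rng is None:
        rng = random.Random()
    # Exponential ceiling: base, 2*base, 4*base, ... capped at cap_s.
    ceiling = min(cap_s, base_s * 2 ** (attempt - 1))
    # Full jitter: spread retries uniformly so failures do not retry in lockstep.
    return rng.uniform(0, ceiling)
```

The same idea of a small random offset applies to the midnight cron herd: adding a bounded random delay to each job's fire time spreads the load across the window instead of concentrating it on one second.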
