Design a Job Scheduler
via LeetCode
Problem
- Design a distributed job scheduler: run one-off jobs at a given time and recurring (cron-like) jobs, reliably, across a worker fleet.
Functional requirements
- submit(job, runAt | cronExpr), cancel, job status/history; per-job retry policy with backoff; priorities; misfire handling (what happens to runs missed during downtime).
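The requirements above map onto a small API surface. The following is a minimal in-memory sketch, not a definitive design: the names `Job`, `RetryPolicy`, `MisfirePolicy`, and `InMemoryScheduler`, along with all default values, are assumptions made for illustration. Cron expressions are stored but not parsed, and history is omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Dict, Optional
import uuid


class MisfirePolicy(Enum):
    FIRE_IMMEDIATELY = "fire_immediately"  # run as soon as the scheduler is back
    SKIP = "skip"                          # drop missed runs, wait for the next slot


@dataclass
class RetryPolicy:
    max_attempts: int = 3
    base_delay: timedelta = timedelta(seconds=5)
    max_delay: timedelta = timedelta(minutes=10)

    def delay_for(self, attempt: int) -> timedelta:
        """Exponential backoff: base * 2^(attempt - 1), capped at max_delay."""
        exponent = min(max(attempt - 1, 0), 32)  # bound the exponent to avoid overflow
        seconds = self.base_delay.total_seconds() * (2 ** exponent)
        return timedelta(seconds=min(seconds, self.max_delay.total_seconds()))


@dataclass
class Job:
    payload: dict
    run_at: Optional[datetime] = None   # one-off jobs
    cron_expr: Optional[str] = None     # recurring jobs (stored, not parsed here)
    priority: int = 0
    retry: RetryPolicy = field(default_factory=RetryPolicy)
    misfire: MisfirePolicy = MisfirePolicy.FIRE_IMMEDIATELY
    state: str = "scheduled"
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class InMemoryScheduler:
    """Stand-in for the real service; a dict replaces the durable job store."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def submit(self, job: Job) -> str:
        # Exactly one of run_at / cron_expr must be set.
        if (job.run_at is None) == (job.cron_expr is None):
            raise ValueError("provide exactly one of run_at or cron_expr")
        self._jobs[job.id] = job
        return job.id

    def cancel(self, job_id: str) -> bool:
        job = self._jobs.get(job_id)
        if job is None or job.state != "scheduled":
            return False                 # unknown, already running, or already done
        job.state = "cancelled"
        return True

    def status(self, job_id: str) -> str:
        return self._jobs[job_id].state
```

Keeping retry and misfire policy on the job itself, rather than on the scheduler, is one way to satisfy the "per-job" wording; a real design would also persist these fields with the job row.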
Non-functional requirements
- Scale to discuss: millions of scheduled jobs, thousands of executions/sec at peak; a job must run even if the node that scheduled it dies; no duplicate concurrent runs of the same job (or make duplicates safe).
Key components
- Job store: jobs(id, schedule, payload, state, next_run_at, owner, attempt) — an indexed next_run_at is the core query surface.
- Trigger layer: partitioned pollers scanning WHERE next_run_at <= now AND state='scheduled' with row-claiming (SELECT … FOR UPDATE SKIP LOCKED / CAS on state); the claim is what prevents double-firing; timing wheel/delay queue in memory for fine granularity.
- Dispatch: claimed jobs → execution queue → worker pool; workers heartbeat + lease per running job (lease expiry ⇒ requeue, which handles dead workers); recurring jobs compute next_run_at on completion (or fixed-rate at claim).
- Coordinator/partitioning: shard jobs by hash(job_id) across scheduler instances; leader election or partition assignment (ZK/etcd-style) so each shard has exactly one active poller.
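The claim, lease, and sharding ideas in the components above can be sketched with Python's built-in `sqlite3`. This is an illustration under stated assumptions, not a production design: SQLite has no `FOR UPDATE SKIP LOCKED`, so the sketch uses the CAS-on-state variant; the `lease_until` column, the function names, and the 30-second lease are hypothetical; and the schedule column is left out for brevity.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE jobs (
    id          TEXT PRIMARY KEY,
    payload     TEXT NOT NULL,
    state       TEXT NOT NULL DEFAULT 'scheduled',
    next_run_at REAL NOT NULL,      -- epoch seconds
    owner       TEXT,
    lease_until REAL,               -- hypothetical column holding the worker lease
    attempt     INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX idx_jobs_due ON jobs (state, next_run_at);
"""


def shard_of(job_id: str, num_shards: int) -> int:
    """Stable shard assignment; Python's built-in hash() is salted per process."""
    digest = hashlib.sha256(job_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def claim_due(conn, owner, now, lease_secs=30.0, limit=10):
    """Claim due jobs via compare-and-set on state; return the ids this caller won."""
    rows = conn.execute(
        "SELECT id FROM jobs WHERE state = 'scheduled' AND next_run_at <= ? "
        "ORDER BY next_run_at LIMIT ?",
        (now, limit),
    ).fetchall()
    claimed = []
    for (job_id,) in rows:
        cur = conn.execute(
            "UPDATE jobs SET state = 'running', owner = ?, lease_until = ?, "
            "attempt = attempt + 1 WHERE id = ? AND state = 'scheduled'",
            (owner, now + lease_secs, job_id),
        )
        if cur.rowcount == 1:       # CAS succeeded: nobody else claimed it first
            claimed.append(job_id)
    conn.commit()
    return claimed


def requeue_expired(conn, now):
    """Lease expiry => requeue: lapsed running jobs become claimable again."""
    cur = conn.execute(
        "UPDATE jobs SET state = 'scheduled', owner = NULL, lease_until = NULL "
        "WHERE state = 'running' AND lease_until < ?",
        (now,),
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO jobs (id, payload, next_run_at) VALUES ('j1', '{}', 100.0)")
conn.execute("INSERT INTO jobs (id, payload, next_run_at) VALUES ('j2', '{}', 500.0)")
first = claim_due(conn, "poller-a", now=200.0)    # j1 is due, j2 is not
second = claim_due(conn, "poller-b", now=200.0)   # j1 is already claimed
```

The conditional `UPDATE ... WHERE state = 'scheduled'` is what makes the claim safe: two pollers may read the same due row, but only one update changes it. A worker that heartbeats would extend `lease_until`; one that dies simply stops doing so, and `requeue_expired` returns its job to the pool.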
Deep dives / trade-offs
- Exactly-once is a lie at the execution level: offer at-least-once + idempotent handlers (idempotency keys), or at-most-once with acceptable loss — articulate the choice.
- Retry storms and poison jobs: exponential backoff + DLQ; per-queue rate limits.
- Clock skew across nodes, misfire policies (fire-immediately vs skip), and hot-shard mitigation for popular run times (e.g., midnight cron herds — add jitter).
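Three of the trade-offs above lend themselves to a short sketch: idempotent handling under at-least-once delivery, stable jitter for herds at popular run times, and misfire resolution. All names here (`IdempotentRunner`, `jittered_run_at`, `resolve_misfire`) are hypothetical, the dedup store is an in-process dict standing in for a durable table, and collapsing several missed runs into a single catch-up run is an assumption about what "fire-immediately" should mean.

```python
import hashlib
from typing import Callable, Dict, List


class IdempotentRunner:
    """Runs a handler once per idempotency key, even if the job is delivered twice."""

    def __init__(self) -> None:
        # Stand-in for a durable dedup table; not safe across processes as written.
        self._results: Dict[str, object] = {}

    def run(self, key: str, handler: Callable[[], object]) -> object:
        if key in self._results:
            return self._results[key]   # duplicate delivery: reuse the stored result
        result = handler()
        self._results[key] = result
        return result


def jittered_run_at(job_id: str, scheduled_at: float, max_jitter_secs: int) -> float:
    """Spread a herd (e.g. midnight crons) over a window with a stable per-job offset."""
    digest = hashlib.sha256(job_id.encode()).digest()
    offset = int.from_bytes(digest[:8], "big") % (max_jitter_secs + 1)
    return scheduled_at + offset


def resolve_misfire(missed: List[float], now: float, policy: str) -> List[float]:
    """Decide which runs to fire for slots missed during downtime."""
    if not missed or policy == "skip":
        return []       # drop missed runs; the next regular slot still fires
    if policy == "fire_immediately":
        return [now]    # assumption: collapse all missed runs into one catch-up run
    raise ValueError(f"unknown misfire policy: {policy}")


# An idempotency key built from job id + scheduled slot identifies one logical run.
runner = IdempotentRunner()
side_effects = []
key = "job-42:1700000000"
runner.run(key, lambda: side_effects.append("charged") or "ok")
runner.run(key, lambda: side_effects.append("charged") or "ok")  # redelivery is a no-op
```

Deriving the jitter from a hash of the job id, rather than from a random number, keeps each job's offset the same on every run, so a job scheduled for midnight always fires at the same point inside the window.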