Design Hotel Booking Data Population
via LeetCode
Problem
Design the API and the data-population component of a hotel booking system: how hotel, room, and availability data gets into the system (from suppliers/vendors) and how it is modeled and served.
Functional requirements
- Ingest hotel content (property details, room types, photos), rates, and availability from external suppliers (feeds/APIs) and internal extranet edits.
- Serve a clean read API: hotel details, room availability for a date range, prices.
Non-functional requirements
- Supplier feeds arrive both as large full snapshots and as incremental deltas.
- Data must converge to the same state despite duplicates and out-of-order updates.
- Serving reads must stay fast while ingest is running.
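The convergence requirement can be illustrated with a version-guarded upsert. This is a minimal sketch, not a definitive implementation: it assumes each supplier record carries a version that only ever increases, and the names SupplierUpdate and apply_update are hypothetical, with a plain dict standing in for the canonical store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupplierUpdate:
    supplier: str    # hypothetical supplier code
    record_id: str   # the supplier's own id for the record
    version: int     # assumption: increases with every change to the record
    rooms_left: int


def apply_update(store: dict, update: SupplierUpdate) -> bool:
    """Write the update only if it is newer than what is stored.

    Returns True when the store changed, False for duplicates and stale updates.
    """
    key = (update.supplier, update.record_id)
    current = store.get(key)
    if current is not None and current.version >= update.version:
        return False  # duplicate delivery or an older update arriving late
    store[key] = update
    return True


if __name__ == "__main__":
    v1 = SupplierUpdate("acme", "room-1", 1, 5)
    v2 = SupplierUpdate("acme", "room-1", 2, 3)
    in_order, shuffled = {}, {}
    for u in (v1, v2, v2):      # in order, with a duplicate
        apply_update(in_order, u)
    for u in (v2, v1, v1):      # out of order, with a duplicate
        apply_update(shuffled, u)
    print(in_order == shuffled)  # both stores end on version 2
```

Because the guard compares versions rather than arrival order, replaying a whole feed is safe: every replayed record is rejected as a duplicate.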
Key components
- Ingestion pipeline: fetch/receive feed → validate/normalize (schema mapping per supplier) → dedupe/merge policy → write to the canonical store → publish change events to rebuild read models/caches.
- Data model: Hotel(id, name, location, amenities) 1—N RoomType(id, hotel_id, capacity, features) 1—N RatePlan / AvailabilityByDate(room_type_id, date, price, rooms_left).
- API surface: GET /hotels/{id}, GET /hotels/{id}/availability?checkin&checkout&guests, admin/bulk upsert endpoints for the populator.
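The data model and the availability endpoint above can be sketched together. The following is an illustrative in-memory version under several assumptions: quote_stay is a hypothetical helper (not part of any stated API), checkout is treated as exclusive, price is an integer in minor currency units, and plain lists stand in for database tables.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RoomType:
    id: str
    hotel_id: str
    capacity: int


@dataclass(frozen=True)
class AvailabilityByDate:
    room_type_id: str
    date: date
    price: int       # assumption: minor currency units, e.g. cents
    rooms_left: int


def stay_nights(checkin: date, checkout: date) -> list:
    """Nights occupied by a stay; the checkout date itself is not a night."""
    return [checkin + timedelta(days=i) for i in range((checkout - checkin).days)]


def quote_stay(room_types, rows, hotel_id, checkin, checkout, guests) -> dict:
    """Map room_type_id -> total price for room types bookable on every night."""
    by_key = {(r.room_type_id, r.date): r for r in rows}
    quotes = {}
    for rt in room_types:
        if rt.hotel_id != hotel_id or rt.capacity < guests:
            continue
        nights = [by_key.get((rt.id, d)) for d in stay_nights(checkin, checkout)]
        # A missing row or a sold-out night makes the whole stay unavailable.
        if nights and all(n is not None and n.rooms_left > 0 for n in nights):
            quotes[rt.id] = sum(n.price for n in nights)
    return quotes
```

A handler for GET /hotels/{id}/availability would call something like quote_stay with the parsed checkin, checkout, and guests parameters; in a real store the dictionary lookup becomes a range query on (room_type_id, date).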
Deep dives / trade-offs
- Availability by date-row granularity vs ranges: date rows are write-heavy (one row per room type per night) but keep queries simple, which is why they are the usual choice.
- Idempotent upserts keyed by supplier record id + version so replays are safe; conflict policy when two suppliers describe the same property (source-of-truth ranking).
- Snapshot vs delta reconciliation: periodic full-sync to repair drift; blue/green or shadow tables so serving never reads a half-loaded feed.
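The blue/green idea in the last point can be sketched as building a shadow copy and switching to it in one step. This is a simplified single-process illustration; SnapshotStore and its methods are hypothetical names, and a real system would more likely swap a table name, alias, or version pointer in the database.

```python
class SnapshotStore:
    """Serves reads from a live table while a full snapshot loads into a shadow one."""

    def __init__(self):
        self._live = {}

    def get(self, key):
        # Readers only ever see a fully loaded table.
        return self._live.get(key)

    def load_snapshot(self, records):
        """Build the shadow table completely, then switch to it in one assignment.

        If iterating `records` raises midway, the live table is left untouched.
        """
        shadow = {}
        for key, value in records:
            shadow[key] = value
        self._live = shadow  # the swap: a single reference rebind


if __name__ == "__main__":
    store = SnapshotStore()
    store.load_snapshot([("h1", "Hotel One"), ("h2", "Hotel Two")])
    # A later full sync that no longer lists h2 repairs the drift by dropping it.
    store.load_snapshot([("h1", "Hotel One (renamed)")])
    print(store.get("h1"), store.get("h2"))
```

Because the full sync replaces the table rather than patching it, records the supplier has silently dropped disappear from serving, which deltas alone would not guarantee.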