Expedia Group · DSA · L3 · Technical Interview 1

Hotels with Most Preferred Words in Reviews

via LeetCode

Problem

Given a list of preferred words and a set of hotels, each with one or more review texts, return the hotel id(s) whose reviews contain the most occurrences of preferred words.

Input / Output

Input: string[] preferredWords; list of (hotelId, review text) pairs (a hotel may have many reviews).
Output: hotel id(s) with the maximum preferred-word count (clarify tie handling — often smallest id or all ids).

Constraints

Total review text up to ~10^6 chars; matching must be whole-word and case-insensitive (clarify punctuation stripping).
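
The whole-word, case-insensitive requirement comes down to how reviews are tokenized. A minimal sketch, assuming words are runs of ASCII letters only (so digits, apostrophes, and other punctuation act as separators, and "don't" splits into "don" and "t"); the helper name `tokenize` is hypothetical, and the real punctuation rules should be confirmed with the interviewer:

```python
import re

def tokenize(text: str) -> list[str]:
    """Lowercase the text and return its maximal runs of ASCII letters.

    Assumption: anything that is not a-z after lowercasing is a separator.
    """
    return re.findall(r"[a-z]+", text.lower())

print(tokenize("Clean room, very quiet."))         # ['clean', 'room', 'very', 'quiet']
print("clean" in tokenize("Cleanliness matters"))  # False: whole-word match
print("clean" in "Cleanliness matters".lower())    # True: substring check over-matches
```

Comparing whole tokens rather than searching for substrings is what keeps "cleanliness" from being counted as "clean".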

Example

preferred = ["clean","quiet"], reviews: h1 "Clean room, very quiet.", h2 "quiet street" → h1 (2 vs 1).

Expected approach

Store the lowercased preferred words in a hash set. For each review, split the text on non-letter characters, lowercase each token, and count the tokens found in the set, accumulating the count per hotel in a map; then take the hotel(s) with the maximum count. Tokenizing and counting take time linear in the total review length. Talking points: word-boundary correctness ("cleanliness" must not match "clean", which is why a substring contains() check is wrong), stemming as an extension, and a heap for top-k if the interviewer asks for a ranked list of hotels rather than only the best one.
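
The approach above can be sketched as follows. This is one possible implementation under stated assumptions, not a reference solution: words are runs of ASCII letters, ties return all tied ids in sorted order, and the function names (`count_preferred`, `best_hotels`, `top_k_hotels`) are hypothetical.

```python
import heapq
import re
from collections import defaultdict

def count_preferred(preferred_words, reviews):
    """Map each hotel id to its total number of preferred-word tokens."""
    preferred = {w.lower() for w in preferred_words}
    counts = defaultdict(int)
    for hotel_id, text in reviews:
        # Assumption: a word is a maximal run of ASCII letters.
        tokens = re.findall(r"[a-z]+", text.lower())
        # Adding 0 still registers hotels with no matching words.
        counts[hotel_id] += sum(1 for tok in tokens if tok in preferred)
    return dict(counts)

def best_hotels(preferred_words, reviews):
    """Return every hotel id tied for the maximum count, sorted."""
    counts = count_preferred(preferred_words, reviews)
    if not counts:
        return []
    best = max(counts.values())
    return sorted(h for h, c in counts.items() if c == best)

def top_k_hotels(preferred_words, reviews, k):
    """Return up to k hotel ids ranked by count (descending), then by id."""
    counts = count_preferred(preferred_words, reviews)
    return heapq.nsmallest(k, counts, key=lambda h: (-counts[h], h))

reviews = [("h1", "Clean room, very quiet."), ("h2", "quiet street")]
print(best_hotels(["clean", "quiet"], reviews))      # ['h1']
print(top_k_hotels(["clean", "quiet"], reviews, 2))  # ['h1', 'h2']
```

If the interviewer wants only the smallest id on a tie, taking the first element of the `best_hotels` result covers that variant.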
