Design a String Hash Function with Minimum Collisions
viaLeetCode
Prompt Design a hash function for strings that minimizes collisions, and justify its distribution properties.
Be ready to discuss
- Polynomial rolling hash: h(s) = (s[0]*p^(n-1) + … + s[n-1]) mod m; computing it incrementally in O(n).
- Choice of base p (larger than the alphabet, e.g. 31/53 for letters) and modulus m (a large prime, e.g. 10^9+7, or 2^61−1) — why primes reduce systematic collisions.
- What makes a hash "good": uniformity over buckets, avalanche behaviour, cheap evaluation; contrast with additive hashing (sum of chars) and why it clusters ("listen"/"silent" collide).
- Birthday bound: expected collisions grow ~n^2/2m — how to size m or use double hashing (two independent mods) to make collision probability negligible.
- Real-world references: Java's String.hashCode (base 31), FNV, MurmurHash; separate chaining vs open addressing when collisions do happen; adversarial inputs and randomized bases (hash flooding).
asked …