2dbi

Design a String Hash Function with Minimum Collisions

viaLeetCode

Prompt Design a hash function for strings that minimizes collisions, and justify its distribution properties.

Be ready to discuss

  • Polynomial rolling hash: h(s) = (s[0]*p^(n-1) + … + s[n-1]) mod m; computing it incrementally in O(n).
  • Choice of base p (larger than the alphabet, e.g. 31/53 for letters) and modulus m (a large prime, e.g. 10^9+7, or 2^61−1) — why primes reduce systematic collisions.
  • What makes a hash "good": uniformity over buckets, avalanche behaviour, cheap evaluation; contrast with additive hashing (sum of chars) and why it clusters ("listen"/"silent" collide).
  • Birthday bound: expected collisions grow ~n^2/2m — how to size m or use double hashing (two independent mods) to make collision probability negligible.
  • Real-world references: Java's String.hashCode (base 31), FNV, MurmurHash; separate chaining vs open addressing when collisions do happen; adversarial inputs and randomized bases (hash flooding).
Add a follow-up question they asked
No follow-ups yet. Be the first to add one.
asked …
LeaderboardSalary
Language
Account