Tiny Pointers Disprove 40-Year-Old Conjecture on Hash Tables

Andrew Krapivin, an undergraduate at Rutgers University, came across a paper on “tiny pointers” that would change his life. Two years later, while working through its ideas, he devised a new kind of hash table that found and inserted elements faster than expected. Researchers Martín Farach-Colton and William Kuszmaul verified Krapivin’s discovery and showed that it disproved a 40-year-old conjecture on hash tables.

Hash tables are a fundamental data structure in computer science, allowing users to query, delete, or insert elements efficiently. They were first developed in the early 1950s and have been extensively studied since then. Researchers sought to determine the speed limits for certain operations, such as searching or inserting elements.
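
To make the three operations concrete, here is a minimal sketch of a fixed-capacity, open-addressing hash table in Python. It uses linear probing purely for illustration; this is a textbook scheme, not the design described in this article, and the class and method names (OpenAddressingTable, insert, query, delete) are hypothetical.

```python
from typing import Any, Hashable, Iterator, Optional

_EMPTY = object()    # marks a slot that has never been used
_DELETED = object()  # tombstone: marks a slot freed by delete()


class OpenAddressingTable:
    """Illustrative fixed-size hash table using linear probing (no resizing)."""

    def __init__(self, capacity: int = 8) -> None:
        self._capacity = capacity
        self._keys: list = [_EMPTY] * capacity
        self._values: list = [None] * capacity

    def _probe(self, key: Hashable) -> Iterator[int]:
        # Visit every slot once, starting at the key's hashed position.
        start = hash(key) % self._capacity
        for i in range(self._capacity):
            yield (start + i) % self._capacity

    def insert(self, key: Hashable, value: Any) -> None:
        first_free: Optional[int] = None
        for idx in self._probe(key):
            slot = self._keys[idx]
            if slot is _EMPTY:
                # Key cannot appear later in the probe sequence; stop here.
                if first_free is None:
                    first_free = idx
                break
            if slot is _DELETED:
                # Remember the first reusable slot, but keep looking for the key.
                if first_free is None:
                    first_free = idx
            elif slot == key:
                self._values[idx] = value  # key already present: overwrite
                return
        if first_free is None:
            raise RuntimeError("table is full")
        self._keys[first_free] = key
        self._values[first_free] = value

    def query(self, key: Hashable) -> Optional[Any]:
        for idx in self._probe(key):
            slot = self._keys[idx]
            if slot is _EMPTY:
                return None  # an untouched slot ends the search
            if slot is not _DELETED and slot == key:
                return self._values[idx]
        return None

    def delete(self, key: Hashable) -> bool:
        for idx in self._probe(key):
            slot = self._keys[idx]
            if slot is _EMPTY:
                return False
            if slot is not _DELETED and slot == key:
                # Leave a tombstone so later keys in the chain stay reachable.
                self._keys[idx] = _DELETED
                self._values[idx] = None
                return True
        return False
```

The table deliberately never grows, because the questions discussed here concern how long operations take as a table of fixed size fills up.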

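How fast a hash table can be depends on how full it is. Researchers use x to describe fullness: a table with x = 100 is 99% full, and one with x = 1,000 is 99.9% full. With the previously best-known method, finding an empty slot in the worst case took time proportional to x. In 1985, Andrew Yao conjectured that, for hash tables with a certain set of properties, this limit could never be beaten. Krapivin broke the barrier with a new hash table design that achieves the optimal bound for worst-case queries and insertions: time proportional to (log x)^2, far faster than x.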
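
The gap between x and (log x)^2 can be illustrated with a small Python sketch. It assumes an idealized model in which a fraction 1/x of the slots is empty and each probe independently hits an empty slot with probability 1/x, so the number of probes follows a geometric distribution with mean x. The function names and the choice of base-2 logarithm are assumptions made for illustration; the bound is stated only up to a constant factor, and the sketch does not implement Krapivin's design.

```python
import math
import random


def simulate_random_probes(x: float, trials: int, seed: int = 0) -> float:
    """Average number of random probes needed to hit an empty slot
    when each probe succeeds independently with probability 1/x."""
    rng = random.Random(seed)
    p_empty = 1.0 / x
    total = 0
    for _ in range(trials):
        probes = 1
        # Keep probing until a probe lands on an empty slot.
        while rng.random() >= p_empty:
            probes += 1
        total += probes
    return total / trials


def polylog_bound(x: float) -> float:
    """(log x)^2, using base 2 as an illustrative choice of logarithm."""
    return math.log2(x) ** 2


if __name__ == "__main__":
    for x in (16, 128, 1024):
        avg = simulate_random_probes(x, trials=20_000)
        print(f"x={x}: simulated probes ~ {avg:.1f}, (log2 x)^2 = {polylog_bound(x):.1f}")
```

In this model the simulated average grows in step with x, while (log2 x)^2 grows far more slowly; at x = 1,024 it equals 100.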

The work also produced a second unexpected result: non-greedy hash tables, which are not required to place each new element in the first available slot, can achieve a constant average query time regardless of how full they are. The finding surprised even the authors.

While the results may not have immediate practical applications, they show why it matters to understand data structures like these better: a result of this kind may eventually unlock better performance in practice. As Guy Blelloch said, “It’s beautiful in that it addresses and solves such a classic problem.” Sepehr Assadi noted that, without this work, the right answer might have stayed unknown for another 40 years.

Source: https://www.wired.com/story/undergraduate-upends-a-40-year-old-data-science-conjecture