Data, even if well organized, still requires you to page through a lot of it.
An index helps you quickly jump to specific data you might be interested in.
A hash function $h(k)$ is ...
Taking the modulus, $h(k)\%N$, gives you a pseudorandom (but repeatable) number in $[0, N)$
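A minimal sketch of hashing a key into one of $N$ buckets. The choice of SHA-256 as $h$, the helper name `bucket_of`, and the sample keys are assumptions made for illustration; any well-mixed hash function would do.

```python
import hashlib

def bucket_of(key: str, num_buckets: int) -> int:
    """Map a key to a bucket number in [0, num_buckets)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")  # h(k): a large, well-mixed integer
    return h % num_buckets                 # h(k) % N: always in [0, N)

# Hypothetical sample keys, spread over N = 4 buckets.
keys = ["apple", "banana", "cherry", "date"]
buckets = {k: bucket_of(k, 4) for k in keys}
```

The bucket numbers look random, but the same key always lands in the same bucket, which is what makes lookups possible.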
Idea: Resize the structure as needed
To keep things simple, let's use $$h(k) = k$$
(you wouldn't actually do this in practice)
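One simple way to resize as needed is to double the number of buckets whenever the average bucket gets too full. The sketch below uses $h(k) = k$ as above; the class name `ResizingHashTable`, the doubling policy, and the `max_load` threshold are assumptions for illustration, not the only way to do it.

```python
class ResizingHashTable:
    """Chained hash table that doubles its bucket count when it gets too full."""

    def __init__(self, num_buckets: int = 2, max_load: float = 2.0):
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0
        self.max_load = max_load  # assumed threshold: average records per bucket

    @staticmethod
    def h(k: int) -> int:
        return k  # identity hash, for illustration only

    def _bucket(self, k: int) -> list:
        return self.buckets[self.h(k) % len(self.buckets)]

    def insert(self, k: int, value) -> None:
        bucket = self._bucket(k)
        for i, (existing, _) in enumerate(bucket):
            if existing == k:          # key already present: overwrite
                bucket[i] = (k, value)
                return
        bucket.append((k, value))
        self.size += 1
        if self.size / len(self.buckets) > self.max_load:
            self._resize(2 * len(self.buckets))

    def _resize(self, new_count: int) -> None:
        # Every record is rehashed, because h(k) % N changes when N changes.
        old = self.buckets
        self.buckets = [[] for _ in range(new_count)]
        for bucket in old:
            for k, v in bucket:
                self.buckets[self.h(k) % new_count].append((k, v))

    def get(self, k: int):
        for existing, v in self._bucket(k):
            if existing == k:
                return v
        return None

table = ResizingHashTable()
for k in range(10):
    table.insert(k, f"v{k}")
```

Doubling rehashes every record at once, so an individual insert is occasionally slow even though the average cost stays low.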
"The Case for Learned Index Structures"
by Kraska, Beutel, Chi, Dean, Polyzotis
$f(key) \mapsto position$
(not exactly true, but close enough for today)
Simplified Use Case: Static data with "infinite" prep time.
We have infinite prep time, so fit a (tiny) neural network to the CDF.
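A minimal sketch of the idea, with a least-squares line swapped in for the tiny neural network to keep the example self-contained. The function names `fit_linear_cdf` and `lookup` and the sample keys are assumptions for illustration. The model predicts a position in the sorted array, and the largest error seen during fitting bounds how far the real position can be from the guess.

```python
import bisect

def fit_linear_cdf(keys):
    """Fit position ~ a*key + b over sorted, non-empty keys; record the max error."""
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = (n - 1) / 2
    var = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(keys))
    a = cov / var if var else 0.0
    b = mean_p - a * mean_k
    # Worst-case distance between the predicted and the true position.
    max_err = max(abs(round(a * k + b) - p) for p, k in enumerate(keys))
    return a, b, max_err

def lookup(keys, model, key):
    """Return the position of key in keys, or None if it is absent."""
    a, b, max_err = model
    guess = min(max(round(a * key + b), 0), len(keys) - 1)  # clamp into the array
    lo = max(0, guess - max_err)
    hi = min(len(keys), guess + max_err + 1)
    i = bisect.bisect_left(keys, key, lo, hi)  # search only the error window
    return i if i < len(keys) and keys[i] == key else None

# Hypothetical static data set: sorted once, then only read.
keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
model = fit_linear_cdf(keys)
positions = [lookup(keys, model, k) for k in keys]
```

The better the model fits the CDF, the smaller `max_err` becomes and the less searching is left to do after the prediction.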
if statements are really expensive on modern processors.
Next Class: Using Indexes