by power 3109 days ago
Normal DB indexes are designed to be general-purpose so that they give predictable performance no matter the data you insert. The paper basically describes a more efficient way to look up data by using an index that's tailored to the specific dataset it's built over. It's expected that you'd be able to get a performance increase by doing this.

In more detail, they use neural nets to learn an approximation of the distribution of the data and use that to look up the rough position of each key faster than usual. They still use traditional structures for the "last mile", which is not so easily learned.

You could accomplish the same thing without NNs by using anything that can approximate the distribution of the data. E.g. a histogram would work for some cases, and you could do some PCA and normalization first to deal with more cases. NNs have the advantage that they can learn more complex distributions.
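To make the histogram idea concrete, here's a rough sketch in Python. It is not the paper's method, just the simplest stand-in for "approximate the distribution, then do the last mile with a traditional search". It assumes integer keys held in a sorted in-memory list, and the names (HistogramIndex, find, num_buckets) are made up for illustration. The histogram gives the rough position (a bucket's slice of the array), and a plain binary search finishes the lookup inside that slice.

```python
from bisect import bisect_left


class HistogramIndex:
    """Toy histogram-based index over a sorted list of integer keys."""

    def __init__(self, sorted_keys, num_buckets=64):
        if not sorted_keys:
            raise ValueError("need at least one key")
        self.keys = sorted_keys
        self.lo = sorted_keys[0]
        # +1 so the largest key still maps to a bucket below num_buckets.
        self.span = sorted_keys[-1] - self.lo + 1
        self.num_buckets = num_buckets

        # Count how many keys fall into each equal-width bucket.
        counts = [0] * num_buckets
        for k in sorted_keys:
            counts[self._bucket(k)] += 1

        # Prefix sums: starts[b] is the array position of bucket b's first key.
        self.starts = [0] * (num_buckets + 1)
        for b, c in enumerate(counts):
            self.starts[b + 1] = self.starts[b] + c

    def _bucket(self, key):
        # Integer arithmetic keeps the mapping exact and monotone in the key.
        return (key - self.lo) * self.num_buckets // self.span

    def find(self, key):
        """Return the position of key in the sorted list, or None if absent."""
        if key < self.lo or key > self.keys[-1]:
            return None
        b = self._bucket(key)
        left, right = self.starts[b], self.starts[b + 1]
        # "Last mile": ordinary binary search, restricted to one bucket.
        i = bisect_left(self.keys, key, left, right)
        if i < right and self.keys[i] == key:
            return i
        return None


# Skewed data: squares bunch up at the low end of the key range.
keys = [x * x for x in range(1000)]
index = HistogramIndex(keys, num_buckets=32)
position = index.find(144)
```

With skewed data like this, equal-width buckets end up very uneven, so the last-mile search is longer in the dense buckets. That's the kind of case where a model that fits the distribution better would be expected to help.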
1 comments

The other advantage is more opportunity to parallelize the work, especially with new silicon architectures like Google's TPUs. That is an important part of the equation.