Hacker News
by jmcminis 3116 days ago
As it says in the paper, this might be useful for data warehouses. But, it’s not coming to postgres anytime soon. Index updates on the order of seconds to minutes would be too much for a transactional db.

There is also the cold start problem. How do you start to lay out the data on disk as you begin inserting it? Do you have a pre-trained net and use it at first (inserting where the net thinks the data should be)? The strategy probably differs by index type.

3 comments

Most current storage backends use a log-structured merge tree (LSM tree) or something similar.

The larger layers of an LSM tree are enormous and should be read and rebuilt as rarely as possible.

Being able to predict whether a given element exists in the larger layers at all is quite a bonus: you can skip reading megabytes of data.

Because the larger layers are rebuilt so rarely, training a deep neural model for them is justified.
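To make the "skip the read" idea concrete, here is a minimal sketch. It uses a plain Bloom filter as a stand-in for the learned existence model, not the neural net from the paper. The names `BloomFilter`, `Level`, and `lookup` are made up for illustration, the "layers" are just in-memory dicts, and `None` is assumed never to be a stored value. The filter is built once, when the layer is built, and each read consults it before touching the layer's data.

```python
import hashlib


class BloomFilter:
    """Stand-in existence model: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class Level:
    """One immutable layer; its filter is built once, together with the layer."""

    def __init__(self, items):
        self.data = dict(items)
        self.filter = BloomFilter()
        for key in self.data:
            self.filter.add(key)
        self.reads = 0  # counts simulated expensive reads of this layer

    def get(self, key):
        if not self.filter.might_contain(key):
            return None  # predicted absent: skip the expensive read entirely
        self.reads += 1
        return self.data.get(key)


def lookup(levels, key):
    """Search layers from newest (smallest) to oldest (largest)."""
    for level in levels:
        value = level.get(key)
        if value is not None:
            return value
    return None
```

A learned model would replace `might_contain`. If it can produce false negatives, it needs a fallback (the paper pairs the model with a small backup filter), otherwise reads would miss keys that do exist.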

I cannot confirm that any major SQL database engine has an LSM-tree backend, but NoSQL engines use them plenty: https://en.wikipedia.org/wiki/Log-structured_merge-tree

So you want an LSM tree for inserts and the DNN for reads? Seems OK. You still have to update or retrain the DNN after an insert into a larger layer, which will be expensive. So you’d probably get high latency at the 99th percentile (or some other high percentile).
There are no inserts into the larger layers, only merges. Merges are long-running (usually processed in the background by a separate thread), and that duration justifies training a new net in parallel with the merge.
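A rough sketch of that shape, under loud assumptions: `merge_runs`, `train_existence_model`, and `compact` are hypothetical names, the "model" is just an exact key set standing in for a trained net, and deletes/tombstones are ignored. The point is only the structure: the merge and the training start from the same two input layers, run side by side, and the caller gets the new layer and its model together.

```python
from concurrent.futures import ThreadPoolExecutor


def merge_runs(older, newer):
    """Merge two lists of (key, value) pairs; the newer value wins on duplicate keys."""
    merged = dict(older)
    merged.update(newer)
    return sorted(merged.items())


def train_existence_model(older, newer):
    """Placeholder for training a net: here the 'model' is an exact set of keys."""
    return frozenset(key for key, _ in older) | frozenset(key for key, _ in newer)


def compact(older, newer):
    """Run the merge and the training concurrently, then return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        merged_future = pool.submit(merge_runs, older, newer)
        model_future = pool.submit(train_existence_model, older, newer)
        # Both inputs are immutable layers, so neither task waits on the other.
        return merged_future.result(), model_future.result()
```

In CPython, threads will not speed up CPU-bound work like this because of the GIL. A real engine would do the merge in native code or a separate process, so treat this as a structural sketch only.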
RocksDB is an LSM-tree backend for an SQL engine (MyRocks, for MySQL).
Well, Postgres already has something pretty close in Dexter, made possible by the Hypothetical Indexes extension. Dexter can either create indexes automatically, using concurrent index creation, or build a list that you can load at your convenience.

I'm just waiting for it to make it into one of the big PG providers.

https://medium.com/@ankane/introducing-dexter-the-automatic-...

> it’s not coming to postgres anytime soon

Isn't Peloton close? http://pelotondb.io/