| HN Mirror


by thesz 3116 days ago

Most current storage backends have a log-structured merge tree (LSMT) implementation or something like it.

The larger layers of an LSMT are enormous and should be accessed or rebuilt as rarely as possible.

Being able to predict whether a given element exists in the larger layers at all is quite a bonus: when the answer is no, you can skip reading megabytes of data.

Because the larger layers are rebuilt so rarely, training a deep neural model for them is justified.
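To make the idea concrete, here is a minimal Python sketch of a lookup that asks a per-layer existence predictor before touching the layer itself. A small Bloom filter stands in for the neural model (it is a swapped-in classical technique, not a learned one), and the names `BloomFilter`, `Layer`, and `lookup` are made up for illustration.

```python
import hashlib


class BloomFilter:
    """Stand-in for the existence predictor.

    It can wrongly answer "maybe present", but never wrongly answers "absent".
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class Layer:
    """One immutable layer plus its predictor (a dict simulates the on-disk data)."""

    def __init__(self, items):
        self.data = dict(items)
        self.filter = BloomFilter()
        for key in self.data:
            self.filter.add(key)
        self.reads = 0  # counts simulated expensive reads

    def get(self, key):
        if not self.filter.might_contain(key):
            return None  # predictor says "absent": skip the expensive read
        self.reads += 1
        return self.data.get(key)


def lookup(layers, key):
    """Search layers from the smallest/newest to the largest/oldest."""
    for layer in layers:
        value = layer.get(key)
        if value is not None:
            return value
    return None
```

With `small = Layer({"a": 1})` and `large = Layer({f"k{i}": i for i in range(100)})`, calling `lookup([small, large], "k42")` reads the large layer, while most lookups of absent keys never do.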

I cannot verify the existence of LSMT backends for major SQL DB engines, but NoSQL engines use them aplenty: https://en.wikipedia.org/wiki/Log-structured_merge-tree

2 comments

jmcminis 3116 days ago

So you want an LSM for inserts and the DNN for reads? Seems OK. You still have to update/retrain the DNN after an insert into a larger layer, which will be expensive. So you'd probably get high latency at the 99th percentile (or some other high percentile).


thesz 3116 days ago

There are no inserts into larger layers, only merges. Merges take a long time (they are usually processed in the background by a separate thread), and that duration justifies training a new net in parallel with the merge process.
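A rough Python sketch of that arrangement, under loud assumptions: `train_predictor` is a hypothetical placeholder that just builds an exact key set instead of training a net, the layers carry no deletions (so the merged key set is simply the union of the input keys), and Python threads only illustrate the structure rather than give real CPU parallelism.

```python
import threading


def merge_runs(newer, older):
    """Merge two runs of (key, value) pairs; on duplicate keys the newer value wins."""
    merged = dict(older)
    merged.update(newer)
    return sorted(merged.items())


def train_predictor(keys):
    """Hypothetical placeholder for model training: an exact set of keys."""
    return frozenset(keys)


def merge_with_retraining(newer, older):
    """Run the merge and the predictor training side by side, then return both."""
    result = {}

    def do_merge():
        result["run"] = merge_runs(newer, older)

    def do_train():
        # Assumes no deletions: the merged layer holds exactly the union of
        # the input keys, so training need not wait for the merge to finish.
        keys = [k for k, _ in newer] + [k for k, _ in older]
        result["predictor"] = train_predictor(keys)

    threads = [threading.Thread(target=do_merge), threading.Thread(target=do_train)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["run"], result["predictor"]
```

The old layer and its old predictor keep serving reads until both threads finish; only then would the caller swap in the new pair, so readers never see a layer paired with a stale predictor.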


wolf550e 3116 days ago

RocksDB is an LSMT for an SQL engine.
