by rescrv
4768 days ago
I'm the HyperDex developer who did the work on HyperLevelDB. I'll answer your questions in order as best I can.

1. Our internal synchronization mechanism is a simple change. Stock LevelDB does the following:

  place the current thread on the back of wait_queue
  wait for (our thread to be head of wait_queue || thread ahead of us to do our work)
  if work done: exit
  possibly build a batch of our writes and those of the writers queued behind us
  append data to the log
  insert data into the memtable
  signal the next writer, and any writer whose work we finished
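
To make the stock scheme above concrete, here is a minimal C++ sketch of a single-leader writer queue. It is not LevelDB's code: `QueuedDB`, `Writer`, and the two vectors standing in for the log and the memtable are hypothetical names chosen for illustration, and the batching rule (take everything currently queued) is a simplifying assumption.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-call record; it lives on the calling thread's stack.
struct Writer {
  std::string data;
  bool done = false;  // set when a leader has done our work for us
  std::condition_variable cv;
};

// Hypothetical stand-in for the database; not LevelDB's DBImpl.
class QueuedDB {
 public:
  void Write(const std::string& data) {
    Writer w;
    w.data = data;
    std::unique_lock<std::mutex> lock(mu_);
    wait_queue_.push_back(&w);  // place ourselves on the back of the queue
    // Wait until we are the head, or a thread ahead of us did our work.
    w.cv.wait(lock, [&] { return w.done || wait_queue_.front() == &w; });
    if (w.done) return;

    // We are the leader: batch our write with everything queued behind us.
    std::vector<Writer*> batch(wait_queue_.begin(), wait_queue_.end());

    // Only the leader touches the log and memtable, so the lock can be
    // dropped while it does the actual work.
    lock.unlock();
    for (Writer* b : batch) {
      log_.push_back(b->data);       // append data to the log
      memtable_.push_back(b->data);  // insert data into the memtable
    }
    lock.lock();

    // Wake every writer whose work we finished, then the next leader.
    for (Writer* b : batch) {
      wait_queue_.pop_front();
      if (b != &w) {
        b->done = true;
        b->cv.notify_one();
      }
    }
    if (!wait_queue_.empty()) wait_queue_.front()->cv.notify_one();
  }

  const std::vector<std::string>& log() const { return log_; }
  const std::vector<std::string>& memtable() const { return memtable_; }

 private:
  std::mutex mu_;
  std::deque<Writer*> wait_queue_;
  std::vector<std::string> log_;       // stand-in for the write-ahead log
  std::vector<std::string> memtable_;  // stand-in for the memtable
};
```

However many threads call `Write`, only one of them is ever appending to the log or memtable at a time; the rest are parked on their condition variables.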

HyperLevelDB does this a little differently. We made the log and the memtable concurrent data structures, so that multiple threads can write to each one at the same time. We then do a little synchronization to ensure that we don't reveal the writes to readers in the wrong order:

  get a ticket, indicating the order of our writes
  insert the data into the log
  insert the data into the memtable
  wait for writes with a lower ticket to complete
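
The ticket steps above can be sketched in C++ roughly as follows. This is an illustration under stated assumptions, not the HyperLevelDB implementation: `TicketDB`, `Snapshot`, and the member names are hypothetical, and a small mutex around a `std::vector` and `std::map` stands in for the real concurrent log and memtable.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the database; not HyperLevelDB's DBImpl.
class TicketDB {
 public:
  void Write(const std::string& data) {
    // 1. Get a ticket, indicating the order of our write.
    const uint64_t ticket = next_ticket_.fetch_add(1);

    // 2-3. Insert into the log and the memtable. Writers reach this point
    // in any order; data_mu_ is only a stand-in (assumption) for the
    // concurrent data structures the real system uses.
    {
      std::lock_guard<std::mutex> guard(data_mu_);
      log_.push_back(data);  // log order may differ from ticket order
      memtable_[ticket] = data;
    }

    // 4. Wait for all writes with a lower ticket to complete, then
    // publish ours so that readers see writes in ticket order.
    std::unique_lock<std::mutex> lock(order_mu_);
    published_cv_.wait(lock, [&] { return published_ == ticket; });
    published_ = ticket + 1;
    published_cv_.notify_all();
  }

  // Readers only see writes whose tickets have been published.
  std::vector<std::string> Snapshot() {
    uint64_t limit;
    {
      std::lock_guard<std::mutex> guard(order_mu_);
      limit = published_;
    }
    std::lock_guard<std::mutex> guard(data_mu_);
    std::vector<std::string> out;
    for (const auto& entry : memtable_) {
      if (entry.first >= limit) break;  // inserted but not yet revealed
      out.push_back(entry.second);
    }
    return out;
  }

 private:
  std::atomic<uint64_t> next_ticket_{0};
  std::mutex data_mu_;
  std::vector<std::string> log_;
  std::map<uint64_t, std::string> memtable_;
  std::mutex order_mu_;
  std::condition_variable published_cv_;
  uint64_t published_ = 0;  // every ticket below this is visible
};
```

The only serialized part is the short publish step at the end; the inserts themselves can overlap across threads.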

For the actual implementations, check out the code for LevelDB (Lines 1135-1196 of https://github.com/rescrv/HyperLevelDB/blob/28dad918f2ffb80f...) and HyperLevelDB (Lines 1307-1428 of https://github.com/rescrv/HyperLevelDB/blob/master/db/db_imp...). Effectively, this change moves from a model where there is exactly one writer at a time to one where the bulk of the work (inserting into the log and memtable) is done in parallel by the writer threads.

2. LevelDB provides a GetProperty call. We can inspect the number of files in Level-0 and back off where appropriate. There is no write delay in LevelDB itself. By the end-to-end principle, the storage server is in a better position to decide whether to delay writes, or just keep pushing them into the database.
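
As a sketch of what application-side back-off could look like, the C++ below turns the Level-0 file count into a delay. The function name `BackoffForLevel0`, the threshold of 8 files, the 10 ms step, and the 100 ms cap are all assumptions made up for illustration; they are not values taken from HyperDex or HyperLevelDB.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <string>

// Hypothetical policy: given the string value of the
// "leveldb.num-files-at-level0" property, decide how long the storage
// server should pause before issuing its next write.
std::chrono::milliseconds BackoffForLevel0(const std::string& num_files) {
  const long kSlowdownFiles = 8;  // assumption: start delaying above this
  const long kStepMs = 10;        // assumption: extra delay per extra file
  const long kMaxDelayMs = 100;   // assumption: never wait longer than this

  // A non-numeric value parses as 0, which results in no delay.
  const long files = std::strtol(num_files.c_str(), nullptr, 10);
  if (files <= kSlowdownFiles) return std::chrono::milliseconds(0);

  const long delay = (files - kSlowdownFiles) * kStepMs;
  return std::chrono::milliseconds(std::min(delay, kMaxDelayMs));
}
```

With a real database handle, the input would come from something like `std::string v; db->GetProperty("leveldb.num-files-at-level0", &v);`, and the caller would sleep for the returned duration before its next write. Keeping the policy in a plain function like this leaves the decision with the storage server rather than the database.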