Hacker News
by gleb 5444 days ago
The synchronous writes benchmark is interesting. This is normally bound by the number of seeks your disk can do per second, which is mostly a function of rotational speed. With a 7200 RPM drive you get 7200/60 = 120 of these a second. So the 100 and 110 numbers for the competitors make sense. 2,400 for LevelDB does not.

Is LevelDB batching writes or is there something more interesting going on?
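
For reference, the rotational arithmetic above as a small sketch. The function name and the `rotations_per_write` parameter are made up for illustration, and the model assumes each synchronous write costs one full rotation and nothing else (no seek time, no caching), which is a simplification:

```python
def max_sync_writes_per_second(rpm: float, rotations_per_write: float = 1.0) -> float:
    """Rough upper bound on synchronous writes/s for a spinning disk.

    Assumes every write has to wait `rotations_per_write` full rotations
    and ignores everything else.
    """
    rotations_per_second = rpm / 60.0
    return rotations_per_second / rotations_per_write


if __name__ == "__main__":
    for rpm in (5400, 7200, 10000, 15000):
        bound = max_sync_writes_per_second(rpm)
        print(f"{rpm} RPM -> about {bound:.0f} synchronous writes/s")
```

Under this model a result of 2,400 would need either a much faster disk or fewer than one rotation per committed write, which is the puzzle.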

4 comments

If you are writing sequentially, then you can write more than the number of seeks.

And that is exactly what LevelDB is doing: writing a log (sequential), and when the in-memory chunk is full, writing it to disk sorted (this is also sequential).
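
A toy sketch of that pattern, to make it concrete. This is not LevelDB's code: the class name, file names, JSON record format, and `memtable_limit` are all invented for the example, and a real store would also recycle the log after a flush and merge the sorted files later.

```python
import json
import os
import tempfile


class ToyLogStructuredStore:
    """Hypothetical toy store: append-only log plus sorted flushes."""

    def __init__(self, directory: str, memtable_limit: int = 4):
        self.directory = directory
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.flush_count = 0
        log_path = os.path.join(directory, "write.log")
        self.log = open(log_path, "a", encoding="utf-8")

    def put(self, key: str, value: str) -> None:
        # 1. Append to the log. The file offset only moves forward,
        #    so no random seek is needed per insert.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Update the in-memory chunk; flush it once it is full.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush_memtable()

    def _flush_memtable(self) -> str:
        # Write the whole chunk in key order as one sequential pass.
        name = f"sorted-{self.flush_count:06d}.json"
        path = os.path.join(self.directory, name)
        with open(path, "w", encoding="utf-8") as out:
            for key in sorted(self.memtable):
                out.write(json.dumps([key, self.memtable[key]]) + "\n")
            out.flush()
            os.fsync(out.fileno())
        self.flush_count += 1
        self.memtable.clear()
        return path

    def close(self) -> None:
        self.log.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        store = ToyLogStructuredStore(directory, memtable_limit=3)
        for key in ["pear", "apple", "mango"]:
            store.put(key, key.upper())
        store.close()
        print(sorted(os.listdir(directory)))
```

Note that this toy still calls fsync on every put, so on its own it would not beat the one-rotation-per-commit bound; it only shows why neither write path needs a random seek.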

flushing the log in an LSM is only kinda sequential, sadly

Data structures which require a disk seek per random insert are obsolete. LevelDB is using a Log-Structured Merge Tree, one of many write-optimized data structures (but not the best).

This link, comparing LSM trees with fractal trees, is quite interesting: http://www.quora.com/What-are-the-major-differences-between-...

> Is LevelDB batching writes

Yes, updates can be done in one atomic batch. Please correct me if I'm wrong, but I don't think Tokyo Cabinet allows it without Tokyo Tyrant.
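
To illustrate the general idea of an atomic batch (several updates, one log record, one fsync), here is a minimal sketch. It is not LevelDB's API or on-disk format: `write_batch`, `replay`, and the one-JSON-line-per-batch layout are assumptions for the example, and treating a missing trailing newline as a torn record stands in for the checksums a real system would use.

```python
import json
import os
import tempfile


def write_batch(log_file, updates: dict) -> None:
    """Append all updates as ONE log record, then fsync once."""
    log_file.write(json.dumps(updates, sort_keys=True) + "\n")
    log_file.flush()
    os.fsync(log_file.fileno())


def replay(path: str) -> dict:
    """Rebuild state from the log, applying each batch all-or-nothing."""
    state = {}
    with open(path, encoding="utf-8") as log_file:
        for line in log_file:
            if not line.endswith("\n"):
                break  # incomplete final record: drop the whole batch
            state.update(json.loads(line))
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "batch.log")
        with open(path, "a", encoding="utf-8") as log_file:
            write_batch(log_file, {"k1": "v1", "k2": "v2", "k3": "v3"})
        print(replay(path))
```

Three updates cost one fsync here instead of three, which is one way a batching store can exceed the per-commit seek limit.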

If you write full disk blocks, wouldn't the disk cache hide the seek latency?
Having the disk write cache on would certainly explain it. But that leaves the question of the discrepancy with the competitors' numbers.

You turn off write-through caching on disks when you run a database unless you are willing to accept corruption (which is worse than data loss) on power outage. And that's why you can't get acceptable write performance out of a database without a battery-backed RAID controller (or some other kind of RAM-based write cache with a battery backup).

Here's a simple way to test # fsyncs/s (a.k.a. commit rate) on your system:

  sysbench --test=fileio --file-fsync-freq=1 --file-num=1 \
   --file-total-size=16384 --file-test-mode=rndwr run --max-time=10 \
   | grep "Requests/sec"
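
If sysbench isn't handy, roughly the same measurement can be sketched in Python. This is only an approximation of the command above, not an equivalent: `measure_fsync_rate` and the probe file name are made up, it rewrites a single block instead of doing random writes, and on some platforms `os.fsync` does not force the drive's own cache to flush (macOS is one), so numbers may look better than the true commit rate.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory: str, duration_s: float = 2.0,
                       block_size: int = 4096) -> float:
    """Return write+fsync cycles per second on a small file in `directory`."""
    path = os.path.join(directory, "fsync_probe.bin")
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    count = 0
    try:
        start = time.monotonic()
        deadline = start + duration_s
        while time.monotonic() < deadline:
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, block)
            os.fsync(fd)                  # wait until the OS reports it durable
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.remove(path)
    return count / elapsed


if __name__ == "__main__":
    # Use a directory on the disk you actually care about; the default
    # temp directory may live on a RAM-backed filesystem.
    with tempfile.TemporaryDirectory() as directory:
        rate = measure_fsync_rate(directory, duration_s=1.0)
        print(f"{rate:.0f} fsyncs/s")
```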
If cache is on, any performance discrepancy can be explained away by "usage patterns" :)

Also, do you really mean turn off write-through, or did you mean write-behind? (I can't see how write-through would cause corruption, but maybe I'm missing something...)

Also, I wouldn't be surprised if there's a discrepancy in the flushing code across systems. God knows flushing a file to disk in cross-platform code is an arcane science :)

And finally, as somebody else pointed out, LevelDB seems to order write access sequentially as much as possible.