by z8000 6004 days ago
Thank you for explaining how compaction works in greater detail. As I understand it, the "buffer" that holds write queries (and more) while compaction is active lives in memory, as an ArrayList. Is that correct? http://github.com/mmcgrana/fleetdb/blob/master/src/clj/fleet...

Why not just open a new file when compaction starts instead of using an in-memory buffer? When compaction ends, append the newly opened file to the compacted file, then swap in the compacted file as the current log file.
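
To make the idea concrete, here is a rough sketch in Python. It is not FleetDB's code. It assumes a made-up log format of one `key=value` record per line, where compaction keeps only the last record for each key, and it simulates the writes that arrive during compaction with a plain list. The names `compact_records`, `compact_with_side_log`, and `run_demo` are hypothetical.

```python
import os
import tempfile


def compact_records(lines):
    """Keep only the most recent record per key (assumed 'key=value' lines)."""
    latest = {}
    for line in lines:
        key, _, _ = line.partition("=")
        latest[key] = line
    return list(latest.values())


def compact_with_side_log(log_path, writes_during_compaction):
    """Compact log_path while new writes go to a side file on disk."""
    side_path = log_path + ".side"
    compacted_path = log_path + ".compacted"

    # 1. At compaction start, open a new file to receive incoming writes.
    with open(side_path, "w") as side:
        # 2. Write a compacted copy of the existing log.
        with open(log_path) as src, open(compacted_path, "w") as dst:
            dst.writelines(compact_records(src))
        # Stand-in for writes arriving while compaction runs; a real
        # database would append these concurrently, not afterwards.
        side.writelines(writes_during_compaction)

    # 3. When compaction ends, append the side file to the compacted file.
    with open(side_path) as side, open(compacted_path, "a") as dst:
        for line in side:
            dst.write(line)

    # 4. Swap the compacted file in as the current log file.
    os.replace(compacted_path, log_path)
    os.remove(side_path)


def run_demo():
    """Build a small log, compact it, and return the resulting contents."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "db.log")
        with open(log_path, "w") as f:
            f.write("a=1\nb=1\na=2\n")
        compact_with_side_log(log_path, ["c=3\n"])
        with open(log_path) as f:
            return f.read()
```

In this sketch memory use during compaction does not grow with the number of buffered writes, at the cost of an extra file and a copy step at the end.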

I suppose the choice between buffering in memory and buffering on disk would depend on several factors:

1) how much compaction is required and thus how long compaction might take to complete

2) historical write-rate average

3) buffer size threshold

4) compaction time threshold

By thresholding I mean: start buffering in memory and then switch to a file on disk if compaction starts taking "too long" to complete or the buffer in memory becomes "too large".
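
Roughly what I have in mind, as a Python sketch rather than anything from FleetDB. The class name `HybridWriteBuffer`, the `max_bytes` and `max_seconds` parameters, and their default values are all made up for illustration. The clock is injectable only so the time threshold can be exercised without sleeping.

```python
import tempfile
import time


class HybridWriteBuffer:
    """Buffer records in memory, spilling to a temp file past a threshold."""

    def __init__(self, max_bytes=1024, max_seconds=5.0, clock=time.monotonic):
        self.max_bytes = max_bytes      # "too large" threshold
        self.max_seconds = max_seconds  # "too long" threshold
        self.clock = clock
        self.started = clock()          # treated as compaction start time
        self.memory = []                # records held in memory
        self.memory_bytes = 0
        self.spill = None               # temp file, once spilled

    @property
    def on_disk(self):
        return self.spill is not None

    def _should_spill(self, incoming_size):
        too_large = self.memory_bytes + incoming_size > self.max_bytes
        too_long = self.clock() - self.started > self.max_seconds
        return too_large or too_long

    def _spill_to_disk(self):
        # Move everything buffered so far into a file, preserving order.
        self.spill = tempfile.TemporaryFile()
        for record in self.memory:
            self.spill.write(record)
        self.memory = []
        self.memory_bytes = 0

    def append(self, record):
        """Add one record (bytes), switching to disk if a threshold trips."""
        if not self.on_disk and self._should_spill(len(record)):
            self._spill_to_disk()
        if self.on_disk:
            self.spill.write(record)
        else:
            self.memory.append(record)
            self.memory_bytes += len(record)

    def drain(self):
        """Return all buffered bytes in write order and reset the buffer."""
        if self.on_disk:
            self.spill.seek(0)
            data = self.spill.read()
            self.spill.close()
            self.spill = None
            return data
        data = b"".join(self.memory)
        self.memory = []
        self.memory_bytes = 0
        return data
```

For the size threshold alone, Python's standard library already has `tempfile.SpooledTemporaryFile`, which stays in memory until it exceeds `max_size` and then rolls over to disk. The time threshold is the part that needs custom logic like the above.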