by louthy 908 days ago
> a high performance btree

then …

> the kernel seemed to struggle …

What was the struggle? If it's performance, doesn't that contradict your earlier statement?

Genuinely interested in what the issue was, not trying to be a pedant.

2 comments

> What was the struggle?

Performance is always great until you have to hit disk. It's not uncommon to rely on mmap, at which point your disk access is sub-optimal compared with a hand-tailored buffer manager that uses strategies to improve disk reads.

This requires the obligatory https://db.cs.cmu.edu/mmap-cidr2022/

The Linux kernel lets you trigger asynchronous writes of the pages as well as synchronous barriers to ensure they've been written.

You don't need to use direct I/O to have fine control.
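For the synchronous-barrier half of that claim, here is a minimal sketch using Python's standard `mmap` module. The function names `write_page_durably` and `demo` are made up for illustration. I'm assuming the barrier meant here is `msync`, which is what `mmap.flush` calls on Unix; the asynchronous side (for example `sync_file_range` on Linux) is not exposed by the Python standard library, so it is left out.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def write_page_durably(path, page_no, payload):
    """Write payload at the start of one page through mmap, then wait for it to reach disk.

    Hypothetical helper: the file must already be large enough to contain the page.
    """
    if len(payload) > PAGE:
        raise ValueError("payload does not fit in one page")
    offset = page_no * PAGE
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:  # length 0 maps the whole file
            mm[offset:offset + len(payload)] = payload
            # Synchronous barrier for just this page (msync on Unix).
            mm.flush(offset, PAGE)
        # Also flush file metadata; belt and braces for the sketch.
        os.fsync(f.fileno())


def demo():
    """Create a 4-page scratch file, write into page 2, and read it back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.truncate(4 * PAGE)
        path = tmp.name
    try:
        write_page_durably(path, 2, b"hello")
        with open(path, "rb") as f:
            f.seek(2 * PAGE)
            return f.read(5)
    finally:
        os.unlink(path)
```

Flushing a single page range rather than the whole mapping is the "fine control" part: the application decides which dirty pages must be durable and when.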

The purpose of a btree is to optimize for when you are hitting the disk. You can't call that the struggle; that's when the btree sings (though you could consider extensible hashing).

My throughput was significantly higher than sqlite (4x or so, if memory serves), but the kernel spent so much time swapping pages that the mouse cursor stuttered.

A custom page manager would have probably done the trick, but I don't have the technical chops to write one.
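For anyone wondering what the smallest version of such a page manager looks like, here is a rough sketch: a fixed number of in-memory frames, LRU eviction, and write-back of dirty pages. Everything in it (`BufferPool`, `PAGE_SIZE`, the 4096-byte page size, the LRU policy) is an assumption for illustration, not what any particular database does. It uses `os.pread`/`os.pwrite`, so it is Unix-only.

```python
import os
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size for the sketch


class BufferPool:
    """Tiny LRU page cache over a file descriptor (illustrative only)."""

    def __init__(self, fd, capacity):
        self.fd = fd
        self.capacity = capacity
        self.frames = OrderedDict()  # page_no -> [bytearray, dirty flag]
        self.hits = 0
        self.misses = 0

    def _evict(self):
        # Drop the least recently used page, writing it back if modified.
        page_no, (data, dirty) = self.frames.popitem(last=False)
        if dirty:
            os.pwrite(self.fd, data, page_no * PAGE_SIZE)

    def _load(self, page_no):
        if page_no in self.frames:
            self.frames.move_to_end(page_no)  # mark as most recently used
            self.hits += 1
            return self.frames[page_no]
        self.misses += 1
        if len(self.frames) >= self.capacity:
            self._evict()
        data = bytearray(os.pread(self.fd, PAGE_SIZE, page_no * PAGE_SIZE))
        data.extend(bytes(PAGE_SIZE - len(data)))  # zero-fill reads past EOF
        frame = [data, False]
        self.frames[page_no] = frame
        return frame

    def read(self, page_no):
        return bytes(self._load(page_no)[0])

    def write(self, page_no, offset, payload):
        if offset < 0 or offset + len(payload) > PAGE_SIZE:
            raise ValueError("write crosses the page boundary")
        frame = self._load(page_no)
        frame[0][offset:offset + len(payload)] = payload
        frame[1] = True  # page now differs from disk

    def flush(self):
        # Write back every dirty page, then ask the OS to persist them.
        for page_no, frame in self.frames.items():
            if frame[1]:
                os.pwrite(self.fd, frame[0], page_no * PAGE_SIZE)
                frame[1] = False
        os.fsync(self.fd)
```

The point of owning this layer is that the application, not the kernel, picks the eviction policy and the moment dirty pages are written, which is where a real implementation would add things like prefetching or pinning hot btree nodes.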