by tayo42 1705 days ago
Not sure if there's a better place to ask, but I'll just try here. Curious about a design decision in extstore. It seems to include a lot of extra stuff around managing writes and what's in memory and what's on disk. Why do you think this is better than just mmap-ing and letting the OS decide what's in memory (using the fs cache) and what pages are still on disk?

That's an excellent question; it turns out there are a _lot_ of semantics that the OS is covering up for you when using mmap. For instance (this may be fixed by now), any process doing certain mmap syscalls locked access to every open mmap in the OS. So some random cronjob firing could clock your mmap'ed app pretty solidly.

There are also wild bugs; if you google my threads on the LKML you'll find me trying to hunt down a few in the past.

Mainly what I'm doing with extstore is maintaining a clear line between what I want the OS doing and what I want the app doing: a hard rule that the memcached worker threads _cannot_ be blocked for any reason. When they submit work to extstore, they submit to background threads, then return to dequeueing network traffic. If the flash disk hiccups for any reason, it means some queues can bloat, but other ops may still succeed.
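
To make the "workers never block" rule concrete, here is a minimal sketch of that pattern in Python. It is not extstore's actual code (extstore is written in C); the names `IOQueue`, `submit`, `slow_read` and `depth` are hypothetical and only illustrate the idea of handing work to a background thread and returning immediately.

```python
import queue
import threading


class IOQueue:
    """Hypothetical stand-in for an extstore-style IO queue (illustration only)."""

    def __init__(self, slow_read, depth=1024):
        # Bounded queue: a stalled disk makes this queue fill up,
        # but it never stalls the thread that submits work.
        self._q = queue.Queue(maxsize=depth)
        self._slow_read = slow_read  # assumed blocking "read from flash" function
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, key, callback):
        """Called from a worker thread. Never blocks; returns False if the queue is full."""
        try:
            self._q.put_nowait((key, callback))
            return True
        except queue.Full:
            return False

    def _run(self):
        # Background IO thread: the only place a slow disk is allowed to block.
        while True:
            item = self._q.get()
            if item is None:  # shutdown sentinel
                break
            key, callback = item
            callback(key, self._slow_read(key))

    def close(self):
        """Drain the remaining requests, then stop the IO thread."""
        self._q.put(None)
        self._thread.join()


# Usage: the "worker" submits a read and moves on; the result arrives via callback.
fake_flash = {"a": b"1"}
results = []
ioq = IOQueue(fake_flash.get)
ioq.submit("a", lambda k, v: results.append((k, v)))
ioq.close()
```

When the queue is full, `submit` reports failure instead of waiting, so the caller can decide what to do (for example, treat it as a miss) while requests that don't touch the disk keep flowing.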

Further, by controlling when we defrag or drop pages we can be more careful with where writes to flash happen.

TLDR: for predictable performance. Extstore is also a lot simpler than it may sound; it's a handful of short functions built on a lot of design decisions instead of a lot of code building up an algorithm.

Interesting, makes sense. Thanks for the response! I'll have to take a closer look at the code.
Everything that interacts with the disk is in extstore.c; most of the wrapper code that glues memcached to extstore is in storage.c. extstore.c has barely changed since I first wrote it, so there's not much maintenance overhead vs mmap anyway.
Oh, I see, nice. Last commit was almost a year ago, heh.