by cmallen 5972 days ago
Am I the only one wondering why Facebook hasn't implemented a compression backend in memcache, much like Reiser4 and ZFS have done?

They've made it very clear that they're RAM-limited (in particular with respect to capacity), so why not have the processor compress and decompress values on memcache operations, using a fast algorithm at a relatively low compression level?

It's not even like you couldn't tune the algorithm to detect duplicate/similar data and create atomic globs of data that represent multiple informational objects.
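To make the duplicate-detection idea concrete, here is a minimal sketch, assuming values are opaque byte strings and that only exact duplicates are shared (detecting merely similar data would need something more elaborate, such as delta encoding). `DedupStore` and its methods are hypothetical names for illustration, not part of memcached.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct value."""

    def __init__(self):
        self._blobs = {}  # digest -> value bytes (stored once)
        self._refs = {}   # digest -> number of keys pointing at it
        self._keys = {}   # key -> digest

    def set(self, key, value: bytes):
        # Drop any previous value for this key before re-pointing it.
        self.delete(key)
        digest = hashlib.sha256(value).digest()
        if digest not in self._blobs:
            self._blobs[digest] = value
            self._refs[digest] = 0
        self._refs[digest] += 1
        self._keys[key] = digest

    def get(self, key):
        digest = self._keys.get(key)
        return None if digest is None else self._blobs[digest]

    def delete(self, key):
        digest = self._keys.pop(key, None)
        if digest is None:
            return
        self._refs[digest] -= 1
        if self._refs[digest] == 0:
            # Last reference gone, so free the shared blob.
            del self._refs[digest]
            del self._blobs[digest]

    def unique_blobs(self):
        return len(self._blobs)
```

The trade-off is an extra hash per write and a per-key indirection, which only pays off if identical values really are common in the cache.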

It seems like their big cost is putting together machines with tons of RAM for their memcache clusters, so why not bring that cost down?

4 comments

http://highscalability.com/blog/2009/10/26/facebooks-memcach... would indicate that Facebook's memcached instances are sometimes CPU-bound. They've submitted several patches to Memcached to improve its performance so that would back that up as well.

I've wondered the same thing--compression would be enormously helpful to us since we're RAM-bound (even with tons of RAM) and store a lot of easily compressible HTML. Further, our memcached instances show almost no CPU load.
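For anyone who wants to check how compressible their own cached HTML is, a rough sketch along these lines would do, using Python's standard `zlib`. The sample markup below is made up for illustration; real pages will compress differently, so substitute values pulled from your own cache.

```python
import zlib


def compression_ratio(data: bytes, level: int = 1) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, level)) / len(data)


# Made-up, repetitive HTML standing in for a cached page fragment.
sample_html = "".join(
    '<tr><td class="name">user%d</td><td class="score">%d</td></tr>' % (i, i * 7)
    for i in range(200)
).encode("utf-8")

if __name__ == "__main__":
    for level in (1, 6, 9):
        print(level, round(compression_ratio(sample_html, level), 3))
```

Comparing level 1 against level 9 gives a feel for how much extra CPU time buys in saved RAM.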

Well, I guess for their scale, CPU-limiting factors would make compression not worthwhile.

That said, for more common scaling issues, I think compression would be a huge win, especially with Redis-style backends.

Thanks Stephen, great info and highly relevant to the scale issues.
memcache clients already support client-side compression, which compresses the data before it goes over the network. It wouldn't make sense to move that to the server.
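Roughly, client-side compression tends to look like the sketch below: values above a size threshold are compressed before being sent, and a flag stored alongside the item tells the reader whether to decompress. The threshold, the flag value, and the function names are assumptions for illustration, not any particular client's API.

```python
import zlib
from typing import Tuple

MIN_COMPRESS_LEN = 256  # hypothetical threshold: skip tiny values
FLAG_COMPRESSED = 1     # hypothetical flag bit stored with the item


def encode_value(raw: bytes, level: int = 1) -> Tuple[bytes, int]:
    """Return (payload, flags) ready to hand to a cache client's set()."""
    if len(raw) < MIN_COMPRESS_LEN:
        return raw, 0
    packed = zlib.compress(raw, level)
    # Keep the original if compression did not actually save space.
    if len(packed) >= len(raw):
        return raw, 0
    return packed, FLAG_COMPRESSED


def decode_value(payload: bytes, flags: int) -> bytes:
    """Reverse encode_value using the flags stored with the item."""
    if flags & FLAG_COMPRESSED:
        return zlib.decompress(payload)
    return payload
```

Because the server stores whatever bytes it is given, this saves RAM on the server as well as network bandwidth, at the cost of CPU on the clients.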

I'm quite sure they're already doing compression.

It would certainly make sense to do compression while it's stored server-side in RAM if they're RAM capacity limited.
In your suggestion, are fragments of the memory compressed separately, then decompressed on the fly for use?

I thought large-scale distributed servers were more bandwidth-limited, based on this paper (http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keyn...).

No, they said the opposite: that they are surprisingly CPU-bound.