Hacker News new | ask | show | jobs
by scalalang 909 days ago
Hello, Thank you for replying here :)

Many of answers you replied are reasonable and good.

---------

And I just want to add more comments for others.

1. SIEVE is not scan-resistant, so that, I think it should only be applied for web cache workloads (typlically follows power-law distribution)

2. SIEVE is somewhat scalalbe for read-intensive applications (e.g. blog, shop and etc), because it doesn't require to hold a lock on cahce hit.

3. The purpose of golang-fifo is to provide simple and efficient cache implementation (e.g. hashicorp-lru, groupcache).

-> I don't want to beat high-performant, bla-bla, specialized in-memory caches (bigcache, fastcache and etc).

-> Both S3-FIFO and SEIVE have O(n) worst-case cache eviction. and it only happens when all of objects in the cache are hit, which means, a perfect(100%) hit rate in sequences.

4. when increasing contention otter sacrifices 1-2 percent

-> I think that the statement is not enough. The hit rate varies depending on the total number of objects and the size of the cache, so it should be compared relatively. for example, otter's efficiency decreased by 5% compared to single-threaded when lock contention increased (decreased efficiency makes a mean network latency higher, because it may need to conduct heavy operation e.g. re-calculation, database access and so on)

-> Therefore, in certain situations, the efficiency of a cache library may be more important than its QPS. This is not a matter of which one is better, but rather that other people should be able to choose a cache library that suits the specific needs of their application.

5. ghost queue : honetly at that time of writing the code, I didn't deep dive into the bucket table implementation, it may not work same as actual bucket hash table (see here: https://github.com/scalalang2/golang-fifo/issues/16)

---

If anyone here want to raise issues or give contributions for golang-fifo, please visit here https://github.com/scalalang2/golang-fifo

1 comments

1. That's not the problem, yes, most web traces aim for zipf distribution, but you can't say it will work well in real life just based on this check (this was also mentioned in the S3-FIFO discussion here). Because there are many ways to break this best case.

2. There is a small problem here, yes, on 100% reads workload this will show good results, but there is a catch here, you should add even one percent of writes and this cache degrades very badly.

3. This makes perfect sense! But the O(n) worst case is a bit scarier than you describe :(. There it is quite enough to have just a large enough sequence of elements in the main queue with a frequency greater than zero (100000 for example will suffice), in this case you will just reinsert all these elements with global locking, although hashicorp-lru or groupcache will not do this, but just move one element.

4. it's not quite true, losses depend on contention and nothing else, and in your benchmarks you make otter withstand 4 million qps and at such a load it lost only 2% (probably this value can be reduced a lot, I'll have to look at it at the holidays), which at such a load hardly matters much.

5. My point is that you store at least 2 * key size + 8*2 + 8 bytes just to store the fingerprint. And if you also count the pointers that are held for the sake of it, it becomes quite sad.