| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by maypok86 908 days ago

In fact, to say that sieve is better is fundamentally wrong.

1. You only test the hit ratio on a synthetic zipf distribution and claim an advantage without saying a word about the problems (which there are).

2. Checking implementation speed looks very questionable on such benchmarks, because you spend a huge amount of CPU time on synchronization and item generation.

3. Your implementation has at least three problems that otter doesn't: worst case O(n) time complexity of the eviction policy, disgusting scalability, also golang-fifo simply doesn't have as many features as otter, and adding each of them will only slow golang-fifo down even more.

4. Yes, when increasing contention otter sacrifices 1-2 percent (I'll have to see if I can reduce this loss) hit ratio to maintain response speed, but that's what caffeine, ristretto and other famous cache libraries do, because in this case it's more important to maintain scalability than to keep hundreds of goroutines waiting for a mutex to unlock.

5- This structure looks highly questionable to support ghost queue: https://github.com/scalalang2/golang-fifo/blob/main/s3fifo/b.... You are wasting a very large number of bytes just to maintain an item footprint.

1 comments

scalalang 908 days ago

Hello, Thank you for replying here :)

Many of answers you replied are reasonable and good.

---------

And I just want to add more comments for others.

1. SIEVE is not scan-resistant, so that, I think it should only be applied for web cache workloads (typlically follows power-law distribution)

2. SIEVE is somewhat scalalbe for read-intensive applications (e.g. blog, shop and etc), because it doesn't require to hold a lock on cahce hit.

3. The purpose of golang-fifo is to provide simple and efficient cache implementation (e.g. hashicorp-lru, groupcache).

-> I don't want to beat high-performant, bla-bla, specialized in-memory caches (bigcache, fastcache and etc).

-> Both S3-FIFO and SEIVE have O(n) worst-case cache eviction. and it only happens when all of objects in the cache are hit, which means, a perfect(100%) hit rate in sequences.

4. when increasing contention otter sacrifices 1-2 percent

-> I think that the statement is not enough. The hit rate varies depending on the total number of objects and the size of the cache, so it should be compared relatively. for example, otter's efficiency decreased by 5% compared to single-threaded when lock contention increased (decreased efficiency makes a mean network latency higher, because it may need to conduct heavy operation e.g. re-calculation, database access and so on)

-> Therefore, in certain situations, the efficiency of a cache library may be more important than its QPS. This is not a matter of which one is better, but rather that other people should be able to choose a cache library that suits the specific needs of their application.

5. ghost queue : honetly at that time of writing the code, I didn't deep dive into the bucket table implementation, it may not work same as actual bucket hash table (see here: https://github.com/scalalang2/golang-fifo/issues/16)

---

If anyone here want to raise issues or give contributions for golang-fifo, please visit here https://github.com/scalalang2/golang-fifo

maypok86 908 days ago

1. That's not the problem, yes, most web traces aim for zipf distribution, but you can't say it will work well in real life just based on this check (this was also mentioned in the S3-FIFO discussion here). Because there are many ways to break this best case.

2. There is a small problem here, yes, on 100% reads workload this will show good results, but there is a catch here, you should add even one percent of writes and this cache degrades very badly.

3. This makes perfect sense! But the O(n) worst case is a bit scarier than you describe :(. There it is quite enough to have just a large enough sequence of elements in the main queue with a frequency greater than zero (100000 for example will suffice), in this case you will just reinsert all these elements with global locking, although hashicorp-lru or groupcache will not do this, but just move one element.

4. it's not quite true, losses depend on contention and nothing else, and in your benchmarks you make otter withstand 4 million qps and at such a load it lost only 2% (probably this value can be reduced a lot, I'll have to look at it at the holidays), which at such a load hardly matters much.

5. My point is that you store at least 2 * key size + 8*2 + 8 bytes just to store the fingerprint. And if you also count the pointers that are held for the sake of it, it becomes quite sad.