Hacker News new | ask | show | jobs
by NovaX 476 days ago
This was a good attempt but flawed.

1. You should use JMH to handle warmup, jit, timings, averaging of runs, etc. You might enjoy the following talks on the subject, though I am sure there are other gems out there.

- Performance Anxiety: https://wiki.jvmlangsummit.com/images/1/1d/PerformanceAnxiet...

- Benchmarking for Good: https://www.youtube.com/watch?v=SKPdqgD1I2U

- (The Art of) (Java) Benchmarking: https://shipilev.net/talks/j1-Oct2011-21682-benchmarking.pdf

- A Crash Course in Modern Hardware: https://www.youtube.com/watch?v=OFgxAFdxYAQ

- How NOT to Write a Microbenchmark https://www.slideshare.net/slideshow/2002-microbenchmarks/28...

- Anatomy of a flawed microbenchmark https://web.archive.org/web/20110513090823/http://www.ibm.co...

2. An uncontended lock is basically free, as you noticed.

3. Guava does implement the Map interface via asMap(). It also have a concurrencyLevel to adjust its performance. It is based on Java 5's CHM.

4. Caffeine is a Guava Cache rewrite based on Java 8's CHM rewrite, plus many learnings. You might look at its benchmarks for ideas: https://github.com/ben-manes/caffeine/wiki/Benchmarks

5. Data structures are surprisingly tricky. For example see this analysis showing an accidental misunderstanding degrading an LRU to O(n) eviction. https://gist.github.com/ben-manes/6312727adfa2235cb7c5e25cae...

6. It is important to remember that the goal of a benchmark is never which is faster or by how much. It is to ask (a) is it fast enough? (b) might I reach a point where it will not be? and (c) does it degrade unexpectedly? == This is to say the winner is of little interest, once all the choices are good enough then it is about usability, features, documentation, friendliness, and so on.

1 comments

I completely agree with point 6, obviously these microbenchmarks are not representative of much of anything, I was just trying to check my assumptions on some of this stuff.

I actually knew about the Guava asMap(), but I wasn't sure if that introduced overhead and I thought that might be an unfair test.

I have heard of Caffeine, but I haven't used it yet. Maybe I'll write an updated post talking about it.

I'll look into JMH and make a part 2 of this post tomorrow.

In both the asMap() view unwraps the opinionated Cache facade to the underlying bounded map.

There are a lot of little gotchas so it is best to not believe your own results until proven otherwise. A Rust influencer wrote EVMap, eagerly giving a talk about its performance equaling concurrent maps while being much simpler. However since he generated the random numbers as part of the test loop, he did not realize that it uses a lock to compute the next seed. This throttled the benchmark to make the faster maps slower as they created more contention, fully invalidating and inverting the results. Sadly since this was presented as part of a PhD such truths are of little importance and the false claims continue to be shown. That's just to share how innocent mistakes are the norm, but incentives to not correct them are doubly so, and that you should never trust your own results until you've exhausted all attempts to disprove them as invalid.

Sounds like you have the right perspective and a great attitude. Feel welcome to ping me if you run into any troubles or questions as I collab'd on Guava's cache and wrote Caffeine.

I actually just got JMH set up in Gradle, and I'm rewriting my tests to try and give more accurate results. Once I have numbers from JMH I'll write another post, and I'll put a caveat to take them with a grain of salt.

In this specific example, I was actually afraid that randomness taking indeterminate time could be an issue, which is why I pre-allocated all the UUIDs at the beginning, and used the same list as input data for each one; I figured if all the setup was allocated before I started the timer, and every test has the same input, then we can factor out any inefficiencies since that should affect all of them roughly equally.

Anyway, I have wanted to learn JMH for awhile, so this is the push I needed to actually do it.