|
|
|
|
|
by dpezely
2717 days ago
|
|
How does your workload and server spec compare to theirs? Following links to the OSDI'18 paper [1] gives this overview: "Setup. In all experiments, Noria and other storage backends run on an Amazon EC2 c5.4xlarge instance with 16 vCPUs; clients run on separate c5.4xlarge instances unless stated otherwise." That paragraph goes on to explain the nature of the workload. Their Figure 2 gives example SQL commands. [1] https://jon.tsp.io/papers/osdi18-noria.pdf |
|
His read throughput is linear to the number of cores, which means that it must have been run with the same number of cores to threads. Otherwise context switching and sharing of cores would have resulted in sublinear growth (you can get superlinear growth, but it won't look like that). Assuming his numbers are not fake, his github charts indicate a 32+ core machine.
My 2015 benchmarks were on a Azure G4, Xeon E5-2698B v3 @ 2.00GHz (16 core, hyperthreading disabled), 224 GB, Ubuntu 15.04. They show a slight superlinear growth for reads (Caffeine). https://github.com/ben-manes/caffeine/wiki/Benchmarks#server...
Both benchmarks use a Zipf distribution. I believe he said it only supports single-writer semantics, whereas mine is roughly per-entry (per hashbin). So reads should be safe to compare, and we can ignore his slow write rate. His should be faster as I do more work to maintain complex eviction policies, whereas his can reading a pseudo-immutable map with no eviction.
Since he does not perform any cache operations, as it is a concurrent map, we should be comparing against another unbounded concurrent map. That is over 1 billion reads per second on the above hardware. That is why I think something is miserably wrong or I'm misunderstanding something fundamental. Achieve 25M/s is a very disappointing result.