| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by frankmcsherry 2783 days ago

> Something I'd like to see is benchmarks of batch-query performance: If we can queue a couple of queries, then trees should gain a lot from processing them in-order (then, the last query has already warmed the cache for the next; this probably can pay for approximately sorting the batch).

I'm not sure if this is your point, but when you actually look at the uses of trees like the above (often: query processing) then the point look-up times are less relevant for implementations that do not serialize the look-ups.

Another place this shows up is when folks switch from closed-loop testing to open-loop testing (i.e. testing under a fixed rate of look-ups, rather than one or some few at a time). The latency-throughput trade-off curve makes trees look a lot less bad, but they still have a large minimal latency. A hash map with a sorted key (e.g. RobinHood hashing) still works better except at peak loads.

You can take this just a bit further and you find that replacing trees with a sorted list also works great (or, LSM of sorted lists, say). "Great" here just means "has some tolerable minimum latency, and peak throughput".