Hacker News new | ask | show | jobs
by jstimpfle 3351 days ago
Strings are length 10 to 100. A cache line is 64 bytes. A cache miss is < 1000 cycles < 10^3ns. That makes < 1s for 10^6 strings.

Furthermore adjacent strings are likely to be adjacent in memory, from which I would naively expect cache prefetching to succeed.

Additionally, the curve is quadratic (if at all), not linear. I bet the author looked up all n strings from the vector of n strings.

1 comments

I think the curves are linear because the x-axis is log scale but the y-axis is not.
Gotcha. It's probably still quadratic, just judging from the values. You can't need 32 minutes for 10^6 100-byte comparisons (which typically abort after the first character compares unequal. Not that that matters).
It's definitely a strange set of graphs:

Mixed (log/linear) scales on the axes

Only two columns of points on each graph matter (the last two) because the left hand side is all zeros (an artifact of the y axis not being log scale)

The y-axis is sometimes in thousands-of-mega-nanoseconds (with two significant digits) or sometimes in giga-nanoseconds (with one).

Giga-nanoseconds are seconds!

The colors for each line sometimes changes between graphs.