| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by martincmartin 3359 days ago
	The x-axis is the number of iterations, i.e. the number of lookups. The collection is always size 1,000,000. Also, each of those lookups is for the exact same entry, so after the first time, it'll be all cache hits. Well, except for the std::vector, which is what you're talking about... :)

1 comments

jstimpfle 3359 days ago

Strings are length 10 to 100. A cache line is 64 bytes. A cache miss is < 1000 cycles < 10^3ns. That makes < 1s for 10^6 strings.

Furthermore adjacent strings are likely to be adjacent in memory, from which I would naively expect cache prefetching to succeed.

Additionally, the curve is quadratic (if at all), not linear. I bet the author looked up all n strings from the vector of n strings.

link

aarongolliver 3359 days ago

I think the curves are linear because the x-axis is log scale but the y-axis is not.

link

jstimpfle 3359 days ago

Gotcha. It's probably still quadratic, just judging from the values. You can't need 32 minutes for 10^6 100-byte comparisons (which typically abort after the first character compares unequal. Not that that matters).

link

aarongolliver 3358 days ago

It's definitely a strange set of graphs:

Mixed (log/linear) scales on the axes

Only two columns of points on each graph matter (the last two) because the left hand side is all zeros (an artifact of the y axis not being log scale)

The y-axis is sometimes in thousands-of-mega-nanoseconds (with two significant digits) or sometimes in giga-nanoseconds (with one).

Giga-nanoseconds are seconds!

The colors for each line sometimes changes between graphs.

link