| HN Mirror


	by solarexplorer 4676 days ago
	This is actually what hyperthreading is all about: cache misses. I missed that in the article. There are more things missing, actually, but I guess it would be too much to explain it all in a single article: things like caches, coherence protocols, prefetching, and memory disambiguation. Registers are also much more complex, because you have things like register renaming, result forwarding, etc. In the end there are simply far fewer registers than memory locations, which is why you can build registers that are faster than memory.

mikeash 4676 days ago

I thought hyperthreading was able to go beyond this, and e.g. execute the two streams in parallel if one is hitting the FPU and the other is doing integer work, even if neither one is stalled.

And you're right, it's missing a lot because I'm writing an article, not a book. It is fun to explore details, but ultimately you have to stop somewhere.

mistercow 4676 days ago

That was the impression I had too, but if so I can see how "this is actually what hyperthreading is all about" would make sense. Two streams of code are unlikely to have long segments of just-FPU and just-integer work respectively, and it is even more unlikely that those segments will happen to line up during execution. It happens, sure, but the gains would be smallish.

On the other hand, long periods of no cache misses followed by long periods of waiting after a cache miss are exactly what you expect from real code (especially optimized code). So I'd think that you'd have much bigger gains from that. The same goes for branch misprediction.
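To put rough numbers on that argument, here is a minimal Python sketch of a core that switches to another thread whenever the running one misses the cache. Everything in it is an assumption made for illustration: the function names are hypothetical, the burst lengths (10 cycles of work, 30 cycles of waiting) are made up, switching is free, and a real core is far more complicated than this.

```python
def run_sequential(threads):
    """Cycles to run each thread to completion, one after another.

    Each thread is a list of (compute_cycles, stall_cycles) bursts; the
    core sits idle during every stall.
    """
    return sum(compute + stall for bursts in threads for compute, stall in bursts)


def run_switch_on_miss(threads):
    """Cycles when the core switches to another ready thread on each miss.

    Toy assumptions: switching costs nothing, and a burst always runs to
    its miss before the core considers switching.
    """
    pending = [list(bursts) for bursts in threads]  # bursts still to run
    ready_at = [0] * len(threads)  # cycle at which each thread may run again
    now = 0
    while any(pending):
        runnable = [i for i, p in enumerate(pending) if p and ready_at[i] <= now]
        if not runnable:
            # Every unfinished thread is waiting on memory, so the core idles.
            now = min(ready_at[i] for i, p in enumerate(pending) if p)
            continue
        i = runnable[0]
        compute, stall = pending[i].pop(0)
        now += compute             # execute the burst
        ready_at[i] = now + stall  # the miss is outstanding until this cycle
    # The run is over once the last outstanding miss has been serviced.
    return max(now, max(ready_at))


# Made-up workloads: two threads that miss often, two that never miss.
miss_heavy = [[(10, 30)] * 4, [(10, 30)] * 4]
compute_bound = [[(40, 0)] * 4, [(40, 0)] * 4]

print(run_sequential(miss_heavy), run_switch_on_miss(miss_heavy))        # 320 170
print(run_sequential(compute_bound), run_switch_on_miss(compute_bound))  # 320 320
```

In this toy model the miss-heavy pair finishes in a little over half the time, because one thread's waiting is filled with the other's work, while the pair that never misses gains nothing.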

mikeash 4676 days ago

Well, the gains are smallish. Real-world gains from hyperthreading are on the order of 10-20% when you load up a CPU with two threads.

mistercow 4676 days ago

Yeah, but when I said "smallish" I was thinking more on the order of 1%. I would consider 10% actual gains to be quite large given the craziness of what Hyperthreading tries to accomplish.

mikeash 4676 days ago

It may also be a matter of more fully utilizing multiple integer/floating-point units. Say, if the CPU has two integer units but the current code is only using up one of them, then it could run the second hyperthread on the other. I really don't know the details though.
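The two-integer-unit scenario can be sketched as a toy Python model. This is only an illustration under stated assumptions: the function names are hypothetical, each thread is reduced to a total op count plus a made-up limit on how many ops it can issue per cycle (`ilp`, which must be at least 1), and the first thread always gets first pick of the units. Real schedulers do not work this simply.

```python
from math import ceil


def cycles_one_at_a_time(threads, units=2):
    """Run the threads back to back; each is an (ops, ilp) pair.

    A thread alone can use at most min(ilp, units) units per cycle.
    """
    return sum(ceil(ops / min(ilp, units)) for ops, ilp in threads)


def cycles_smt(threads, units=2):
    """Run the threads together, sharing the units within every cycle.

    Toy policy: earlier threads pick first, later ones take what is left.
    """
    remaining = [ops for ops, _ in threads]
    cycles = 0
    while any(remaining):
        free = units
        for i, (_, ilp) in enumerate(threads):
            issued = min(remaining[i], ilp, free)
            remaining[i] -= issued
            free -= issued
        cycles += 1
    return cycles


# Two threads that can each keep only one of the two units busy.
narrow = [(100, 1), (100, 1)]
# Two threads that can each saturate both units on their own.
wide = [(100, 2), (100, 2)]

print(cycles_one_at_a_time(narrow), cycles_smt(narrow))  # 200 100
print(cycles_one_at_a_time(wide), cycles_smt(wide))      # 100 100
```

In this model the second hyperthread helps only when the first one leaves units idle; once a single thread can fill both units, sharing them buys nothing.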

solarexplorer 4676 days ago

Yes, hyperthreading (aka SMT), as implemented in Intel's processors, can execute instructions from several threads in the same clock cycle. Other processors, like Sun's Niagara, switch threads only on certain events like cache misses (this is known as SoEMT). Workloads with a lot of cache misses are where both really shine.

Of course it's hard to write about a complex topic, choose the right details, and make it all seem simple. Thumbs up for trying!
