| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by sspiff 3084 days ago
	How does this work for two instructions in the pipeline at the same time that refer to the same cache line? If the second instruction executes the read phase before the first is retired/committed to cache, you would be hit by two memory fetch latencies, significantly hurting performance. I guess compilers could pad that out with noops to postpone the read until the previous commit is done if they know the design of the pipeline they are targetting. But generically optimized code would take a terrible hit from this.

1 comments

thisoneforwork 3084 days ago

Firstly, thanks for the question. As mentioned, not a CPU designer or trying to teach Intel what to do. More like relying on the hive mind to see if I have the right idea.

A second instruction in the pipeline would read from the above mentioned L0 cache (let us call it load buffer), much like it would for tentative memory stores from the store buffer.

Also, two memory fetches in parallel are not twice as long as a memory fetch, if that would be the solution (which I guess would not be the case, as I imagine race conditions appearing)

link

foota 3084 days ago

I don't think you can allow two speculatively executing instructions to read from the same L0 cache.

For example say the memory address you want to look for being cached is either 0x100 or 0x200 (not realistic addresses but it works for example) based on some kernel memory bit. Then run instructions in userspace that try to fetch 0x100 (with flushes in between). If you notice one that completes quickly, then it must have used the value 0x100 cached in L0 cache by the kernel? (and also run over 0x200 to try and check when it's cached in L0)

link

nialv7 3084 days ago

L0 is only used by speculatively executed uOPs, before they are actually committed. Therefore anything that reads from L0 has to be speculatively executed too.

So if the uOP populated the L0 was reading from kernel memory, then it won't be committed. And subsequent uOP read from the L0 won't be committed either. So you can't get timing information from them.

link

foota 3083 days ago

But if another instruction reads from the same cache then that one could retire.

link