by pcordes
3685 days ago
That's what I found when writing my answer on SO. The only justification I could come up with for using memory was to generate all the random numbers first, then use them later. I did include several potential microarchitectural slowdowns, like causing a store-forwarding stall by modifying just the high byte of a `double` with an XOR to flip the sign bit. That isn't much of a stretch, but it will hurt a lot. Multi-threading with a shared atomic loop counter is also really good, and maybe worse than you'd naively expect (i.e. easy to justify re: diabolically incompetent).
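
For anyone who wants to see what those two pessimizations look like, here is a rough C++ sketch. It is not the code from the SO answer; the function names `negate_via_high_byte` and `sum_with_shared_counter` are made up for illustration. It assumes a little-endian target with IEEE 754 `double`, and the `volatile` byte access is only there to discourage the compiler from turning the XOR back into a register operation, so check the generated asm before trusting it to actually produce the narrow-store / wide-reload pattern.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <limits>
#include <thread>
#include <vector>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "sketch assumes IEEE 754 binary64");

// Flip the sign by XORing only the most significant byte in memory.
// Assumption: little-endian layout, so the sign bit is bit 7 of the last byte.
// A 1-byte store followed by an 8-byte reload of the same object is the
// pattern that can defeat store forwarding on many x86 CPUs.
double negate_via_high_byte(double x) {
    volatile unsigned char* bytes =
        reinterpret_cast<volatile unsigned char*>(&x);
    bytes[sizeof(double) - 1] ^= 0x80;  // narrow read-modify-write
    return x;                           // wide reload of the whole double
}

// Sum 0..n-1 with every thread claiming one iteration at a time from a
// shared atomic counter, and accumulating into a shared atomic total.
// Both cache lines bounce between cores on every iteration.
std::uint64_t sum_with_shared_counter(std::uint64_t n, unsigned num_threads) {
    assert(num_threads > 0);
    std::atomic<std::uint64_t> next{0};
    std::atomic<std::uint64_t> total{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (;;) {
                std::uint64_t i = next.fetch_add(1);  // contended atomic RMW
                if (i >= n) break;
                total.fetch_add(i);
            }
        });
    }
    for (auto& w : workers) w.join();
    return total.load();
}
```

The sane versions would be `-x` (typically a single XOR on a register) and giving each thread its own contiguous range with a private accumulator, which is what makes the above easy to justify as "incompetent" rather than obviously sabotaged.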