by stingraycharles
1644 days ago
Even if memory bandwidth is the bottleneck (which I’m not sure about), optimizations still help. Modern CPUs overlap execution with outstanding memory accesses: while one piece of code is stalled waiting for a read, the core can keep doing other work, either other independent instructions or, with SMT, another thread. So if this base64 implementation is more efficient, then even when the “wall clock” time is exactly identical because of memory bandwidth, the CPU has more time to perform _other_ tasks.

It would be cool to see a demo of a processor maintaining good IPC on one hyperthread while the other hyperthread on the same core runs a base64 decode.