by jules
1261 days ago
It's not clear that even a super smart compiler can do this. The best schedule depends on instruction latencies, and a compiler can't know statically whether a particular memory load will hit in L1, L2, L3, or DRAM, because that can vary between executions of the same load instruction.
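To make the point concrete, here is a rough sketch of a toy in-order, single-issue core with non-blocking loads. Everything in it is an assumption for illustration: the `run` and `chain` helpers, the register names, and the hit/miss latencies of 4 and 200 cycles are made up and do not model any real CPU. It only shows that which of two static orderings finishes first flips depending on which load happens to miss.

```python
def run(schedule, load_latency):
    """Cycles to finish `schedule` on a toy in-order, single-issue core.

    Each instruction is (dest, sources). Registers listed in
    `load_latency` are loads with that latency; every other result is
    ready one cycle after issue.
    """
    ready = {}  # register -> cycle at which its value is available
    cycle = 0
    for dest, srcs in schedule:
        # In-order issue: stall until every source operand is ready.
        issue = max([cycle] + [ready[s] for s in srcs])
        ready[dest] = issue + load_latency.get(dest, 1)
        cycle = issue + 1
    return max(ready.values())


def chain(reg, n):
    """A chain of n dependent one-cycle ops that starts from `reg`."""
    ops, prev = [], reg
    for i in range(n):
        cur = f"{reg}{i}"
        ops.append((cur, [prev]))
        prev = cur
    return ops


HIT, MISS = 4, 200  # assumed latencies, purely illustrative

loads = [("a", []), ("b", [])]
a_first = loads + chain("a", 20) + chain("b", 20)  # consume a, then b
b_first = loads + chain("b", 20) + chain("a", 20)  # consume b, then a

a_miss = {"a": MISS, "b": HIT}  # this run: load a goes to DRAM
b_miss = {"a": HIT, "b": MISS}  # another run: load b goes to DRAM

for name, lat in (("a misses", a_miss), ("b misses", b_miss)):
    print(name, "a_first:", run(a_first, lat), "b_first:", run(b_first, lat))
```

In this model, the ordering that consumes the hitting load first wins, so neither `a_first` nor `b_first` is best for every execution. A static schedule has to commit to one of them up front.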
[1] https://doi.org/10.1145/2451116.2451143