by nickysielicki
438 days ago
I don’t mean to call out you or your pseudocode specifically, but I see this sort of thing all the time, so here is a PSA: if you ever see code trying to measure GPU timing and it’s not using the CUDA event APIs, it’s fundamentally wrong and is lying to you. The simplest way to be sure you’re not measuring noise is to ban every other timing source. Definitely don’t add unnecessary syncs just so that you can add a timing tap. https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART_...
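For reference, a minimal sketch of what event-based timing can look like with the CUDA runtime API. The kernel `scale` and the helper `time_scale_ms` are hypothetical names made up for illustration, and the default stream is assumed; swap in whatever work you actually want to time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in workload; any kernel or library call enqueued
// on the same stream could be timed the same way.
__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Returns the GPU-side elapsed time in milliseconds for one launch,
// or -1.0f if allocation or the timing query fails.
float time_scale_ms(int n) {
    float *d_x = nullptr;
    if (cudaMalloc(&d_x, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(d_x, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Untimed warm-up launch so one-time setup cost is not measured.
    scale<<<blocks, threads>>>(d_x, 2.0f, n);

    // Events are enqueued on the stream (default stream here), so they
    // are timestamped on the device, in order with the kernel.
    cudaEventRecord(start);
    scale<<<blocks, threads>>>(d_x, 2.0f, n);
    cudaEventRecord(stop);

    // Wait only for the stop event, not for the whole device.
    cudaEventSynchronize(stop);

    float ms = -1.0f;
    if (cudaEventElapsedTime(&ms, start, stop) != cudaSuccess) ms = -1.0f;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_x);
    return ms;
}
```

Calling `time_scale_ms(1 << 20)` from `main` and printing the result is enough to try it. The host never reads a wall clock here; the only sync is on the `stop` event, and it happens after the timed region has already been enqueued.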
I don’t care which part of the CUDA ecosystem is taking the time (from my point of view it is a black box that does GEMMs), so why not measure “time until my normal code is running again”?