by roanakb 661 days ago
Agreed, roofline plots would be quite powerful in this context. From a quick search, it seems like the only way to create a roofline plot for your model is to use Nsight [1]? I'd be interested to know if there are any simpler tools, since one of the big benefits of SM efficiency is how easy the metric is to access.

[1]: https://www.nvidia.com/en-us/on-demand/session/gtcspring21-s...

1 comment

Depending on the size of your application, you can calculate the FLOPs by hand.

https://docs.nersc.gov/tools/performance/roofline/
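
For a sense of what "by hand" could look like, here is a minimal Python sketch for a single dense matmul. It is only an illustration: the function names are made up for this example, the byte count assumes each matrix moves to or from memory exactly once (ideal caching), and the peak compute and bandwidth numbers are placeholders rather than specs for any real GPU.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """FLOPs for C[m,n] = A[m,k] @ B[k,n]: one multiply and one add per inner step."""
    return 2 * m * n * k


def matmul_bytes(m: int, n: int, k: int, bytes_per_elem: int = 4) -> int:
    """Bytes moved, assuming A and B are read once and C is written once."""
    return bytes_per_elem * (m * k + k * n + m * n)


def arithmetic_intensity(m: int, n: int, k: int, bytes_per_elem: int = 4) -> float:
    """FLOPs per byte of memory traffic (the x-axis of a roofline plot)."""
    return matmul_flops(m, n, k) / matmul_bytes(m, n, k, bytes_per_elem)


def attainable_flops(intensity: float, peak_flops: float, peak_bandwidth: float) -> float:
    """Roofline bound: the lower of the compute roof and the bandwidth slope."""
    return min(peak_flops, intensity * peak_bandwidth)


if __name__ == "__main__":
    # Placeholder hardware numbers (assumptions, not a real device):
    PEAK_FLOPS = 10e12       # 10 TFLOP/s
    PEAK_BANDWIDTH = 500e9   # 500 GB/s

    ai = arithmetic_intensity(1024, 1024, 1024)  # fp32, square matmul
    bound = attainable_flops(ai, PEAK_FLOPS, PEAK_BANDWIDTH)
    ridge = PEAK_FLOPS / PEAK_BANDWIDTH  # intensity where the two roofs meet

    print(f"arithmetic intensity: {ai:.2f} FLOP/byte")
    print(f"ridge point:          {ridge:.2f} FLOP/byte")
    print(f"attainable:           {bound / 1e12:.2f} TFLOP/s")
    print("compute-bound" if ai >= ridge else "memory-bound")
```

Summing these per-layer counts gives you a point to place under the roofline; you would still need a measured runtime to see how close the real kernel gets to that bound.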