by antognini
659 days ago
When you're trying to understand the performance of your model, it's very helpful to look at a roofline plot [1]. The roofline plot shows the floating-point performance of each op in your model as a function of its arithmetic intensity (FLOPs performed per byte of memory traffic). The plot has two regimes: a memory-bound regime on the left and a compute-bound regime on the right. This helps you identify memory-bound ops that are taking a significant fraction of the total run time.

[1]: https://en.wikipedia.org/wiki/Roofline_model
[1]: https://www.nvidia.com/en-us/on-demand/session/gtcspring21-s...
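
To make the two regimes concrete, here is a minimal sketch of the roofline calculation in Python. The hardware numbers and the op list are made up for illustration, and the helper names (`attainable_flops`, `classify`) are hypothetical; in practice you would take FLOP and byte counts from a profiler.

```python
def attainable_flops(intensity, peak_flops, mem_bandwidth):
    """Roofline ceiling in FLOP/s for an op with the given
    arithmetic intensity (FLOP per byte of memory traffic)."""
    return min(peak_flops, mem_bandwidth * intensity)


def classify(intensity, peak_flops, mem_bandwidth):
    """Return which side of the ridge point an op falls on."""
    ridge = peak_flops / mem_bandwidth  # intensity where the two roofs meet
    return "memory-bound" if intensity < ridge else "compute-bound"


# Assumed (hypothetical) accelerator: 100 TFLOP/s peak, 1 TB/s memory bandwidth.
PEAK_FLOPS = 100e12
MEM_BANDWIDTH = 1e12

# Assumed (hypothetical) ops: (name, FLOPs performed, bytes moved).
ops = [
    ("elementwise_add", 1e9, 12e9),
    ("matmul", 2e12, 6e9),
]

for name, flops, nbytes in ops:
    intensity = flops / nbytes
    ceiling = attainable_flops(intensity, PEAK_FLOPS, MEM_BANDWIDTH)
    regime = classify(intensity, PEAK_FLOPS, MEM_BANDWIDTH)
    print(f"{name}: intensity={intensity:.3g} FLOP/B, "
          f"ceiling={ceiling:.3g} FLOP/s, {regime}")
```

An op whose measured throughput sits well below its ceiling has room for optimization. A memory-bound op sitting close to its ceiling usually only gets faster by moving fewer bytes, for example by fusing it with its neighbors.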