Hacker News
by asaiacai 387 days ago
MFU (model FLOPs utilization) is probably the best metric, but computing it requires application-level logic. At the infra level you can export metrics like SM efficiency instead. We explain a bit about how we used it to do some optimization:

https://www.trainy.ai/blog/gpu-utilization-misleading
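
To make the "requires application logic" part concrete, here's a rough sketch of an MFU calculation. It assumes a dense transformer and the common approximation of ~6 FLOPs per parameter per token (forward plus backward, attention FLOPs ignored). The function name and the example numbers are hypothetical and not taken from the linked post. The peak figure is the A100's dense BF16 spec, so swap in your own hardware's number.

```python
def mfu(n_params: float, tokens_per_sec: float,
        peak_flops_per_gpu: float, n_gpus: int = 1) -> float:
    """Model FLOPs utilization: achieved model FLOPs/s over peak hardware FLOPs/s.

    Assumption: ~6 FLOPs per parameter per token for a training step
    (2 forward + 4 backward), ignoring attention FLOPs.
    """
    achieved_flops_per_sec = 6.0 * n_params * tokens_per_sec
    peak_flops_per_sec = peak_flops_per_gpu * n_gpus
    return achieved_flops_per_sec / peak_flops_per_sec


# Hypothetical run: 7B-parameter model, 20k tokens/s total across 8 GPUs,
# each with a 312 TFLOP/s dense BF16 peak (A100 spec sheet value).
example = mfu(n_params=7e9, tokens_per_sec=20_000,
              peak_flops_per_gpu=312e12, n_gpus=8)
print(f"MFU: {example:.1%}")
```

The only inputs the training loop has to supply are parameter count and measured token throughput, which is why this can't be scraped purely from the infra side the way SM efficiency can.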