by anvuong 230 days ago
You mean autotune? I think 10 minutes is pretty normal. torch.compile(model, mode='max-autotune') can be much slower than that for large models.

Add to that, it only needs to be done once, by the developers before distribution, for the major hardware targets. The tuned configs are saved, and the matching one is then selected on the client side.
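
A rough sketch of what that client-side selection could look like, assuming the tuned configs ship as a table keyed by device name. The table contents, the parameter names (block_m, block_n, num_warps) and select_config itself are all made up for illustration. They are not taken from any real autotuner's output format.

```python
import json

# Hypothetical tuning results shipped with the app, keyed by device name.
# The numbers are placeholders, not measured values.
PRETUNED = json.loads("""
{
  "NVIDIA A100": {"block_m": 128, "block_n": 256, "num_warps": 8},
  "NVIDIA RTX 4090": {"block_m": 128, "block_n": 128, "num_warps": 4}
}
""")

# Conservative fallback for hardware the developers did not tune for.
DEFAULT = {"block_m": 64, "block_n": 64, "num_warps": 4}


def select_config(device_name, table=PRETUNED, default=DEFAULT):
    """Return the shipped config whose key appears in device_name, else default."""
    name = device_name.lower()
    for key, cfg in table.items():
        if key.lower() in name:
            return cfg
    return default
```

On the client you would pass in whatever name the runtime reports for the device, e.g. select_config("NVIDIA A100-SXM4-40GB"), and fall back to the default (or to tuning locally) when nothing matches.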