Hacker News
by eurekin 877 days ago
I don't remember exactly which it was (either CUDA directly or the cuDNN version used by FlashAttention)... Anyway, /r/LocalLLaMA has a few instances of such builds. It might be really worthwhile to look those up before buying.