Hacker News
by ramesh1994 1142 days ago
I've been looking for a course like this! Especially great given how much of the recent progress in training large models has been made possible by flash attention and fused kernels.