Hacker News new | ask | show | jobs
by zanussbaum 590 days ago
this was a huge inspiration for the post! i tried to highlight it in the blog but it might have gotten buried

there are a few things that i wasn't able to figure out how to get access to/i wasn't sure if they were possible. for example, a lot of Simon's article takes advantage of the warp scheduler and warp tiling.

i had a hard time finding information on if that's even possible with my M2/metal and the general memory access patterns. it seems like CUDA does have better documentation in this regard