Hacker News
Implementing a Fast Tensor Core Matmul on the Ada Architecture (spatters.ca)
2 points by skidrow 335 days ago
1 comment

This is incredibly useful. Thanks for making the kernels public.

I'm curious whether anyone has tried generalizing this to batched matmuls or to sparse inputs on Ada.