Hacker News
by deepnotderp, 3037 days ago
You use the loop-based GEMM kernel and pass the input size in as the loop bounds.
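To make that concrete, here is a minimal sketch of what a loop-based GEMM with size-driven loop bounds could look like. It is not the kernel being discussed, just plain Python over nested lists; the function name `gemm_loops` and the parameters `m`, `n`, `k` are made up for illustration.

```python
def gemm_loops(a, b, m, n, k):
    """Compute C = A @ B, with A of shape m x k and B of shape k x n.

    Hypothetical sketch: the same three loops serve every input size,
    because the sizes only show up as the loop bounds.
    """
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):          # rows of A / C
        for j in range(n):      # columns of B / C
            acc = 0.0
            for p in range(k):  # shared (reduction) dimension
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


result = gemm_loops([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2, 2, 2)
```

The point is only that nothing in the kernel body depends on the size, so one kernel covers all shapes.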
grandmczeb, 3037 days ago
L can be as small as 1 or larger than 512. For small L it makes sense to apply different optimizations than for large L. A loop-based GEMM doesn't help with that.
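A rough sketch of what "different optimizations for different L" might mean, assuming L is the row count of the left-hand matrix (the thread doesn't say which dimension it is). The names `row_times_matrix`, `gemm_tiled`, and `gemm_dispatch`, and the tile size of 32, are all hypothetical; real kernels would pick code paths and thresholds by benchmarking.

```python
def row_times_matrix(x, b):
    """L == 1 path: a single row times a matrix, with no tiling overhead."""
    n = len(b[0])
    out = [0.0] * n
    for p, xp in enumerate(x):
        row = b[p]
        for j in range(n):
            out[j] += xp * row[j]
    return out


def gemm_tiled(a, b, tile=32):
    """Large-L path: blocked loops so each tile is reused while it is hot."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for p0 in range(0, k, tile):
            for j0 in range(0, n, tile):
                for i in range(i0, min(i0 + tile, m)):
                    for p in range(p0, min(p0 + tile, k)):
                        aip = a[i][p]
                        for j in range(j0, min(j0 + tile, n)):
                            c[i][j] += aip * b[p][j]
    return c


def gemm_dispatch(a, b):
    """Pick a code path from L (assumed here to be the number of rows of a)."""
    if len(a) == 1:
        return [row_times_matrix(a[0], b)]
    return gemm_tiled(a, b)
```

Both paths compute the same product; the difference is in which one is chosen for a given L, which a single generic loop nest cannot express on its own.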