by janwas
572 days ago
Max performance is a stretch: recompilation would not utilize tensor cores, right? And "too hard for mainstream programmers" seems overly pessimistic. I've run several workshops where devs wrote dot-product kernels using Highway after a 30-minute introduction.