by jklontz
2020 days ago
> For example, NN-512 can exceed 48 effective FMADDs per cycle (on the 32 peak FMADD machine) with Winograd-Cook-Toom-Lavin, if the tensor is deep enough (enough channels)

Roughly how many channels do you need for this approach to be worthwhile?
It depends on the cache size, but as a rough guide you can think of it as about 512 channels in and 512 channels out.
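
To make the channel-depth point concrete, here is a rough back-of-the-envelope sketch, not NN-512's actual cost model. It uses the standard multiply counts for a 2D Winograd F(m x m, r x r) tile and a deliberately crude amortization estimate. The function names and the `ops_per_transform` figure are hypothetical, the thread does not say which tile sizes NN-512 uses, and the cache effects mentioned in the reply are not modeled at all.

```python
def winograd_multiply_reduction(m: int, r: int) -> float:
    """Direct-convolution multiplies per Winograd multiply for F(m x m, r x r)."""
    direct = (m * r) ** 2        # m*m outputs, each needing r*r multiplies
    winograd = (m + r - 1) ** 2  # one multiply per element of the transformed tile
    return direct / winograd


def transform_overhead(c_in: int, c_out: int, tile_elems: int,
                       ops_per_transform: int) -> float:
    """Crude ratio of transform work to multiply-stage work for one tile.

    Assumption: each input channel and each output channel pays a fixed
    `ops_per_transform` (a hypothetical placeholder cost) per tile, while
    the multiply stage does c_in * c_out * tile_elems FMADDs.
    """
    transform_ops = (c_in + c_out) * ops_per_transform
    multiply_ops = c_in * c_out * tile_elems
    return transform_ops / multiply_ops


if __name__ == "__main__":
    # F(2x2, 3x3) uses a 4x4 transformed tile.
    print("multiply reduction:", winograd_multiply_reduction(2, 3))
    for channels in (16, 64, 512):
        ratio = transform_overhead(channels, channels, tile_elems=16,
                                   ops_per_transform=32)
        print(f"{channels:4d} channels: transform overhead ratio {ratio:.4f}")
```

Under these assumptions the transform overhead shrinks roughly as 1/channels when the input and output depths are equal, which is one reason the effective FMADD rate can only beat the machine peak once the tensor is deep enough.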