by xphos
485 days ago
Thank you for that catch. I still think we are comparing ASIC matmul hardware to non-ASIC FFT hardware. The given TPU hardware does 256x256 matrix multiplication in linear time by using a 256x256 grid of multipliers. An FFT ASIC could likely do the same thing, but would be able to handle a much larger N before memory becomes the bottleneck.
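To make the "linear time" point concrete, here is a rough sketch of how an N x N grid of multiply-accumulate cells can finish an N x N matmul in about 3N cycles. It assumes a simple output-stationary systolic model with skewed inputs, which is only an illustration and not necessarily the dataflow the real TPU uses; `systolic_matmul` is a made-up name.

```python
def systolic_matmul(a, b):
    """Simulate an N x N output-stationary systolic grid (assumed model).

    Cell (i, j) accumulates c[i][j]. Because inputs are skewed, the pair
    a[i][k], b[k][j] reaches cell (i, j) at cycle t = i + j + k.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    cycles = 3 * n - 2  # last pair arrives at t = 3 * (n - 1)
    for t in range(cycles):
        # In hardware every cell works in parallel within one cycle;
        # the two inner loops only stand in for that parallelism.
        for i in range(n):
            for j in range(n):
                k = t - i - j
                if 0 <= k < n:
                    c[i][j] += a[i][k] * b[k][j]
    return c, cycles


if __name__ == "__main__":
    product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(product, cycles)
```

The cycle count grows linearly in N only because the number of cells grows as N^2, so the total work is still N^3 multiply-accumulates.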