|
|
|
|
|
by germanjoey
394 days ago
|
|
TBH, the 2x-4x improvement over a naive implementation that they're bragging about sounded kinda pathetic to me! I mean, it depends greatly on the kernel itself and the target arch, but I'm also assuming that the 2x-4x number is their best case scenario. Whereas the best case for hand-optimized could be in the tens or even hundreds of X. |
|
Fwiw, they did say they got up to 20x improvement but given the issues we both mention this may not be surprising given that this seems to be an outlier by their own omission.