by tehsauce
1277 days ago
“There has also been a wide variety of accuracy-degrading performance optimizations like Xformers and Flash Attention, which are great tools if you are open to trading accuracy for performance”

This is incorrect. Those optimizations perform mathematically identical computations; they just use the GPU's memory bandwidth more effectively. So there is no accuracy tradeoff there.
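To illustrate the point, here is a toy sketch of the general idea (a running-max, running-sum softmax computed block by block) next to the textbook version. This is not the actual xformers or Flash Attention kernel; `naive_attention` and `tiled_attention` are hypothetical names made up for this example, and it handles a single query vector in pure Python. The two should agree up to floating-point rounding, since only the order of the arithmetic changes.

```python
import math


def naive_attention(q, keys, values):
    """Reference: compute every score first, then softmax, then the weighted sum."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def tiled_attention(q, keys, values, block_size=2):
    """Same math, but keys/values are consumed one block at a time.

    Only a running max, a running normalizer, and a running output
    accumulator are kept, so the full score vector is never stored.
    """
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old max (0.0 on the first block).
        correction = math.exp(running_max - new_max)
        probs = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * correction + sum(probs)
        acc = [a * correction + sum(p * v[j] for p, v in zip(probs, v_blk))
               for j, a in enumerate(acc)]
        running_max = new_max
    return [a / running_sum for a in acc]


if __name__ == "__main__":
    q = [0.1, -0.3, 0.7]
    keys = [[0.2, 0.1, -0.5], [1.0, 0.0, 0.3], [-0.4, 0.9, 0.2], [0.6, -0.7, 0.8]]
    values = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]]
    ref = naive_attention(q, keys, values)
    out = tiled_attention(q, keys, values, block_size=2)
    print(max(abs(a - b) for a, b in zip(ref, out)))  # rounding-level difference
```

The blockwise version never holds more than `block_size` scores at once, which is where the memory-bandwidth savings come from on real hardware; the result is the same softmax-weighted sum.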
That said, we (the Nod.ai team) will add support for xformers soon, so you can opt in to it anyway.