by azeirah
1144 days ago
I'm following the discussions on GitHub as well as their PRs closely. The primary bottleneck for now is compute. They recently made a big performance improvement by introducing partial GPU acceleration, which you get if you compile with a GPU-accelerated variant of BLAS: either cuBLAS (Nvidia) or CLBlast (slightly slower, but supports almost everything: Nvidia, Apple, AMD, mobile, Raspberry Pi, etc.).