by bigyabai
18 days ago
The M1 Max has an unusably slow GPU for inference. TTFT on real-world contexts can be over 10 minutes.

> Nothing new here, apart from being able to use CUDA on a less power hungry system.

CUDA has been running on ARM SoCs since the Tegra K1, 12 years ago. Nvidia is not new to ARM, nor is CUDA.