by m00x
559 days ago
This is the most script-kiddie comment I've seen in a while. llama.cpp only does inference, not training, and its CUDA backend is still the fastest one by far. No one is even close to matching CUDA on either training or inference. The closest is AMD with ROCm, but there's likely a decade of work left before it's competitive.
Keep NVIDIA for training and Intel/AMD/Cerebras/… for inference.