by ijk
372 days ago
For LLM inference, running multiple GPUs in parallel is mostly fine: you take some performance hit, but llama.cpp doesn't care what cards you use, and other tools handle 4 symmetric GPUs just fine. You get more problems when you're doing anything training-related, though.