by YetAnotherNick
281 days ago
> I'd much prefer paying 3x cost for 3x VRAM
Why not just buy 3 cards then? These cards don't require active cooling anyway, and you can fit 3 in a decent-sized case. You will get 3x VRAM speed and 3x compute. And if your use case is LLM inference, it will be a lot faster than 1 card with 3x VRAM.

> 3x VRAM speed and 3x compute
LLM scaling doesn’t work this way. If you have 4 cards, you may get a 2x performance increase if you use vLLM. But you’ll also need enough VRAM to run FP8. 3 cards would only run at 1x performance.
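
One likely reason odd card counts scale badly, sketched under an assumption the comment doesn't spell out: tensor parallelism (what vLLM's `tensor_parallel_size` controls) splits attention heads evenly across GPUs, so the GPU count has to divide the head count. The helper name and the 32-head model below are hypothetical, picked only for illustration.

```python
def usable_tensor_parallel_sizes(num_attention_heads: int, num_gpus: int) -> list[int]:
    """List the GPU counts (up to num_gpus) that divide the head count evenly.

    Assumption: heads are split evenly across GPUs, so any count that
    does not divide num_attention_heads cannot be used for tensor parallelism.
    """
    return [n for n in range(1, num_gpus + 1) if num_attention_heads % n == 0]


# Hypothetical model with 32 attention heads.
for gpus in (3, 4):
    sizes = usable_tensor_parallel_sizes(32, gpus)
    print(f"{gpus} GPUs -> usable tensor-parallel sizes: {sizes}")
```

Under that assumption, a third card can't join an even split of 32 heads, while a fourth can. This says nothing about the actual speedup, which also depends on interconnect bandwidth and the model.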