by mrajcok
434 days ago
Yes - I'm able to run Llama 3.1 405B on 3x A6000 + 3x 4090. I'll have Llama 4 Maverick running in 4-bit quantization (which typically results in only minor quality degradation) once llama.cpp support is merged. Total hardware cost is well under $50,000. The 2T Behemoth model is tougher, but enough Blackwell 6000 Pro cards (16) should be able to run it for under $200k.
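For a rough sanity check of the numbers above, here is a back-of-the-envelope sketch of weight size versus total VRAM. The per-card VRAM figures (48 GB for the A6000, 24 GB for the 4090, 96 GB for the Blackwell 6000 Pro) and the flat 4 bits per weight are assumptions, not measurements. Real 4-bit formats usually spend somewhat more than 4 bits per weight, and KV cache and activations are ignored, so treat the headroom as optimistic. `weights_gb` and `total_vram_gb` are hypothetical helper names.

```python
# Back-of-the-envelope check: do the quantized weights fit in total VRAM?
# Assumed VRAM per card (GB): A6000 = 48, RTX 4090 = 24, Blackwell 6000 Pro = 96.
# Assumes a flat bits-per-weight figure; ignores KV cache and activations.

def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8

def total_vram_gb(cards: dict) -> int:
    """cards maps VRAM per card (GB) to the number of such cards."""
    return sum(gb * count for gb, count in cards.items())

current_rig = total_vram_gb({48: 3, 24: 3})  # 3x A6000 + 3x 4090
behemoth_rig = total_vram_gb({96: 16})       # 16x Blackwell 6000 Pro

print(current_rig, weights_gb(405, 4))    # VRAM vs. Llama 3.1 405B at 4 bits
print(behemoth_rig, weights_gb(2000, 4))  # VRAM vs. a 2T model at 4 bits
```

Under these assumptions the current rig has 216 GB against roughly 202.5 GB of 405B weights, which is tight, and 16 Blackwell cards give 1536 GB against roughly 1000 GB for a 2T model.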