by sgt
470 days ago
I'm trying to find that out as well, since I'm considering a local LLM for some heavy prototyping. I don't mind which HW I buy, but I'm on a fairly tight budget, and energy efficiency wouldn't hurt either. It seems the Ultra can do 40 tokens/sec on DeepSeek, and nothing else comes close at that price point.
The OG DeepSeek models are hundreds of GB even when quantized, so nobody is using RTX GPUs to run them anyway…