by pseudosavant
8 days ago
Models this small and this capable bode really well for the usefulness of a PC like the RTX Spark that Nvidia/Microsoft announced this week. 128GB of unified memory will likely be more than sufficient for effective local agentic coding, even if SOTA cloud models will still be better. Up until this point, I've found the cost/value to unequivocally favor a cloud subscription, but I would be lying if I said I didn't worry that one day OpenAI is going to increase the price of my subscription by 5-10x. I rely on these tools enough that if there is a truly viable local option, I'm going to take it.
Not really. There's a reason the announcement didn't include ANY benchmarks (!) and didn't state the exact memory bandwidth. It's going to be dog-slow and unusable for large models, since tok/sec during generation is basically memory bandwidth divided by the size of the active weights. The rumoured 300GB/s / 30GB of active weights (a decent model) = 10 tokens per second, which is really slow.
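
To make that back-of-envelope math explicit, here's a rough sketch in Python. It assumes generation is purely memory-bandwidth bound (every active weight is read once per token) and ignores KV cache reads, compute limits, and batching, so treat the result as an optimistic ceiling. The function name is made up for illustration, and the 300GB/s figure is the rumour above, not a confirmed spec.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Rough upper bound on generation speed for a bandwidth-bound model.

    Assumption: each generated token requires reading all active weights
    from memory once, so tok/sec ~= bandwidth / active weight size.
    """
    if bandwidth_gb_s <= 0 or active_weights_gb <= 0:
        raise ValueError("bandwidth and active weight size must be positive")
    return bandwidth_gb_s / active_weights_gb


if __name__ == "__main__":
    # Rumoured (unconfirmed) bandwidth and the 30GB active-weight example above.
    rumoured_bandwidth_gb_s = 300.0
    active_weights_gb = 30.0
    rate = decode_tokens_per_second(rumoured_bandwidth_gb_s, active_weights_gb)
    print(f"{rate:.1f} tok/sec")
```

The same formula shows why fewer active weights help: halving the active weight size doubles the estimated tok/sec at the same bandwidth.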