Hacker News
by Etheryte 241 days ago
You can already do that; how slow or fast you go just depends on how much you're ready to pay for memory. It's a $1200 premium to go from 36GB to 128GB of unified memory, and that cost is hard to justify unless you really need it or someone else is paying.

None is comparable to the GPT-5 or Sonnet 4.5 experience.
Frankly, right now I am way more satisfied with qwen-3-coder-420 using Cerebras inference than with those more powerful models.

Inference speed and fast feedback matter a lot more than perfect generation to me.

Yet.