by pxc
328 days ago
This feels way less annoying to use than ChatGPT. But I wonder how much of the effect is lost when the tool does many of the things that make models like o3 useful (repeated web searches, running code in a sandbox, etc.). For code generation, this does seem pretty useful with something like Qwen3-Coder-480B, if that generates good enough code for your purposes. But for chat, I wonder: does this kind of speed call for models that behave pretty differently from current ones? With virtually instant responses, I sometimes find myself wanting much shorter answers. Maybe a model whose design and training focus on concision, and on contexts with lots and lots of turns, would be a uniquely useful option on this kind of hardware. But I guess the hardware is really for training, right, and the inference-as-a-service stuff is basically a powerful form of marketing?