by mrinterweb
80 days ago
I think two recent advances strengthen your point. The new Qwen 3.5 series has shown relatively high intelligence density, and Google's new TurboQuant could result in dramatically smaller, more efficient models without the usual quantization accuracy tradeoff. I would expect consumer inference ASICs to emerge once model development starts to plateau and "baking" a highly capable, dense model into a chip makes economic sense.

I could be wrong, since I'm not following this too closely, but the open-weights future of both Llama and Qwen looks tenuous to me. Yes, there are others, but I don't understand the business model.