by dcchambers
2 hours ago
I think hardware like this is the future for LLM providers once we reach a point where the models aren't advancing much anymore. You could argue we're close now. The hyperscalers like AWS will make great use of these to serve up models that will be relevant for several years. But right now, we're still seeing significant bumps in model quality every couple of months, especially with open-weight models like DeepSeek/Kimi/GLM. Until models plateau, though, I don't see how this is ever going to be cost-effective vs. general-purpose hardware. I also think we'll see miniature versions of this baked into mobile hardware for super fast and efficient on-device LLMs.
|
1. If LLMs keep improving, models burned onto silicon become obsolete too fast, so it is not worth doing. Outcome: We keep getting better LLMs.
2. If LLM improvements slow down, models will be burned onto silicon. Outcome: We get faster, cheaper, and more energy-efficient LLMs.
Either outcome sounds great to me. It will certainly be a mix, so we may even get both.