Hacker News
by gpm 2 hours ago
Yes, but with current architectures, world knowledge is baked into the weights. We might stop figuring out how to make models better, but the world keeps changing, science is going to keep making progress in understanding the world, etc. This sets a significant minimum rate at which the weights have to change, and as a result I'm pretty skeptical that it's worth baking weights into silicon.
3 comments

I think it would just be an opportunity to sell another chip a few years down the line. If the utility curve for model performance flattens out, I can see a future where you buy an up-to-date chip every few years to upgrade to the latest and greatest, while providing up-to-date context as part of the user input. Like if I have a programming task and I supply a copy of up-to-date documentation alongside my input, I would think that I could still get good output out of a dated model.
That's why we have reasoning/CoT LLMs that can use tools to get updated information.
I mean it just depends on the price of the chip. You might just replace the chip like you would any other component. Like a video game cartridge or something.