

by TeMPOraL 589 days ago

Yes, but!

Exponential pace of progress isn't usually just one thing; if you zoom in, any particular thing may plateau, but its impact compounds by enabling the growth of successors, variations, and related inventions. Nor is it a smooth curve if you look closely. I feel statements like "a 405b model is not 5 times better than a 70b model" are zooming in on a specific class of models so much that you can see the individual pixels. There's plenty of open and promising research on tweaking the current architecture in training or inference (see e.g. the other thread from yesterday [0]), on top of changes to architecture, methodology, and methods of controlling or running inference on existing models by lobotomizing them or grafting networks onto networks, etc. The field is burning hot right now; we're counting the gaps between incremental improvements and interesting research directions in weeks. The overall exponent of "language model" power may well continue when you zoom out a little further.

[0] - https://news.ycombinator.com/item?id=42093112