Hacker News
by mirekrusin 848 days ago
There are many dimensions where improvements are happening: speed increases, size reduction, precision, context length, use of external computation (function calling), use of formal systems, hybrid setups, multi-modality, etc. If you look at the short history of what's happening, we're not seeing improvements below 50% over those relatively short periods of time. We had GPT-1 just five and a half years ago. We now have open-weight models that are orders of magnitude better. We know we're feeding models tons of redundant and low-quality input, and we know synthetic data can improve results and lower training cost dramatically. We know we're not near anything optimal. We'll see orders-of-magnitude size reductions in the coming years, etc. Humans don't represent any kind of intelligence ceiling. It can be surpassed, and if it can be surpassed, and we know humans alone produce well above 50% improvements, then it will keep getting better.

Saying that models will get attracted to a bullshit local maximum is a similar fallacy to saying, when Wikipedia was created, that it would end up full of rubbish. The forces are set up in a way that creates improvements that accumulate, humans don't represent any ceiling, and unlike humans, models have near-zero replication cost, especially time-wise.

Sure, but it seems that with a fixed amount of hardware or operations there is some sort of efficient frontier across all the axes (speed, generalization, capacity, whatever), so there should logically be a point with diminishing returns and a maximum performance.

Like there is only so much you can do with a single punch card.

Yes, there are physical limits, but they are so far off from a human point of view that they don't matter much.

For example, compare the rate at which humans can communicate information (reading or writing) with what computers can do.

The same goes for information storage, retrieval, precision, computation rate, etc.

Sure, computers have lots of transistors, but brains have tens of billions of neurons and only use 12 W of power.

If it's smarter than us, it's pretty irrelevant whether it takes 12 W or 5 kW or even 1 TW to run. Sure, it may stop improving once it has far surpassed von Neumann level (at some point nobody knows), due to some physical or as-yet-unknown information constraints, but I don't think that has any practical bearing on much.