| HN Mirror


	by NitpickLawyer 87 days ago
	> I don't see this getting better. We went from 2 + 7 = 11 to "solved a frontier math problem" in 3 years, yet people don't think this will improve?

9 comments

datsci_est_2015 87 days ago

I’ve seen this style of take so much that I’m dying for someone to name a logical fallacy for it, like “appeal to progress” or something.

Step away from LLMs for a second and recognize that “Yesterday it was X, so today it must be X+1” is such a naive take and obviously something that humans so easily fall into a trap of believing (see: flying cars).


Gareth321 87 days ago

In finance we say "past performance does not guarantee future returns." Not because we don't believe that, statistically, returns will continue to grow at x rate, but because there is a chance that they won't. The reality bias is actually in favour of these getting better faster, but there is a chance they do not.


aspenmartin 86 days ago

This is true because markets are generally efficient. It's very hard to find predictive signals. That is a completely different space from what we're talking about here. Performance is incredibly predictable through scaling laws that continue to hold even at the largest scales we've built.


Gareth321 86 days ago

I agree this is a new space and prediction volatility is much higher. We have evidence going back to at least 2019 that improvements have been exponential (https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...). The benchmarks are all over the place because improvements don't happen in a straight line. Even composites aren't that useful because the last 10% improvement can require more effort than the first 90%.

To be frank, from what I can see, even if all progress stopped right now, it would take 1-2 decades to fully operationalise the existing potential of LLMs. There would be massive economic and social change. But progress is not stopping, and in some measurements, continues to improve exponentially. I really think this is incredibly transformative. More so than anything humanity has ever experienced. In the last year, OpenAI and potentially Claude have been working on recursive self-improvement. Meaning these models are designing better versions of themselves. This means we have effectively entered the singularity.


aspenmartin 85 days ago

I agree with all of this -- the one nit I'll say is that scaling laws (e.g. Chinchilla -- classic paper on this that still holds) are based on next-token log loss on an evaluation set for pretraining, and follow (empirically) very consistent power-law relationships with compute / data (there is an ideal mixture of compute + data, and the thing you scale is the compute at this ideal mixture). So that's all I mean by performance -- we do also have, as you observe, benchmark performance trends (which are measured on the final model, after post-training, RL stages etc). These follow less predictable relationships, but it's the pretraining loss that dominates anyway.

I agree with all of this though

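To make the "ideal mixture of compute + data" point above concrete, here is a minimal sketch, not the method of any particular paper. It assumes a loss of the form L(N, D) = E + A/N^alpha + B/D^beta (the general shape used in Chinchilla-style fits) and the common approximation that training compute is about 6·N·D FLOPs. The constants and the name `best_split` are placeholders chosen for illustration, not fitted values.

```python
# Placeholder constants for illustration only; NOT fitted values from any paper.
E, A, B = 1.7, 400.0, 400.0   # irreducible loss and scale terms (assumed)
ALPHA, BETA = 0.34, 0.28      # power-law exponents (assumed)


def loss(n_params: float, n_tokens: float) -> float:
    """Assumed pretraining loss as a function of model size and data size."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


def best_split(compute_flops: float, steps: int = 400):
    """Scan model sizes on a log grid; spend the rest of the budget on data.

    Returns (loss, n_params, n_tokens) for the best grid point.
    """
    best = None
    for i in range(steps + 1):
        n_params = 10 ** (6 + 6 * i / steps)       # 1e6 .. 1e12 parameters
        n_tokens = compute_flops / (6 * n_params)  # assumes C ~ 6 * N * D
        candidate = (loss(n_params, n_tokens), n_params, n_tokens)
        if best is None or candidate < best:
            best = candidate
    return best


if __name__ == "__main__":
    for budget in (1e18, 1e21, 1e24):
        best_loss, n_params, n_tokens = best_split(budget)
        print(f"C={budget:.0e}  N={n_params:.2e}  D={n_tokens:.2e}  loss={best_loss:.3f}")
```

Under these assumptions the best achievable loss falls smoothly as the budget grows, which is the sense of "predictable" used in the comment. It says nothing about downstream benchmark scores.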

andrewflnr 86 days ago

Even more insane than assuming the trend will continue is assuming it will not continue. We don't know for sure (especially not by pure reason), but the weight of probability sure seems to lean one direction.


mikkupikku 87 days ago

Logical fallacies are vastly overrated. Unless the conversation is formal logic in the first place, "logical fallacies" are just a way to apply quick pattern matching to dismiss people without spending time on more substantive responses. In this case, both you and the other are speculating about the near future of a thing, neither of you knows.


datsci_est_2015 87 days ago

Hard to make a more substantive response when the OP’s entire comment was a one-sentence logical fallacy. I’m not cherry-picking here.

> In this case, both you and the other are speculating about the near future of a thing, neither of you knows.

One of us is making a much grander claim than the other:

  - LLMs have limitless potential for growth; because they are not capable of something today does not mean they won’t be capable of it tomorrow
  - LLMs have fundamental limitations due to their underlying architecture and therefore are not limitless in capability


fenomas 87 days ago

The post you replied to was:

> We went from 2 + 7 = 11 to "solved a frontier math problem" in 3 years, yet people don't think this will improve?

All that says is that the speaker thinks models will improve past where they are today. Not that it's a logical certainty (the first thing you jumped on them for), and certainly not anything about "limitless potential for growth" (which nobody even mentioned). With replies like this, invoking fallacies and attacking claims nobody made, you're adding a lot of heat and very little light here (and a few other threads on the page).


datsci_est_2015 86 days ago

> All that says is that the speaker thinks models will improve past where they are today. Not that it's a logical certainty

Exceedingly generous interpretation in my opinion. I tend to interpret rhetorical questions of that form as “it’s so obvious that I shouldn’t even have to ask it”.


fenomas 86 days ago

> generous interpretation

The term of art for that is steelmanning, and HN tries to foster a culture of it. Please check the guidelines link in the footer and ctrl+f "strongest".


mikkupikku 86 days ago

Better put than I could have.


graemep 87 days ago

OK, it's not a logical fallacy, it's a false assumption.

The belief in the inevitability of progress is a bad assumption. Especially if you assume a particular technology will keep advancing.


mikkupikku 86 days ago

We won't know if his assumption is false until time passes and moves future speculation into the empirical present.


graemep 86 days ago

A possibility is not a fact. Assuming a possibility will happen is not justified. Therefore it is false as an assumption, even if it is true that it is a possibility.


mikkupikku 86 days ago

I genuinely have no idea what you're on about. One guy expressed his belief about how the future will play out, and another disagreed. Time will be the judge of it, not either of us.


aspenmartin 86 days ago

Hmm...the sun comes up today is a pretty good bet that the sun comes up tomorrow.

We have robust scaling laws that continue to hold at the largest scales. It is absolutely a very safe bet that more compute + more training + algorithmic improvements will improve performance; it's not like we're rolling a 1 trillion dollar die.


famouswaffles 87 days ago

Well if people give the exact same 'reasons' why it could not do x task in the past that it did manage to do then it is tiring seeing the same nonsense again. The reason here does not even make much sense. This result is not easily verifiable math.


torginus 87 days ago

Yeah, and even if we accept that models are improving in every possible way, going from this to 'AI is exponential, singularity etc.' is just as large a leap.


tim333 86 days ago

The comment doesn't say it must be X+1. It implies it will improve which I would say is a pretty safe bet.


botro 86 days ago

How about 'slippery incline'?


gf000 87 days ago

https://xkcd.com/605/


snemvalts 87 days ago

The scaling law is a power law, requiring orders of magnitude more compute and data for better accuracy from pre-training. Most companies have maxed it out.

For RL, we are arriving at a similar point https://www.tobyord.com/writing/how-well-does-rl-scale

Next stop is inference scaling with longer context window and longer reasoning. But instead of it being a one-off training cost, it becomes a running cost.

In essence we are chasing ever smaller gains in exchange for exponentially increasing costs. This energy will run out. There needs to be something completely different than LLMs for meaningful further progress.

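The "ever smaller gains for exponentially increasing costs" claim above follows from the shape of a power law, and a short sketch shows the arithmetic. It assumes the reducible part of the loss scales as C^(-exponent) in compute C. The function name `compute_multiplier` and the exponents used below are hypothetical, chosen only to illustrate the relationship.

```python
def compute_multiplier(loss_reduction_factor: float, exponent: float) -> float:
    """How much more compute is needed to shrink the reducible loss.

    Assumes reducible_loss is proportional to compute ** (-exponent), so
    dividing the loss by k requires multiplying compute by k ** (1 / exponent).
    """
    return loss_reduction_factor ** (1.0 / exponent)


if __name__ == "__main__":
    # Hypothetical exponents; smaller exponents mean much steeper costs.
    for exponent in (0.5, 0.1, 0.05):
        factor = compute_multiplier(2.0, exponent)
        print(f"exponent={exponent}: halving the loss needs {factor:.3g}x compute")
```

Whether real systems sit on such a curve, and with what exponent, is an empirical question the sketch does not answer.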

Validark 87 days ago

I tend to disagree that improvement is inherent. Really I'm just expressing an aesthetic preference when I say this, because I don't disagree that a lot of things improve. But it's not a guarantee, and it does take people doing the work and thinking about the same thing every day for years. In many cases there's only one person uniquely positioned to make a discovery, and it's by no means guaranteed to happen. Of course, in many cases there are a whole bunch of people who seem almost equally capable of solving something first, but I think if you say things like "I'm sure they're going to make it better" you're leaving to chance something you yourself could have an impact on. You can participate in pushing the boundaries or even making a small push on something that accelerates someone else's work. You can also donate money to research you are interested in to help pay people who might come up with breakthroughs. Don't assume other people will build the future, you should do it too! (Not saying you DON'T)


3abiton 87 days ago

The problem class is very structured, which makes it "easier", yet the results are undeniably impressive.


number6 87 days ago

But can it count the R's in strawberry?


Paradigma11 87 days ago

That question is equivalent to asking a human to add the wavelengths of those two colors and divide it by 3.


snovv_crash 87 days ago

Unless you're aware of hyperspectral image adapters for LLMs they aren't capable of that either.


szszrk 87 days ago

Unfair - human beats AI in this comparison, as human will instantly answer "I don't know" instead of yelling a random number.

Or at best "I don't know, but maybe I can find out" and proceed to find out. But he is unlikely to shout "6" because he heard this number once when someone talked about light.


koliber 87 days ago

> human will instantly answer "I don't know" instead of yelling a random number.

Seems that you never worked with Accenture consultants?


szszrk 87 days ago

Fair.

Yet this can be filtered with fixed rules, like "output produced by corporate structures is untrusted random data".


thegabriele 87 days ago

Why is that?


Paradigma11 87 days ago

Because LLMs don't have a textual representation of any text they consume. It's just vectors to them. Which is why they are so good at ignoring typos: the vector distance is so small it makes no difference to them.


Aditya_Garg 87 days ago

Yes, it's ridiculously good at stuff like that now. I dare you to try and trick it.


frizlab 87 days ago

https://news.ycombinator.com/item?id=47495568


thedatamonger 87 days ago

what bothers me is not that this issue will certainly disappear now that it has been identified, but that we have yet to identify the category of these "stupid" bugs ...


sigmoid10 87 days ago

We already know exactly what causes these bugs. They are not a fundamental problem of LLMs, they are a problem of tokenizers. The actual model simply doesn't get to see the same text that you see. It can only infer this stuff from related info it was trained on. It's as if someone asked you how many 1s there are in the binary representation of this text. You'd also need to convert it first to think it through, or use some external tool, even though your computer never saw anything else.

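A toy sketch of the tokenizer point above: the model receives integer IDs, not characters, so a letter count is not directly visible in its input. The vocabulary, the IDs, and the greedy longest-match rule here are all invented for illustration and do not correspond to any real tokenizer.

```python
# Hypothetical toy vocabulary; real tokenizers learn theirs from data.
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def tokenize(text: str, vocab: dict = TOY_VOCAB) -> list:
    """Greedy longest-match tokenization over the toy vocabulary."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


if __name__ == "__main__":
    word = "strawberry"
    print("characters seen by a human:", list(word))
    print("IDs seen by the model:     ", tokenize(word))
    print("count of 'r' in the text:  ", word.count("r"))
```

In this toy setup the word becomes two IDs, and nothing in those two integers says how many times "r" occurs. The model would have to have learned that fact about each token from training data.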

Measter 87 days ago

> It's as if someone asked you how many 1s there are in the binary representation of this text.

I'm actually kinda pleased with how close I guessed! I estimated 4 set bits per character, which with 491 characters in your post (including spaces) comes to 1964.

Then I ran your message through a program to get the actual number, and turns out it has 1800 exactly.

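For anyone who wants to repeat the set-bit experiment above, a minimal sketch follows. It assumes the text is encoded as UTF-8, which the comment does not specify, so it will not necessarily reproduce the exact figure quoted. The function names are made up for this example.

```python
def count_set_bits(text: str, encoding: str = "utf-8") -> int:
    """Count the 1 bits in the encoded bytes of the text."""
    return sum(bin(byte).count("1") for byte in text.encode(encoding))


def estimate_set_bits(text: str, bits_per_char: int = 4) -> int:
    """Rough guess in the style of the comment: a fixed number of set bits per character."""
    return bits_per_char * len(text)


if __name__ == "__main__":
    sample = "It's as if someone asked you how many 1s there are."
    print("estimate:", estimate_set_bits(sample))
    print("actual:  ", count_set_bits(sample))
```

The guess of 4 set bits per character with 491 characters gives 4 × 491 = 1964, matching the estimate quoted in the comment.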

datsci_est_2015 87 days ago

Okay but, genuinely not an expert on the latest with LLMs, but isn’t tokenization an inherent part of LLM construction? Kind of like support vectors in SVMs, or nodes in neural networks? Once we remove tokenization from the equation, aren’t we no longer talking about LLMs?


nopinsight 87 days ago

LLMs in some form will likely be a key component in the first AGI system we (help) build. We might still lack something essential. However, people who keep doubting AGI is even possible should learn more about The Church-Turing Thesis.

https://plato.stanford.edu/entries/church-turing/


gf000 87 days ago

AGI is definitely possible - there is nothing fundamentally different in the human brain that would surpass a Turing machine's computational power (unless you believe in some higher powers, etc).

We are just meat-computers.

But at the same time, there is absolutely no indication or reason to believe that this wave of AI hype is the AGI one and that LLMs can be scaled further. We know almost nothing about the nature of human intelligence, so we can't even really claim whether we are close or far.


benterix 87 days ago

This is a long read on things most people here know at least in some form. Could you point to a particular fragment or a quote?


zeroonetwothree 86 days ago

> We went from 2 + 7 = 11 to "solved a frontier math problem" in 3 years, yet people don't think this will improve?

This is disingenuous... I don't think people were impressed by GPT 3.5 because it was bad at math.

It's like saying: "We went from being unable to take off and the crew dying in a fire to a moon landing in 2 years, imagine how soon we'll have people on Mars"


eamag 87 days ago

Self driving


saidnooneever 87 days ago

if you let million monkeys bash typewriter. something something book
