Hacker News new | ask | show | jobs
by gjm11 529 days ago
I think it's remarkable how the goalposts have shifted.

For an AI model to be "a big deal", apparently we need to be able to give it a hard problem in an arbitrary field, one that humans have not yet solved[1], and have it spit out a good solution.

[1] At least, I think that's your intent. I am not a laser expert so I don't have a sense of where your challenge lies on a scale from "known but only to experts" to "major research project, may turn out to be impossible".

I very much agree that an AI system that could do that would be a big deal. An AI that could do that would be a world-changing deal. But it's pretty startling if everything short of that is not "a big deal" now, no?

4 comments

The problem is this is what people are being told is happening. I've talked to laypeople that think chatgpt is a superintelligent thing they get 100% truthful answers from. I saw a podcast last week from a PhD (in an unrelated field) claiming AGI will be here in 2027. As long as there are people out there claiming AI is everything, there will be people that look at whats available and say no, it's not actually that (yet).
respectfully, i feel i am alone in this opinion, but i’m not even remotely convinced that there isn’t a “superintelligent being” hiding in plain sight with tools that we already have at hand. people always grouse about the quality of LLM outputs, and then you realize that they (tend to) think that somehow the LLM is supposed to read their minds and deliver the answer they “didn’t need, but deserved”… i’d take my chances being dumped in 12 th century england getting bleated at in old english over being an LLM that has to suffer through a three sentence essay about someone’s brilliant, life-altering startup idea, having to grapple with the overwhelming certainty that there is absolutely no conceivable satisfactory answer to a question poorly conceived.

for all we (well, “i”, i guess) know, “superintelligence” is nothing more than a(n extremely) clever arrangement of millions of gpt-3 prompts working together in harmony. is it really so heretical to think that silicon + a semi-quadrillion human-hour-dollars might maybe have the raw information-theoretical “measurables” to be comparable to those of us exalted organic, enlightened lifeforms?

clearly others “know” much more than i do about the limits of these things than me. i just have spent like 16 hours a day for ~18 months talking to the damned heretic with my own two hands— i am far from an authority on the subject. but beyond the classical “hard” cases (deep math, … the inevitability of death …?), i personally have yet to see a case where an LLM is truly given all the salient information in an architecturaly useful way in which “troublesome output”. you put more bits into the prompt, you get more bits out. yes, there’s, in my opinion, an incumbent conservation law here— no amount of input bits yields superlinear returns (as far as i have seen). but who looks at an exponential under whose profoundly extensive shadow we have continued to lose ground for… a half-century? … and says “nah, that can never matter, because i am actually, secretly, so special that the profound power i embody (but, somehow, never manage to use in such a profound way as to actually tilt the balance “myself”) is beyond compare, beyond imitation— not to be overly flip, but it sure is hard to distinguish that mindset from… “mommy said i was special”. and i say this all with my eyes keenly aware of my own reflection.

the irony of it all is that so much of this reasoning is completely contingent on a Leibniz-ian, “we are living in the best of all possible worlds” axiom that i am certain i am actually more in accord with than anyone who opines thusly… it’s all “unscientific”… until it isn’t. somehow in this “wtf is a narcissus” society we live in, we have gone from “we are the tools of our tools” to “surely our tools could never exceed us”… the ancient greek philosopher homer of simpson once mused “could god microwave a burrito so hot that even he could not eat it”… and we collectively seem all too comfortable to conclude that the map Thomas Acquinas made for us all those scores of years ago is, in fact, the territoire…

'you put more bits into the prompt, you get more bits out.'

I think your line there highlights the difference in what I mean by 'insight'. If I provided in a context window every manufacturing technique that exists, all of base experimental results on all chemical reactions, every known emergent property that is known, etc, I do not agree that it would then be able to produce novel insights.

This is not an ego issue where I do not want it be able to do insightful thinking because I am a 'profound power'. You can put in all the context needed where you have an insight, and it will not be able to generate it. I would very much like it to be able to do that. It would be very helpful.

Do you see how '“superintelligence” is nothing more than a(n extremely) clever arrangement of millions of gpt-3 prompts working together in harmony' is circular? extremely clever == superintelligence

> For an AI model to be "a big deal", apparently we need to be able to give it a hard problem in an arbitrary field, one that humans have not yet solved[1], and have it spit out a good solution.

Once you've been to the moon, the next stage is Mars or Deimos. Humans celebrate progress but also appreciate incremental improvements.

I run an AI/ML consultancy so I have skin in this game. The "traditional" model approaches still have tons, tons, tons of value to offer. Few need to have the frontier right away.

Yes! The ChatGPT moment has warn off. And there hasn't been a step-change other than Claude Sonnet 3.5 + Cursor for dramatic impact (which is only for coding) since then.

I 100% agree with you that AI is fantastic and it is a big deal in general. But now that the world has gotten used to it being able to parrot back something it learned (including reasoning) in the training set, the next 'big deal' is actual insight.

But I see your point, I still think what we have currently is out of a sci-fi book, but I am also not that amazed by computers in our pockets anymore :)

No, and no goalposts have shifted. What's happened instead is that the claims made by LLM makers keep getting more and more outlandish as time passes, and they do that as a response to criticism that keeps pointing out the shortcomings of their systems. Every new model is presented as a breakthrough [1] and its makers rush to show off the results like "the new model is 100% better than the old one in passing the Bar exam!". You can almost hear the unsaid triumphant question hanging in the air "Are you convinced now? Are we having fun yet?".

We're not. The big deal with LLMs is that they are large enough language models that they can generate fluent, grammatical text that is coherent and keeps to a subject over a very, very long context. We never could do this with smaller language models. Because statistics.

What LLMs can absolutely not do is generate novel text. This is hard to explain perhaps to anyone who hasn't trained a small language model but generativity -the ability to generate text that isn't in a training set- is a property of the tiniest language model, as it is of the largest one [2]. The only difference is that the largest model can generate a lot more text.

And still that is not what we mean by novelty. For example, take art. When ancient humans created art, that was a new thing that had never before existed in the world and was not the result of combining existing things. It was the result of a process of abstraction, and invention: of generalisation. That is a capability that LLMs (as other statistical systems) lack.

The goalposts therefore have not moved because the criticism is as old as nails and the LLM makers have still not been able to comprehensively address it. They just try to ignore it. If the goalposts are here and you're shooting goals over there and then doing a little victory run every time the ball breaks Col. Mustard's windows, that's not the goalposts that have moved, it's you that keeps missing them.

_____________

[1] I'm old enough to remember... GPT-3 and how it blew GPT-2 out of the water; GPT-3.5 and how it blew GPT-3 out of the water; GPT-4 and how it blew GPT-3.5 out of the water... And all the users who would berate you for using the older model since "the new one is something completely different". Every single model. A yuuuge breakthrough. What progress!

[2] Try this. Take the sentence "<start> the cat sat on the mat with the bat as a hat <end>" and generate its set of bi-grams ("<start> the", "the cat", "cat sat", etc.). Then generate permutations of that set. You'll get a whole bunch -14!-1, as in |sentence|! minus the initial one- of sentences that were not in the training set. That's generativity in a tiny language model. That's how it works in the largest also, hard as that may be to believe. It shouldn't. It's a very simple mechanism that is extremely powerful. Large models are simply better at assigning weights to permutations so that the ones more often encountered in a corpus are weighted more.

Agreed! Don't get me wrong, the statistical distribution modeling for human language is still SUPER helpful. And for things like legal/tax/coding, which has a lot to do with applying language patterns, this is a very big deal. But the ability to find the true 'sub structure' of content that it is trained on is not something they can do. It is like there is some lower substrate that it is 'missing'. That is a lot to ask for, but once we get there it will be the 'magic' that is promised, rather than amazing, super helpful, parlor tricks.