Hacker News
by petargyurov 279 days ago
I think it already has.

We'll get more incremental updates and nice features:

* more context size

* fewer hallucinations

* more prompt control (or the illusion of)

But we won't get AGI this way.

From the very beginning LLMs were shown to be incapable of synthesising new ideas. They don't sit there and think; they can only connect dots within a paradigm that we give them. You may give me examples of AI discovering new medicines and math proofs as a counter-argument, but I see that as reinforcing the above.

Paired with data and computational scaling issues, I just don't see it happening. They will remain a useful tool, but won't become AGI.

And whether they stay affordable is a question of time; all the big players are burning mountains of cash just to edge out the competition in terms of adoption.

Is there a level of adoption that can justify the current costs to run these things?

1 comments

> incapable of synthesising new ideas

I'd argue they don't synthesize any ideas, even old ones. They skip that classic step to emit text, and the human reading that text generates their own idea and (unconsciously, incorrectly) assumes there must've been an original idea that caused the text on the other side.

So perhaps it's more like: "LLMs aren't great at indirectly triggering humans into imagining useful novel ideas." (Especially when the user is trying to avoid having to think.)

Yeah, I know, it sounds like quibbling, but I believe it's necessary. This whole subject is an epistemic and anthropomorphic minefield. A lot of our habitual language connotations and metaphors can mislead.

Well... ideas are actually encoded (to some degree) in the words in the training data. So when they synthesize new text, they are, to a degree, synthesizing new ideas.

To a degree. The problem is that they don't actually understand the ideas in the training data. (Yeah, you can say we don't know how humans actually understand ideas. True, but not the point. However we understand ideas, LLMs don't do that.) And so they can only synthesize new ideas by rearranging words. This is much less than human thinking. In particular, it seems that they could only generate ideas that are new recombinations, not breakthrough ideas.

> Well... ideas are actually encoded (to some degree) in the words in the training data. So when they synthesize new text, they are, to a degree, synthesizing new ideas.

I don't think that follows: Manipulating a (lossy, imperfect) encoding [0] isn't the same as manipulating the thing it was intended to evoke.

If it is true, then... Well, it's not true in the same way anybody is excited about, because it means "synthesizing new ideas" is something we've been able to do for many decades and which you can easily script up right now at home [1].

[0] https://en.wikipedia.org/wiki/Encoding_(semiotics)

[1] https://benhoyt.com/writings/markov-chain/
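For anyone who wants to try the "script it up at home" point, here is a minimal sketch of a word-level Markov chain text generator. It is not the code from the linked article; the function names `build_chain` and `generate` and the `order`, `length`, and `seed` parameters are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain


def generate(chain, length=10, seed=None):
    """Walk the chain from a random starting key, picking a random follower each step."""
    rng = random.Random(seed)
    key = rng.choice(list(chain.keys()))
    out = list(key)
    for _ in range(length - len(key)):
        followers = chain.get(key)
        if not followers:
            break  # dead end: this key was never followed by anything in the source
        out.append(rng.choice(followers))
        key = tuple(out[-len(key):])
    return " ".join(out)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the cat"
    print(generate(build_chain(sample), length=12, seed=1))
```

Every adjacent word pair in the output already occurs in the input text; the generator only re-orders observed transitions, which is the sense of "recombination" being argued about here.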

Exactly this. LLMs synthesize language, based solely on statistical data, shorn of semantics. The "magic" happens when we humans re-insert the semantics into the LLM's output--because that is why language is used: to convey meaning--and assume the meaning was there all along.

Aren’t the semantics in the statistical data?

No. One can safely infer the semantics, because how we generally use language is what is encoded, and that how is closely related to the semantics. I imagine that this is why so-called "hallucinations" occur, when there is a subtle (or not so subtle) disconnect between the usage statistics and the semantics. (For example, satire or sarcasm isn't understood by the LLMs, so we get advice like use glue to make cheese 'stick' to pizza.)

Can you prove that statistics cannot encode semantics?

Compare: "Can you prove that alien explorers cannot make contact with us?"

Nobody has the tools to begin proving a negative [0] in either of those cases, and it's possible they'll eventually occur... But so what?

Just because it could happen someday does not mean it's happening now. Instead, we have decades of seeing humans excite themselves into perceiving semantics that aren't present [1], and nobody's provided a compelling reason to believe that this time things are finally different.

[0] https://en.wikipedia.org/wiki/Burden_of_proof_(philosophy)

[1] https://en.wikipedia.org/wiki/ELIZA_effect

I don't think this is the unprovable you think it is?

If LLMs and statistics can't encode semantics, how do chatbots perform long-form translations with appropriate contexts? How do codebreakers use statistics to break an adversary's communications?

Sometimes the statistics are semantic, like when "orange" and "arancia" and the picture of that fruit all mean the same thing, but Orange the wireless carrier and orange the color are different. Those are connections/probabilities humans also learn via repeated exposure in different contexts.

I'm not arguing that LLMs are synthesizing new ideas (or old ones), but that they ARE capable of deriving semantic meaning from statistics. Rather than:

> language, based solely on statistical data, shorn of semantics

Isn't it more like:

> language, based solely on statistical data, with meanings emerging from clusters in the data

Fair, the word "semantics" probably shouldn't be used here, because, strictly speaking, it is a departure from the original "ideas" being discussed.

A system of vectors for man + royalty = king may capture relationships of meaning that we invested into a language, but does it conceive ideas?
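To make the "man + royalty = king" picture concrete, here is a toy sketch of that kind of vector arithmetic. The two-dimensional vectors are hand-picked for illustration (first axis roughly "gender", second roughly "royalty"), not learned from any corpus, and the names `VECTORS`, `add`, `cosine`, and `nearest` are hypothetical. Real embeddings have hundreds of dimensions and are fitted from co-occurrence statistics.

```python
import math

# Hand-picked toy vectors (an assumption for illustration, not trained embeddings).
VECTORS = {
    "man": (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "royalty": (0.0, 1.0),
    "king": (1.0, 1.0),
    "queen": (-1.0, 1.0),
}


def add(a, b):
    """Element-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity between two non-zero 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def nearest(target, exclude=()):
    """Return the vocabulary word whose vector points most nearly the same way as `target`."""
    candidates = {w: v for w, v in VECTORS.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


if __name__ == "__main__":
    query = add(VECTORS["man"], VECTORS["royalty"])
    print(nearest(query, exclude=("man", "royalty")))  # king
```

The lookup finds "king" purely through geometry; whether that counts as capturing a relationship of meaning or only the relationships we built into the vectors is exactly the question posed above.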