by toss1 1249 days ago
Yup. What seems to be largely missed is that these models have zero understanding, and are actually destroyers of information, not creators. In classic Information Theory, information is basically surprise value — how much unexpected info is in the message? — yet these "AI" systems put out the most expected subset in each instance. This highly averaged output is very recognizable and so very striking, but it is not actually very informative (perhaps except in cases where it is specifically used as a verbose search engine, where the query takes advantage of the breadth of the AI's training).

> In classic Information Theory, information is basically surprise value — how much unexpected info is in the message? — yet these "AI" systems put out the most expected subset in each instance.

Forgive me, but isn't this kind of moving the goalposts? Information is the surprise value from the recipient's point of view, which means "expected" is defined by the recipient's Bayesian prior probability. Saying "these "AI" systems put out the most expected subset in each instance" assumes that the recipient's priors exactly equal those of the model, which would only be the case when the model is talking to itself (or, I suppose, to an even more complex model with perfect knowledge of ChatGPT's weights).

The fact that no information is transferred when the model talks to itself should not be surprising, and it would apply to any AI (even a superhuman, post-singularity, god-like AI).
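
To make the recipient-dependence concrete, here is a minimal sketch. The word list and all probabilities are made up for illustration and do not come from any real model. It only shows that Shannon surprisal, log2(1/p), is computed from the probability the recipient assigned to the message, so the same output can be nearly worthless to one recipient and worth bits to another.

```python
import math

def surprisal_bits(p):
    """Shannon surprisal, in bits, of an outcome the recipient gave probability p."""
    return math.log2(1.0 / p)

# Hypothetical next-word probabilities, invented for this example.
model_prior = {"paris": 0.90, "lyon": 0.07, "nice": 0.03}   # the model's own expectations
reader_prior = {"paris": 0.25, "lyon": 0.25, "nice": 0.50}  # a reader who knows less

# The model emits its own most expected word.
message = max(model_prior, key=model_prior.get)

bits_for_model = surprisal_bits(model_prior[message])    # small: the model expected this
bits_for_reader = surprisal_bits(reader_prior[message])  # larger: the reader did not

# A recipient that can predict the output with certainty learns nothing.
bits_if_certain = surprisal_bits(1.0)

print(message, round(bits_for_model, 3), bits_for_reader, bits_if_certain)
```

The "most expected" output is only low-information relative to the model's own distribution; relative to the reader's prior it carries more.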

Yes, the AI's output could be surprising from the point of view of many recipients.

This does not mean anything more than that the AI has a greater breadth of training background, which is likely.

We get the output most likely expected from any of (or the average of) the humans whose writing/drawing/whatever was included in the input set.

What we will not be getting from the AIs is any creative output based on unique understanding, as we would from an intelligent, creative human. Many of the humans in the input set would see the same prompt and produce an actually novel and meaningful output, not simply a cut-and-paste from prior works. (And yes, some novel output may come from some randomizing algo, but if it is correct, it is no more correct than the broken clock that is correct twice every day.)

Or, another example: I was involved in a legal deposition where an "AI" transcription system was used instead of a skilled court reporter. The output LOOKED fantastic, until I actually read it, and it was absolute garbage. The standard errata sheet has room for the deponent to put in about a dozen corrections, and most depositions need fewer than a handful. My errata list was multiple pages. These errors often reversed the meaning of sentences, substituting "I have ..." for "You have...", dropping or adding "not", or substituting in common names for unusual names (e.g., "Jack Kennedy" for "John Kemeny"; note that human transcribers always ask for correct spellings of names in the next break, while this crap just inserted it like it had a clue).

So, even though the total "experience" or training set of the AI may go beyond the experience of the reader, so that some of the output is surprising, it is no more surprising than what a search engine produces. In fact, I think this is the best use of the AIs: to have them trained on an enormous data set and provide possibly better results, defined as more on-point, but likely less thorough.

>This does not mean anything more than that the AI has a greater breadth of training background, which is likely.

This has nothing to do with Shannon's definition of information. The light from a star going supernova or a tsunami hitting shore convey information without any kind of agency nor intent being involved.

>What we will not be getting from the AIs is any creative output based on unique understanding, as we would from an intelligent, creative human.

This is a really big assumption here that I don't think is justified. Are you assuming that humans have some non-physical soul which makes us somehow different from any other deterministic information-processing system?

>>The light from a star going supernova or a tsunami hitting shore convey information without any kind of agency nor intent being involved.

Right, and I'm obviously speaking loosely, but the amount of information or information value is the amount of non-redundant 'surprise'.

The supernova's signal carries valuable information the first time it is received, but replaying it adds none.
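
A rough way to see the redundancy point, using compressed size as a stand-in for information content (that proxy is my assumption here, not a precise measure of Shannon information, and the "signal" is just seeded random bytes): a general-purpose compressor needs roughly the full length to store an unpredictable signal once, but almost nothing extra to store the replay.

```python
import random
import zlib

# Stand-in for a never-before-seen signal: 2000 pseudo-random bytes (seeded for repeatability).
rng = random.Random(0)
signal = bytes(rng.getrandbits(8) for _ in range(2000))

first_time = len(zlib.compress(signal, 9))           # cost of storing the signal once
with_replay = len(zlib.compress(signal + signal, 9))  # cost of storing it plus an exact replay

# Extra bytes attributable to the replay; small compared with the signal itself.
extra_for_replay = with_replay - first_time

print(first_time, with_replay, extra_for_replay)
```

The replay is entirely predictable from the first copy, so the compressor encodes it as a handful of back-references instead of 2000 new bytes.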

>>Are you assuming that humans have some non-physical soul which makes us somehow different from any other deterministic information-processing system?

No, I'm speaking about the current and near-intermediate-term state of AI. This current state is basically a mashup of everything it's 'seen' in training, but without a shred of understanding. It only spits back what is statistically the most likely string of words to occur in any situation.

Ordinary humans do far better because they have understanding. A simple example shows this. Ask ChatGPT this question: "Mikes mum had 4 kids; 3 of them are named Luis, Drake, and Matilda. What is the name of the 4th kid?" [0] ChatGPT utterly fails, even when told the answer is in the question.

As in my example with the deposition, it looks great but it is just a literal mashup of training data, and worse yet, is only outputting the most average data. This produces surprisingly emotionally satisfying results, which may be useful in some contexts (marketing copy?) but isn't close to the reasoning of an ordinary human.

I'm making no claim about whether AI can get there. I suspect it ultimately will. BUT, this layer is only a small component of what must be built to actually abstract concepts and wield them like any child. My minor in college was neuroscience and I still follow it a bit; we're a LOOONG way from understanding what it is to get and wield concepts, never mind how to model or reproduce that in a computer.

That's all I'm saying - the current crop is impressive, maybe even useful in some applications, but nowhere near the kind of "intelligence" that is being touted.

[0] https://twitter.com/FeraSY1/status/1614976003092234241