by mjburgess 1543 days ago

That's not a model of language. Language is a communicative activity between language users, who do things with words, with each other.

What you're talking about is ignoring the entire empirical context of language, as a real-world phenomenon, and modelling its purely formal characteristics as recorded post facto.

This will always just produce a system which cannot use language, but will only ever appear to within highly constrained -- essentially illusory -- contexts. It's the difference between making a film by "predicting the next frame" and making a film by recording actual events that you are directing.

A prediction of a "next frame" is always therefore just going to be a symptom of the frames before it. When I point a camera at something new, e.g., an automobile in c. 1900 -- I will record a film that has never been recorded before.

And likewise, with words: we are always in genuinely unique, unprecedented situations. And what we *do with words* is speak about those situations *to others* who are in them with us... we aim to coordinate, move, and so on with words.

To model *language* isn't to model words, nor text, nor to predict words or text. It is to be a speaker here in the world with us, using language to do *what language does*.

No model of the regularities of text will ever produce a language-user. Language isn't a regularity, like the frames of a film -- it's a suite of capacities which are responsive to the world, and enable language users to navigate it.

2 comments

nl 1543 days ago

Until you can make quantifiable predictions of behaviour that you want to see, it sounds like your objections are philosophical rather than scientific.

> A prediction of a "next frame" is always therefore just going to be a symptom of the frames before it.

But the physical appearance of the automobile itself was absolutely influenced by what went before - they were called "horseless carriages" after their appearance, after all.

And NLP Language Models can produce genuinely original and unique writing. This is a poem a large LM wrote for me:

  The sceptered isle
  Hath felt the breath of Britain,
  Longer than she cares to remember.
  Now are her champion arms outstared,
  Her virgin bosom stained with battle's gore.
  Lords and nobles, courtiers and commons,
  All stand abashed; the multitudinous rout
  Scatter their fears in every direction;
  Except their courage, which, to be perfect,
  Must be all directed to the imminent danger
  Which but now struck like a comet; and they feel
  The blow is imminent

> we aim to coordinate, move, and so on with words.

https://say-can.github.io/

"Robots ground large language models in reality by acting as their eyes and hands while LLMs help robots execute long, abstract language instructions"

mjburgess 1543 days ago

Chopping up sequences of film, and stitching them together, based on their prior similarity isn't making a movie -- and that's all you have here. People wrote poetry -- *for the first time* -- to say something about their own environment, that they are present in. All you have here is a system which has remembered a compressed representation of these poems and stitches them together to fool you.

It really is a kind of proto-psychosis to think this machine has written a poem. It has generated the text of a poem.

> quantifiable predictions of behaviour that you want to see

This is trivial. I ask the machine a large number of ordinary questions, e.g., "what do you think about what I'm wearing?", "what would it take to change your mind on whether murder is justified?", "do you think you'd like new york?", "could you pass me the salt?", etc. -- a trivial infinity of questions lifted from the daily life of language users.

The machine cannot answer any of those questions. All it will do is generate some text on the occasion that the machine sees that text. This isn't an answer. That isn't the question. The question isn't "summarise a million documents and report an on-average plausible answer to these questions".

When I ask a person any of those questions, if they did that, they wouldn't be answering them. This is trivial to observe.

These systems are just taking modes() of subsets of historical data. That's just what they are. The appearance of their using language is an illusion.
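A deliberately crude sketch of what "taking modes() of subsets of historical data" could mean, assuming a bigram table that always emits the most frequent follower of the previous word. The corpus and the names `build_mode_table` and `continue_text` are made up for illustration, and this is not how GPT-style models are actually implemented; it only makes the characterisation above concrete.

```python
from collections import Counter, defaultdict

def build_mode_table(corpus):
    """Map each word to the single most frequent word that followed it."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    # The "mode" of each subset: the most common follower per preceding word.
    return {prev: counts.most_common(1)[0][0] for prev, counts in followers.items()}

def continue_text(table, start, max_words=5):
    """Extend `start` by repeatedly emitting the modal next word."""
    out = [start]
    while len(out) < max_words and out[-1] in table:
        out.append(table[out[-1]])
    return " ".join(out)

# Hypothetical "historical data": three past review sentences.
corpus = [
    "i liked the movie",
    "i liked the book",
    "i liked the movie a lot",
]
table = build_mode_table(corpus)
reply = continue_text(table, "i", max_words=4)
print(reply)
```

The output here depends only on counts in the corpus, never on whether anyone has seen the movie, which is the property being objected to.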

To use language is to have something to say, to wish to talk about something. When I say, "I liked the movie!" I am not summarising a million reviews and finding an average sentence. I am thinking about my experience of the movie, and generating a public, shareable "text" that aims to communicate what I actually think.

*THAT* is language. Language is your intention to speak *ABOUT* something, and the capacity to generate a public shared set of words which communicate what you are talking about. Any process which begins *without anything to say* cannot ever reach language as a capacity.

Language, as a capacity, begins by being in the world. No summary of the public statements of past speakers has anything to do with being in the world, and having things to say. Chopping that up and stitching it together is a trick.

And this is trivial to show empirically. It is only by having absolutely no study of language use that anyone can claim that text documents have anything to do with it. It's mumbo jumbo.

nl 1543 days ago

I see. You believe there is something unmeasurable that matters.

I don't. I believe a perfect simulation of intelligence is intelligence.

mjburgess 1543 days ago

It's not unmeasurable. If you ask a friend, "did you like that movie?" would you be happy if they hadn't seen it, didn't know anything about it, etc., and simply generated a response based on some review data they'd read?

Is that what you want from people? You want them just to report a summary of the textbooks, of the reviews of other people? You don't want them to think for a moment, about anything, and have something to say?

This is a radically bleak picture; and omits, of course, everything important.

We aren't reporting the reports of others. We are thinking about things. That isn't unmeasurable, it is trivial to measure.

Show someone the film, ask them questions about it, and so on -- establish their taste.

NLP systems aren't simulations of anything. It's a parlour trick. If you want a perfect simulation of intelligence, go and show me one -- I will ask it what it likes; and I doubt it'll have anything sincere to say.

There is no sincerity possible here. These systems are just libraries run through shredders; they haven't been anywhere; they aren't anywhere. They have nothing to say. They aren't talking about anything.

You and I are not the libraries of the world cut up. We are actually responsive to the environments we are in. If someone falls over, we speak to help them. We don't, as if lobotomized, rehearse something. When we use words we use them to speak about the world we are in; this isn't unmeasurable -- it's the whole point.

hackinthebochs 1543 days ago

Why do you think a model of intelligence needs to have tastes, values, likes/dislikes, etc for it to be something more than statistics or pattern matching? Why are you associating these consciousness qualities with AGI?

mjburgess 1543 days ago

To use a language is just to talk about things. You cannot answer the question, "do you like what I'm wearing?" if you don't have the capacity for taste.

Likewise, this applies to all language. To say, "do you know what 2+2 is?" *we* might be happy with "4" in the sense that a calculator answers this question. But we haven't actually used language here. To use language is to understand what "2" means.

In other words, the capacity for language is only just the capacity to make a public communicable description of the non-linguistic capacities that we have. A statistical analysis of what we have already said does not have this contact with the world, or the relevant capacities. It's just a record of their past use.

None of these systems are language users; none have language. They have the symbols of words set in an order, but they aren't talking about anything, because they have nothing to talk about.

This is, I think, really obvious when you ask "did you like that film?" but it applies to every question. We are just easily satisfied when alexa turns the lights off when we say "alexa, lights off". This mechanical satisfaction leads some to the frankly schizophrenic conclusion that alexa understands what turning the lights off means.

She doesn't. She will never say back, "but you know, it'll be very dark if you do that!" or "would you like the tv on instead?" etc. Alexa isn't having a conversation with you based on a shared understanding of your environment, i.e., using language.

Alexa, like all NLP systems, is an illusion. You aren't speaking to anything. You aren't asking anything a question. Nothing is answering you. You are the only thing in the room that understands what's going on, and the output of the system is meaningful only because you read it.

The system itself attaches no meaning to what it's doing. The lights go off, but not because the system understood your desire. It could not, if it failed to understand, ask about your desire.

FeepingCreature 1543 days ago

> No model of the regularities of text will ever produce a language-user.

No but it will produce language-users, incidentally. Language-users are an irreducible aspect of the underlying regularity in language. Now I'm not saying that "GPT will wake up" purely from language tasks, that GPT will become a language user by being a system that picks up regularities. But for GPT to contain systems like language users, to instantiate language-users, which it has to (on some level) in order to successfully predict the next frame, is already enough to be threatening.

I know that using examples from fiction is annoying, but - purely as a rhetorical aid - consider the Enterprise computer (in Elementary, Dear Data) as GPT, and the Moriarty hologram as an embedded agent. The Enterprise computer is not conscious, but as a very powerful pattern predictor it can instantiate conscious agents "by accident", purely by completing a pattern it has learnt. It doesn't want to threaten the Enterprise, it doesn't want to not threaten the Enterprise, because it doesn't have any intentional stance. Instead, it was asked "A character that can challenge Data is ¬" and completed the sentence, as is its function.

mjburgess 1543 days ago

How does the computer answer "Do you like what I'm wearing today?" ?

Well, if we say the computer is, in fact, not participating in the world with us -- it is merely predicting "the next word", then it cannot.

I am not asking for any answer to this question. I want to know what it (like a friend) actually thinks about what I'm wearing.

To do this, it would need to be a competent language user; not a word announcer. It would, in other words, need to know what the language was about -- and need to be able to make a judgement of taste based on its prior experiences, etc.

I don't think our ability to misattribute a capacity for language to things (e.g., to bugs bunny) is salient -- we are fools, easily fooled. Bugs bunny doesn't exist.

In this case, the star trek computer, insofar as it actually answers the questions it's asked, is routinely depicted as being actually present in the world with us. That the show might claim "no it isn't!", or that we otherwise hold this premise whilst observing that it is, is just foolishness. Bugs bunny, likewise, is depicted with the premise that bugs is within his own world; this, likewise, is irrelevant.

FeepingCreature 1543 days ago

Well, GPT is not the sort of thing that can have a "you." But it has seen dialogues that have a "you" in it, and it knows how a "you" tends to answer. For instance, depending on context, it may be operating under a different model for the "you" agent - the sort of person who likes a red dress, or the sort of person who likes suspenders. If we assume a multimodal GPT, it's going to draw on its pattern recognition from movies and its context window for what it's previously said as "you" or what you've prompted it as in order to guess what the agent it's pattern completing for "you" would think of your dress.

In effect, I'm saying that just because GPT is not a word-user, that doesn't mean that its model of "you" - the layered system of patterns that generates its prediction for words that come after "I think your dress looks" - isn't a word-user. The "you" model, effectively, takes in sensory input, processes it, and produces output. Because the language model has learnt to complete sentences using agents as predictive patterns - because agents compress language - the you pattern acts agentic, despite the fact that the language model itself is not "committed" to this agent and will, if you reset its context window, readily switch to pattern predicting another agent.

GPT is not an agent, but GPT can predict an agent, and this is equivalent to containing it.

mjburgess 1543 days ago

I don't think it is equivalent. If you assume it has the same modal properties, sure -- let's say that's plausible.

I.e., if GPT said on the occasion it was asked Q, an answer A, in a possible world W, such that this answer A was the "relevant and reasonable" answer in W -- then GPT is "doing something interesting".

E.g., if I am wearing red shoes (World W1) and it says "I like your red shoes" in W1, then that's for-sure really interesting.

My issue is that it isn't doing this; GPT is completely insensitive to what world it's in and just generates an average A in reply to a world-insensitive Q.
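A toy sketch of the distinction being drawn, assuming the world W can be reduced to a single observed attribute. The past replies, the `shoe_colour` key, and both function names are hypothetical; the point is only to show an answer that ignores W next to one that is conditioned on it.

```python
from collections import Counter

# Hypothetical past replies to "what do you think of my shoes?"
HISTORICAL_ANSWERS = [
    "I like your black shoes",
    "I like your black shoes",
    "I like your white shoes",
]

def world_insensitive_answer(question):
    """Ignore the current world; return the most common past reply."""
    return Counter(HISTORICAL_ANSWERS).most_common(1)[0][0]

def world_sensitive_answer(question, world):
    """Condition the reply on an observation of the current world W."""
    return f"I like your {world['shoe_colour']} shoes"

w1 = {"shoe_colour": "red"}  # world W1: the asker is wearing red shoes
q = "what do you think of my shoes?"

insensitive = world_insensitive_answer(q)
sensitive = world_sensitive_answer(q, w1)
print(insensitive)
print(sensitive)
```

Only the second function can produce the "relevant and reasonable" answer in W1, and it can do so only because it is handed an observation of W1.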

If you take a language-user, e.g., me, and enumerate my behaviour in all possible worlds you will get something like what GPT is aiming to capture. I.e., what I would say, if asked Q, in world-1, world-2, world-infinity.

My capacity to answer the question in "relevant and reasonable" ways across a genuine infinity of possible worlds comes from actual capacities I have to observe, imagine, explore, question, interact, etc. It doesn't come from being an implementation of the (Q, A, W) pattern -- which is an infinity on top of an infinity.

No model which seeks to directly implement (Q, A, W) can ever have the same properties as an actual agent. That model would be physically impossible to store. So GPT does not "contain" an agent in the sense that QAW patterns actually occur as they should.

And no route through modelling those patterns will ever produce the "agency pattern". You actually need to start with the capacities of agents themselves to generate these in the relevant situations, which is not a matter of a compressed representation of QAW possibilities -- it's the very ability to imagine them piecemeal (investigate, explore, etc.)

FeepingCreature 1543 days ago

I mean, how would you discover that you're in world W? If you ask "what do you think about my red shoes?" and I say "I think your red shoes are pretty", then you will say this is just me completing the pattern. But if I have no idea what shoes you're wearing, then even I, surely agreed to be an agent, could not compliment your clothing. So I'm not sure how this distinction works.

> It doesn't come from being an implementation of the (Q, A, W) pattern

Well, isn't this just a (Q, A, W, H) pattern though? You have a hidden state that you draw upon in order to map Qs onto As, in addition to the worldstate that exists outside you. But inasmuch as this hidden state shows itself in your answers, then GPT has to model it in order to efficiently compress your pattern of behavior. And inasmuch as it doesn't ever show itself in your answers, or only very rarely, it's hard to see how it can be vital to implementing agency.

And, of course, teaching GPT this multi-step approach to problem solving is just prompting it to use a "hidden" state, by creating a situation in which the normally hidden state is directly visualized. So the next step would be to allow GPT to actually generate a separate window of reasoning steps that are not directly compared against the context window being learnt, so it can think even when not prompted to. I'm not sure how to train that though.

mjburgess 1543 days ago

Sure, GPT has to model H -- that's a way of putting it. However, think of how the algorithm producing GPT works (and thereby how GPT models QAWH) -- it produces a set of weights which interpolate between the training data -- even if we give it QAWH as training data, implementing the same QAWH patterns would require more storage capacity than is physically possible.

I think there's a genuine ontological (also practical, empirical) difference between how a system scales with these "inputs". In other words, if a machine is an `A = m(Q | World, Hidden)`, and a person is an `A = p(Q | World, Hidden)`, then their complexity properties *matter*.

We know that the algorithm which produces `m` does so with exponential complexity; and we know that the algorithm producing `p` doesn't. In other words, for a person to answer `A` in the relevant ways does not require exponential space/time. We know that NNs already scale exponentially in their parameters even in their fairly radically stupid solutions (i.e., ones which are grossly insensitive even to W).

So whilst `m` and `p` are equivalent if all we want is an accurate mapping of `Q`-space to `A`-space, they aren't equivalent in their complexity properties. This inequivalence makes `m` physically impossible, but also, I think, just not intelligent.

As in, it was intelligent to write the textbook; after it's written, the HDD space which stores it isn't "intelligent". Intelligence is that capacity which enables low-complexity systems to do "high-complexity" stuff. In other words, that we can map out QAWH with physically-possible, indeed ordinary, capacities -- our-doing-that is intelligence.

I think this is a radically empirical question, rather than a merely philosophical one. No algorithm which relies on interpolation of training data will have the right properties; it just won't, as a matter of fact, answer questions correctly.

You cannot encode the whole QAWH-space in parameters. Interpolation, as a strategy, is exponential-scaling; and cannot therefore cover even a tiny fraction of the space.
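A back-of-the-envelope sketch of the storage side of this argument, assuming a literal lookup table with one entry per possible question. The vocabulary size of 50,000 tokens is an assumption, and 10^80 is the commonly quoted rough estimate for the number of atoms in the observable universe. This only shows that exhaustive tabulation is infeasible; whether learned interpolation scales the same way is the claim under dispute, not something the arithmetic settles.

```python
import math

VOCAB_SIZE = 50_000   # assumed vocabulary size, for illustration only
ATOMS_LOG10 = 80      # rough, commonly quoted estimate: ~10^80 atoms

def log10_num_sequences(vocab_size, length):
    """log10 of the number of distinct token sequences of a given length."""
    return length * math.log10(vocab_size)

def shortest_length_exceeding(vocab_size, budget_log10):
    """Smallest sequence length whose sequence count exceeds 10**budget_log10."""
    length = 1
    while log10_num_sequences(vocab_size, length) <= budget_log10:
        length += 1
    return length

n = shortest_length_exceeding(VOCAB_SIZE, ATOMS_LOG10)
print(n)
```

Under these assumptions the count of possible questions passes the atom estimate at 18 tokens, before any W or H dimension is added.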

I.e., if I ask "what did you think of Will Smith hitting Christopher Walken?" it is unlikely to reply, "I think you mean Chris Rock" firstly; and then, if Will does hit Walken, to reply, "I think Walken deserved it!".

Interpolation, as a strategy, cannot deal with the infinities that counter-factuals require. We are genuinely able to perform well in an infinite number of worlds. We do that by not modelling QA pairs, at all; nor even the W-infinity.

Rather, we implement "taste, imagination, curiosity" etc. and are able to simulate (and much else) everything we need. We aren't an interpolation through relevant history; we are a machine directly responsive to the local environment in ways that show a genuine deep understanding of the world and an ability to simulate it.

This ability enables `p` to have a lower complexity than `m`, and thereby be actually intelligent.

As an empirical matter, I think you just can't build a system which actually succeeds in answering the-right-way. It isn't intelligent; but likewise, it also just doesn't work.
