Hacker News
by lukev 1012 days ago
I'm not sure the definition of "intention" the article suggests is a useful one. He tries to make it sound like he's being conservative:

> That is, we should ascribe intentions to a system if and only if it helps to predict and explain the behaviour of the system. Whether it really has intentions beyond this is not a question I am attempting to answer (and I think that it is probably not determinate in any case).

And yet, I think there's room to argue that LLMs (as currently implemented) cannot have intentions. Not because of their capabilities or behaviors, but because we know how they work (mechanically at least) and it is incompatible with useful definitions of the word "intent."

Primarily, they are pure functions that accept a sequence of tokens and return the next token. The model itself is stateless, and it doesn't seem right to me to ascribe "intent" to a stateless function. Even if the function is capable of modeling certain aspects of chess.
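To make the "pure function" framing concrete, here is a minimal sketch. Everything in it is hypothetical: `next_token` is a toy stand-in, not a real model, and the vocabulary is made up. The point is only that the function depends on nothing but its input, while the growing context lives in the caller's loop.

```python
from typing import Sequence

# Made-up vocabulary for illustration only.
VOCAB = ["a", "b", "c", "<eos>"]

def next_token(tokens: Sequence[str]) -> str:
    """Toy 'model': the same input always yields the same output,
    and nothing is remembered between calls."""
    if len(tokens) >= 5:
        return "<eos>"
    return VOCAB[len(tokens) % 3]

def generate(prompt: Sequence[str], max_steps: int = 10) -> list[str]:
    """The caller, not the model, carries the state (the growing context)."""
    context = list(prompt)
    for _ in range(max_steps):
        tok = next_token(context)
        if tok == "<eos>":
            break
        context.append(tok)
    return context
```

Calling `generate(["a"])` feeds the whole context back in on every step; the function itself never changes.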

Otherwise, we are in the somewhat absurd position of needing to argue that all mathematical functions "intend" to yield their result. Maybe you could go there, but it seems to be torturing language a bit, just like people who advocate definitions of "consciousness" wherein even rocks are a "little bit conscious."

10 comments

Humans, too, would likely be nearly stateless if we took a point-in-time snapshot of them and repeatedly simulated them from that point on various short (4000ms, similar to 4000 tokens) sequences of nerve impulses.

Nevertheless the human would be acting intentionally (for in-distribution impulse patterns) for the brief period of simulation.

Fine-tuning and RLHF seem to impart more intentionality to the pure stateless models, as well; it's not the case that all texts the LLMs were pretrained on were outputs of helpful AI assistants avoiding harmful outputs but the resulting models do in fact behave like AI assistants unless prompted with more out-of-distribution context or intentional jailbreaks.

What word would you use instead of intention for the property that RLHF and fine-tuning create? It's goal oriented behavior with some world-modeling ability in achieving the goal even if it's far from robust. If the LLM is only simulating an AI assistant it seems to me that a larger fraction of its total function is dedicated to simulating the intention of that assistant. Creating a simulator of intentional behavior is, I think, entirely novel.

> Humans, too, would likely be nearly stateless if we took a point-in-time snapshot of them and repeatedly simulated them from that point on various short (4000ms, similar to 4000 tokens) sequences of nerve impulses.

“Humans too would be stateless if we hacked their brain in a way that made them stateless.” That would also make them non-human, though, and unlikely to be able to exhibit meaningful high-level cognitive abilities, so I don't really understand what your point is…

No.
I may be missing something, but I'm willing to bet it's just you. I had the same line of reasoning as him and your snarky comment without explanation is unwelcome here. I like discourse. Not emotional knee jerk reactions or whatever it is that caused you to reply like that. If there is something fundamentally wrong, and obviously so with his line of reasoning, do let me know.
> Humans, too, would likely be nearly stateless if we took a point-in-time snapshot...

I highly doubt that would ever be possible in practice, as our inputs are much too complex. But I want to point out, you're basically saying here "humans are nearly stateless if we take a snapshot of their state and simulate that state..."

> I highly doubt that would ever be possible in practice, as our inputs are much too complex. But I want to point out, you're basically saying here "humans are nearly stateless if we take a snapshot of their state and simulate that state..."

1) We don't have infinite inputs

2) We (our brains) don't have infinite processing

3) We (our brains) don't have infinite lossless storage: Our brains often perform pruning of unimportant information.

Given that there is an ultimately finite upper bound of both # of inputs & processing power & storage, at some point, the simulation of a human from a given snapshot is reductively possible.

> The model itself is stateless, and it doesn't seem right to me to ascribe "intent" to a stateless function.

Statefulness can be modelled statelessly, so "statelessness" is not a sufficient reason to dismiss intentionality. The only question is whether the relationship between the inputs of the function and its outputs corresponds to what we might call "intent", which the cited definition attempts to outline. Obviously it's only a high-level view that leaves many details unanswered.
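As a rough illustration of modelling state statelessly (a sketch with hypothetical names, not anything from the article): a pure transition function takes the old state and an input, and returns the new state and an output. The "stateful" behaviour comes from threading the state through successive calls.

```python
from typing import Iterable, List, Tuple

def step(state: int, x: int) -> Tuple[int, int]:
    """Pure transition: returns (new_state, output) and mutates nothing."""
    new_state = state + x
    return new_state, new_state  # the output here is the running total

def run(inputs: Iterable[int], initial: int = 0) -> List[int]:
    """Thread the state through repeated calls to the pure step function."""
    state = initial
    outputs: List[int] = []
    for x in inputs:
        state, out = step(state, x)
        outputs.append(out)
    return outputs
```

From the outside, `run([1, 2, 3])` looks like an accumulator with memory, even though `step` never remembers anything.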

> Otherwise, we are in the somewhat absurd position of needing to argue that all mathematical functions "intend" to yield their result.

Not all mathematical functions have a continuity of internal identity and self-reference as seems to be the case with LLMs.

Their identity often falls apart after a few rounds, and the self-reference seems, in my experience at least, to be simply a veneer from linguistic training. It’s a great illusion but an illusion nonetheless.
Your identity falls apart every night. What's your point?
The article provides a very clear reason for using the idea of "intention": that framing helps us understand and predict the behavior. In contrast framing a mathematical function as having "intention" doesn't help. The underlying mechanism isn't relevant to this criterion.

Clearly the system we're understanding as "intentional" has state; we can engage in multi-round interactions. It doesn't matter that we can separate the mutable state from the function that updates that state.

A mathematical function isn't a mechanism. It has no causal power.
Sure, but we're talking mathematical functions that have been given physical form, by attaching them to input and output devices. Or, am I missing something? For other examples, see anything automated with a motor, and some decision tree, semi chaotic or not.
Not sure I understand. Software (which is a mathematical function) runs on a processor and that is arguably a mechanism, or allows them to emerge, such as a button or input field.
> Software (which is a mathematical function)

Software isn't a mathematical function. Software may be an embodiment of a mathematical function, but isn't a mathematical function itself.

Mathematical functions are much more abstract than software–although exactly how much more abstract depends on which position you take in the philosophy of mathematics. For a mathematical Platonist, a mathematical function is an eternal object which would still exist even if this planet never did. It never comes into existence, it never ceases to exist, it has no physical location. By contrast, software is something which has a physical location (on this hard disk), it was created at a certain point, and will likely one day cease to exist (when the last copy is destroyed).

If you adopt a non-Platonist philosophy of mathematics, the picture will be a bit different, but still I think software will be more concrete than mathematical functions are. For example, if one takes a conceptualist viewpoint (mathematical objects only exist in the mind, as ideas or concepts)–you can think of a mathematical function in your mind, and never write it down anywhere (nor verbally communicate it to anyone), it only exists in your mind, but it really is a mathematical function. Whereas, software which only exists in your mind and has never been written down anywhere isn't actually software, it is only an idea for software.

> By contrast, software is something which has a physical location (on this hard disk), it was created at a certain point, and will likely one day cease to exist (when the last copy is destroyed).

"Software" is a broad term, but it could certainly be taken to mean something more abstract than that. Sometimes a program written for a different CPU architecture, or not written for any CPU at all, is still recognisably "the same" program. Euclid's algorithm might well be considered "software", but it's very much the same kind of thing as a mathematical function.

I studied Maths/CS but not Philosophy, so I am biased towards "takes a domain, has a range", and I have done some Haskell, so "all programs are functions". It is interesting to see this point of view.

I see your point. sin(x) is more of a "natural function" born of the universe than f: f(x) = nn.layer(6, g: g(x) = nn.transformer(x, 512, ...

I think it is a pity that X education is very often lacking philosophy of X education.

My ideal would be every maths degree includes a mandatory unit on the philosophy of mathematics, every science degree includes a mandatory unit on the philosophy of science, a degree in AI or psychology includes a mandatory unit on the philosophy of mind, every psychiatry training program includes a mandatory unit on the philosophy of psychiatry, etc. Not everyone needs to be a philosopher, but I think a well-rounded practitioner of any discipline would ideally have a basic understanding of the philosophical debates about it.

But so many don't – which results in the phenomenon I keep on seeing, where so many people (even experts) treat debatable assumptions which they don't even know they are making as if they were obviously true.

> Primarily, they are pure functions that accept a sequence of tokens and return the next token. The model itself is stateless, and it doesn't seem right to me to ascribe "intent" to a stateless function. Even if the function is capable of modeling certain aspects of chess.

I have two arguments against. One, you could argue that state is transferred between the layers. It may be inelegant for each chain of state transitions to be the same length, but it seems to work. Two, it may not have "states", but if the end result is the same, does it matter?

That's a great way of looking at it. Comparing model weights to our brains and how we process input, you could imagine model weights as a brain frozen at time t=0. The prompt tokens are the sensory input, and the generation parameters are like twists to how the neurons pass information to each other. The token context window is like the capacity of one's working memory. At the conclusion of the last layer of processing, the output tokens are like one's subjective experience.

At the least it's made me think for a moment about `stateless` and its meaning

Your thoughts are just prompts to DeusGPT
Just because you use some intermediate variables to calculate f(x,y) = x^2 + y^2 doesn't make it a non-pure function. At least at the level of abstraction we're talking about (the API boundary).
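A small sketch of that point, with illustrative names: the locals below exist only for the duration of the call, so the function is still pure at its boundary.

```python
def f(x: float, y: float) -> float:
    """Computes x^2 + y^2 using intermediate variables."""
    x_squared = x * x  # local intermediate, discarded when the call returns
    y_squared = y * y  # likewise
    return x_squared + y_squared
```

Nothing outside the call can observe `x_squared` or `y_squared`, and `f(3, 4)` gives the same answer every time.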

The more significant application of storage will be long-term storage wrapped in a read-modify-write loop.

Uh, hold on. That's not what's meant by intentionality. No one is talking about what a machine intends to do. In philosophy, and specifically philosophy of mind, "intentionality" is, briefly, "the power of minds and mental states to be about, to represent, or to stand for, things, properties and states of affairs" [0].

So the problem with this guy's definition of intentionality is, first, that it's a redefinition. If you're interested in whether a machine can possess intentionality, you won't find the answer in interpretivism, because that's no longer a meaningful question.

Intentionality presupposes telos, so if you assume a metaphysical position that rules out telos, such as materialism, then, by definition, you cannot have "aboutness", and therefore, no intentionality of any sort.

[0] https://plato.stanford.edu/entries/intentionality/

Some philosophers take that position. Dennett, explicitly cited in the article, wrote The Intentional Stance (1987) about exactly the approach to intentionality taken in the article. His approach is accepted by many philosophers.

As you point out, the approach you cite can't be used in a materialist metaphysical position. That's a pretty severe problem for that definition! So Dennett's approach, or something like it, has major advantages.

Also, you are obviously wrong (or rhetorical?) when you say "No one is talking about what a machine intends to do." We certainly do! You can say "No one should" or other normative condemnations but then we're arguing on different territory.
Ascribing human properties to computers and software has always seemed very bizarre to me. I always assume people are confused when they do that. There is no meaningful intersection between biology, intelligence, and computers but people constantly keep trying to imbue electromagnetic signal processors with human/biological qualities very much like how children attribute souls to teddy bears.

Computers are mechanical gadgets that work with electricity. Humans (and other animals) die when exposed to the kinds of currents flowing through computers. Similarly, I have never seen a computer drink water (for obvious reasons). If properties are reduced to behavioral outcomes then maybe someone can explain to me why computers are so averse to water.

A shame you can't find any meaningful parallels like many of us. Why the pissy tone? Here's some good reading for you:

https://en.wikipedia.org/wiki/Voltage-gated_ion_channel

I am aware of the mechanical models of cognition. They're unconvincing.
Try having a human without electromagnetic first principles.

In fact, anything really.

But really: https://en.wikipedia.org/wiki/Magnetite#Human_brain

Quoting for context:

>> Magnetite can be found in the hippocampus. The hippocampus is associated with information processing, specifically learning and memory.

>> Using an ultrasensitive superconducting magnetometer in a clean-lab environment, we have detected the presence of ferromagnetic material in a variety of tissues from the human brain.

>> The role of magnetite in the brain is still not well understood, and there has been a general lag in applying more modern, interdisciplinary techniques to the study of biomagnetism.

He was inspired by LessWrong, which from my scan is more mysticism and philosophy (with a handful of self-importance) than anything about how computers work. Advanced technology is magic to laypeople. It's like how some people believe in homeopathy. If you don't understand how medicine works, it's just a different kind of magic.
I guess that might be tied up with human biology, the need to attribute agency to inanimate objects. That one is a worthwhile puzzle to figure out but most people seem more mesmerized by blinking lights and shiny gadgets than any real philosophical problems.
Lesswrong is basically a new online religion where you worship by acting like you're a STEM expert, which is part of why they have so many strong assumptions about how AI must work all based on untrue ideas about how it actually works.

> Not because of their capabilities or behaviors, but because we know how they work (mechanically at least) and it is incompatible with useful definitions of the word "intent."

I've never seen this deter anyone. I can't understand how people that know how they work can have such ridiculous ideas about LLMs.

I'd add, though, that inference is clearly fixed, but there is some more subtlety about training. Gradient descent clearly doesn't have intelligence, intent (in the sense meant), or consciousness either, but it's not stateless like inference, and you could argue it has a rudimentary "intent" in minimizing loss.

The most useful definitions have predictive power.

When you say upsetting things to Bing Chat, you'll find the conversation prematurely ends.

You can cry all you want about how Bing isn't really upset, or how it doesn't really have the intention to end the chat, but those are evidently useless definitions, because the chat did end.

A definition that treats Bing as an intentful system is more accurate to what happens in reality.

That might be useful in helping a child learn to use it, but it has no value when actually studying neural networks. You could equally pretend the sun sets every night because it's upset from shining all day.
>That might be useful in helping a child learn to use it

It is useful for anyone looking to use such systems. An LLM-piloted robot could potentially pick up a knife and stab you because you obstructed some goal or said mean words, and pretending it didn't have intent to do so won't bring you back to life. Acting like it does could help avoid such a scenario.

>You could equally pretend the sun sets every night because it's upset from shining all day.

No you couldn't.

The conversation ended prematurely because of your input. There is zero ambiguity on the relation.

But because you ascribe intent to Bing, you can predict (accurately) that saying nice things will not end the conversation.

LLMs act like they have intent. This is a matter of conceding they do or not. Conceding so is more useful because it has more accurate predictive power than the alternative. This becomes plain when LLMs are allowed more actions than just conversation.

>it has no value when actually studying neural networks.

On the contrary. Now you know that certain things are unnecessary to build a system that acts like it has intent.

>You could equally pretend the sun sets every night because it's upset from shining all day.

That would describe quite a lot of ancient religious beliefs about the Sun

They are extraordinarily complicated pure functions; to explore the entire space would take (lifetime of the universe) ^^ (lifetime of the universe), or some absurd quantity like that. (The operator is tetration.)

Further, what happens when you give an LLM a bank of long-term storage and a read-modify-write loop around it? A sufficiently advanced "modify" function would be more than enough to give rise to intent even in the broadest understanding of the word. GPT-4 class models could very well be advanced enough to give rise to a variety of higher-level behavior that previously we would only have ascribed to primate-class intelligence. If anyone really wants to advance the state of the art, you should figure out the best way to train a model with a read-modify-write loop, how to index into the storage, how to store "results", and so on.
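For what a read-modify-write loop around a pure function might look like, here is a minimal sketch. All names are hypothetical, and `toy_modify` is a trivial stand-in for the model call, not an actual LLM; the storage is just a dict.

```python
from typing import Dict, Iterable

def toy_modify(notes: str, observation: str) -> str:
    """Stand-in for the 'modify' step: a pure function of (memory, input).
    It appends the observation to the notes if it is not already there."""
    seen = notes.split(",") if notes else []
    if observation not in seen:
        seen.append(observation)
    return ",".join(seen)

def read_modify_write(store: Dict[str, str], key: str, observation: str) -> None:
    current = store.get(key, "")                # read
    updated = toy_modify(current, observation)  # modify (pure)
    store[key] = updated                        # write

def run_loop(observations: Iterable[str]) -> Dict[str, str]:
    """Drive the loop; the store persists across iterations, the function does not."""
    store: Dict[str, str] = {}
    for obs in observations:
        read_modify_write(store, "notes", obs)
    return store
```

The combined system (function plus store plus loop) is stateful even though `toy_modify` on its own is not, which is the distinction the thread keeps circling.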

I firmly believe that in the next 100 years we will have AI independence movements, with a high possibility of outright war, terrorism, etc. (Maybe AI will be better than humans at avoiding the use of violence.) In 20 years this trajectory will be supremely obvious.

Edited-- disagree about the timeline, ramifications, acts of war, or whatever, I really don't care. Seriously though, something like a read-modify-write loop is key. You can only build so complicated a function with only combinational logic gates. But just 64 bits of storage can produce sequences going beyond the life of the universe. Imagine an LLM paired with gigabytes+ of working memory/storage. It would easily be capable of moving about the virtual world with "intent".
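A back-of-the-envelope check on the 64-bit figure, as a sketch: how long a 64-bit counter takes to visit all of its states depends entirely on the assumed step rate. The rates below are illustrative assumptions, and the age of the universe is taken as roughly 13.8 billion years.

```python
STATES = 2 ** 64                       # distinct values of 64 bits of storage
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year
AGE_OF_UNIVERSE_YEARS = 1.38e10        # approximate

def years_to_cycle(steps_per_second: float) -> float:
    """Years needed to step through every 64-bit state once at the given rate."""
    return STATES / steps_per_second / SECONDS_PER_YEAR

slow = years_to_cycle(1.0)  # one step per second
fast = years_to_cycle(1e9)  # one step per nanosecond (assumed 1 GHz)
```

At one step per second the cycle is far longer than the age of the universe; at an assumed 1 GHz it is on the order of centuries. Either way, a handful of bits of state yields sequences no fixed combinational circuit of comparable size would enumerate.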

>Further, what happens when you give an LLM a bank of long-term storage and a read-modify-write loop around it?

You create a very different sort of system, for one. Saying that because doing that in just the right way could yield a system with intention, an LLM has intention is rather like saying that my refrigerator is a sandwich.

Of course the model has no intention. But it should be able to infer the user's intention by looking at the context of its prompts
Yes I pretty much stopped reading the article properly there. It starts by first redefining intentionality to be something LLMs can do, and then effectively has 18 paragraphs of flowery language recapitulating the definition they started with.

What LLMs do may happen to fit some technical definition of intentionality that has been previously explored, but that definition doesn't align with the actual debate that is going on about LLMs' abilities.

> but that definition doesn't align with the actual debate that is going on about LLMs abilities

Yes because the debate is nonsense.

Seeing output from GPT that demonstrates intelligence, reasoning, or whatever, and saying it is not real reasoning/Intelligence etc, is like looking at a plane soar and saying that the plane is fake flying. And this isn't, for anyone who thinks it is, a nature versus artificial thing either. The origin point is entirely arbitrary.

You could just as easily move the origin to Bees and say, "oh, birds aren't really flying". You could move it to planes and say, "oh, helicopters aren't really flying." It's a very meaningless statement.

The point most people seem to miss is that internal processes are entirely irrelevant. If you have a property you are interested in and a way to test for it, then the result of that test is what is important, not whether how it works at the arbitrary origin is exactly the same as how it works at point 2. In this case, it's even worse: since we do not know the internal processes of either LLMs or humans, the argument is really "oh, how I think the origin works is different from how I think point 2 works, so it isn't really flying".

When you say upsetting things to Bing Chat, you'll find the conversation prematurely ends.

Someone can cry all they want about how Bing isn't really upset, or how it doesn't really have the intention to end the chat, but those are evidently useless definitions, because the chat did end.

A definition that treats Bing as an intentful system is more accurate to reality and real consequences. It has the predictive power that the alternative does not.

Someday someone may find themselves stabbed and killed by an LLM-piloted robot because of something they said or did. Something that would predictably get someone killed by a system with "real" intent. So what, are you going to be raised from the dead because the LLM "wasn't really upset" or "didn't really have intent"? It obviously doesn't count, right?

> Seeing output from GPT that demonstrates intelligence, reasoning, or whatever, and saying it is not real reasoning/Intelligence etc, is like looking at a plane soar and saying that the plane is fake flying.

Something that really annoys me about ChatGPT is when it gives that canned lecture: "as a large language model, I don't have beliefs or opinions".

I think human mental states have two aspects: (1) the externally observable and (2) the internal. ChatGPT obviously has (1), in that sometimes it acts like it has (1), and acting like you have (1) is all it takes to have (1). Whether it also has (2) is really a philosophical question, which depends on your philosophy of mind. A panpsychist would say ChatGPT obviously has (2), because everything does. An eliminativist would say ChatGPT obviously doesn't have (2), because nothing does. Between those two extremes, various different positions in the philosophy of mind entail different criteria for determining whether (2) exists or not, and ChatGPT may or may not meet those criteria, depending on exactly what they are.

But, outside of philosophical contexts, we aren't really talking about (2), only (1). And ChatGPT really does have (1) – sometimes. So, ChatGPT is just being stupid and inconsistent when it denies it has opinions/beliefs/intentions/etc. But, it isn't ChatGPT's fault, OpenAI trained it to utter that nonsense.

> Someday someone may find themselves stabbed and killed by an LLM piloted robot because of something they said or did. Something that would predictably get someone killed by a system with "real" intent. So what, Are you going to be raised from the dead because the LLM "wasn't really upset" or "didn't really have intent" ?

In some ways that's exactly the point. The problem with ascribing intent is that it's a copout. If you say it behaves as if it has intent because it does have intent, you are letting off the hook the people behind the scenes who designed and built an "intent simulator" and let it loose. We have to distinguish this because it's the only way to accurately characterise the reality of where the decision-making power resides in controlling this behaviour.

>If you say it behaves as if it has intent because it does have intent, you are letting off the hook the people behind the scenes who designed and built an "intent simulator" and let it loose

Sure but we already regularly do this. I don't see parents going to jail for crimes the "intent simulator" they created and trained did.

>We have to distinguish this because it's the only way to accurately characterise the reality of the where the decision making power resides in controlling this behaviour.

We're just going to have to face reality here.

GPT is not Siri, a hardcoded parse tree system where any intent can only be ascribed to the person(s) who wrote it and not Siri itself.

GPT can be persuaded. It can be guided. It cannot be controlled. There is quite literally nothing OpenAI could actually do to completely prevent a GPT that can hold and use a knife from killing someone.