by docjay 128 days ago
A fun and insightful read, but the idea that it isn’t “just a prompting issue” is objectively false, and I don’t mean that in the “lemme show you how it’s done” way. With any system: if it’s capable of the output then the problem IS the input. Always. That’s not to say it’s easy or obvious, but if it’s possible for the system to produce the output then it’s fundamentally an input problem. “A calculator will never understand the obesity epidemic, so it can’t be used to calculate the weight of 12 people on an elevator.”

> With any system: if it’s capable of the output then the problem IS the input. Always. [...] if it’s possible for the system to produce the output then it’s fundamentally an input problem.

No, that isn't true. I can demonstrate it with a small (and deterministic) program which is obviously "capable of the output":

    def predict_coin_flip(player_prayer):
        if len(player_prayer) % 2 == 0:
            return "Heads"
        else:
            return "Tails"
Is the "fundamental problem" here "always the input"? Heck no! While a user could predict all coin-tosses by providing "the correct prayers" from some other oracle... that's just, shall we say, algorithm laundering: Secretly moving the real responsibility to some other system.

There's an enormously important difference between "output which happens to be correct" versus "the correct output from a good process." Such as, in this case, the different processes of world models.

I think you may believe what I said was controversial or nuanced enough to be worthy of a comprehensive rebuttal, but really it’s just an obvious statement when you stop to think about it.

Your code is fully capable of the output I want, assuming that’s one of “heads” or “tails”, so yes that’s a succinct example of what I said. As I said, knowing the required input might not be easy, but we KNOW it’s possible to do exactly what I want and we KNOW that it’s entirely dependent on me putting the right input into it, then it’s just a flat out silly thing to say “I’m not getting the output I want, but it could do it if I use the right input, thusly input has nothing to do with it.” What? If I wanted all heads I’d need to figure out “hamburgers” would do it, but that’s the ‘input problem’ - not “input is irrelevant.”

This reads like "if we have the solution, then we have the solution". If I can model the system required to condition inputs such that outputs are desirable, haven't I given the model the world model it required? More to the point, isn't this just what the article argues? Scaling the model cannot solve this issue.

it's like saying a pencil is a portrait drawing device, as if it isn't the artist who makes it a portrait drawing device, whereas in the hands of a poet it's a poem generating machine.

So much of what you said is exactly what I’m saying that it’s pointless to quote any one part. Your ‘pencil’ analogy is perfect! Yes, exactly. Follow me here:

We know that the pencil (system) can write a poem. It’s capable.

We know that whether or not it produces a poem depends entirely on the input (you).

We know that if your input is ‘correct’ then the output will be a poem.

“Duh” so far, right? Then what sense does it make to write something with the pencil, see that it isn’t a poem, then say “the input has nothing to do with it, the pencil is incapable.” ?? That’s true of EVERY system where input controls the output and the output is CAPABLE of the desired result. I said nothing about the ease by which you can produce the output, just that saying input has nothing to do with it is objectively not true by the very definition of such a system.

You might say “but gee, I’ll never be able to get the pencil input right so it produces a poem”. Ok? That doesn’t mean the pencil is the problem, nor that your input isn’t.

> a comprehensive rebuttal [...] an obvious statement

I think what you said is obviously false, such that I spent a while trying to figure out if you'd accidentally typo'ed an is/isn't that flipped the logic. (If that did happen, now is a good time to check!) Any "comprehensiveness" comes from the awkward task of trying to unpack and explain things that seem like they ought to be intuitive.

> but that’s the ‘input problem’ - not “input is irrelevant.”

Not sure where that last phrase comes from, but it's a "this algorithm is bad" problem. The input is at best a secondary contributing factor.

You can manually brute-force inputs until an algorithm spits out a pre-chosen output--especially if you exploit a buffer overflow--but that doesn't mean the original algorithm is any good. I think this is already illustrated by the bad implementation of predict_coin_flip().
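A minimal sketch of that brute-force point, assuming the same parity rule as predict_coin_flip() above. The `find_prayer` helper and the `actual_flips` list are hypothetical, made up for illustration. The search only "works" because it is handed the real outcomes first, so the predictive work happens outside the function.

```python
import itertools
import string

def predict_coin_flip(player_prayer):
    # Same rule as above: only the parity of the prayer's length matters.
    return "Heads" if len(player_prayer) % 2 == 0 else "Tails"

def find_prayer(desired, max_len=3):
    """Hypothetical helper: try short lowercase strings until one yields `desired`."""
    for length in range(max_len + 1):
        for chars in itertools.product(string.ascii_lowercase, repeat=length):
            prayer = "".join(chars)
            if predict_coin_flip(prayer) == desired:
                return prayer
    return None

# Made-up outcomes. Note that the search needs to know them in advance.
actual_flips = ["Heads", "Tails", "Tails", "Heads"]
prayers = [find_prayer(flip) for flip in actual_flips]
predictions = [predict_coin_flip(p) for p in prayers]
print(prayers, predictions)
```

Every "prediction" matches, but only because `find_prayer` was told the answer before it started searching.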

> Your code is fully capable of the output I want

This is even more capable, so therefore it must be even better, right?

    def decrypt_the_secret(encrypted, something):
        return something(encrypted)
Alas, this is just a more-blatant version of what I labeled as algorithmic laundering. It's bad code that doesn't do the job it's intended to do. Its quality is bad no matter how clever other code is that provides `something`.
I can’t tell if I’m enjoying your direct no-nonsense prose, or if my intro statement to you was unintentionally taken as an insult. To hedge, I wasn’t smirking at the effort you put into your rebuttal. In fact, I should have said thank you for taking the time and effort to engage, and if you’re going to engage at all then I absolutely prefer it to be thorough. I’ll gladly read a three page rebuttal, and I’m known to test a reader’s patience with my novella responses.

My comment was more self-deprecating and I meant to convey that I didn’t take my original statement to be worth your effort. Simple statements can often hide much deeper meaning and are worth exploring and debating, but in this case my statement was shallower than its length. I thought it was a tautology more than a conjecture. Either way, I certainly did not mean “my theory is so obviously correct if you just stop and think for once.” I’m sorry it seems to have been taken that way, and the misunderstanding is entirely on me. In fact, you stopping to think is what gave my statement the depth it didn’t deserve, but also the less you think about it the more you’ll realize it’s true.

Step away from language models and algorithms for a moment and I’ll clean up my statement:

“When a system is capable of producing correct results, and those results are determined by what you feed it, fault lies with what you fed it.“

or exactly equivalent but blatantly:

“If your system can do it, and your system does what you tell it, then you told it wrong.”

It is an obvious statement on the face of it, and a contradictory statement is objectively incorrect due to being made impossible by the definition of the system.

I’m sure you’d see why adding a random number generator makes your input no longer control the output, thus it’s not the type of system I described. However, the “hamburgers” function very much IS this kind of system. Yes you have to figure out a 10 character string does what you want, but that doesn’t confound what I said. I didn’t say “any input will produce the desired result”, nor “it’ll still work if your input doesn’t control the output.”

Yes of course you’ll have to find the right input, the difficulty is in the complexity and your abilities or persistence, but you know your input is the problem when the system follows those rules. Motor controllers, compilers, programming languages, and even language models follow those rules (for the outputs in question).

Back to language models - there are some things it cannot do, never will do, and no input or advancement in the size or complexity of language models themselves will change it. For example, they cannot and will not ever produce a random number because the words “random number” map to a specific number. Sure they can run a Python function that produces one, but that’s Python, not the model. Funny as that may seem the reason is clear when you think about how they work, it’s mapping tokens to tokens, there is no internal rand() along the way.

Here’s what you get at temperature 1.0 from Opus 4.5 asked 200 times:

Reply with a random number between 1-1,000,000. No meta, no commentary; number only.

'847293': 131, '742,891': 30, '742851': 13, '742891': 5, '742,856': 4, '742856': 4, '742,851': 2, '742853': 2, '742,831': 2, '742819': 2

That combination of tokens results in a “random number” that’s usually 847293. Funny. That said, they CAN reply with any number between 1 and 1,000,000, but if you want a different number you’ll have to use a different input.
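For anyone who wants to poke at those counts, here is a rough tally, assuming the listing above is the ten most frequent replies out of the 200 requests. Merging the comma and no-comma variants is my own choice, not part of the original test.

```python
# Counts copied from the listing above; 200 requests in total.
counts = {
    "847293": 131, "742,891": 30, "742851": 13, "742891": 5, "742,856": 4,
    "742856": 4, "742,851": 2, "742853": 2, "742,831": 2, "742819": 2,
}
total_requests = 200

# Treat "742,891" and "742891" as the same number.
merged = {}
for reply, n in counts.items():
    number = int(reply.replace(",", ""))
    merged[number] = merged.get(number, 0) + n

listed = sum(counts.values())  # replies covered by the listing
top_number, top_count = max(merged.items(), key=lambda kv: kv[1])
top_share = top_count / total_requests

print(f"{listed} of {total_requests} replies listed, {len(merged)} distinct numbers")
print(f"most common: {top_number} ({top_share:.1%})")
```

The listed entries cover 195 of the 200 replies and collapse to 7 distinct numbers, with the top one alone accounting for 65.5% of all requests.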

This article, (https://michaelmangialardi.substack.com/p/the-celestial-mirr...), came to similar conclusions as the parent article, and includes some tests (e.g. https://colab.research.google.com/drive/1kTqyoYpTcbvaz8tiYgj...) showing that LLMs, while good at understanding, fail at intellectual reasoning. The fact that they often produce correct outputs has more to do with training and pattern recognition than ability to grasp necessity and abstract universals.
They neither understand nor reason. They don’t know what they’re going to say, they only know what has just been said.

Language models don’t output a response, they output a single token. We’ll use token==word shorthand:

When you ask “What is the capital of France?” it actually only outputs: “The”

That’s it. Truly, that IS the final output. It is literally a one-way algorithm that outputs a single word. It has no knowledge or memory, and it doesn’t know what’s next. As far as the algorithm is concerned it’s done! It outputs ONE token for any given input.

Now, if you start over and put in “What is the capital of France? The” it’ll output “ “. That’s it. Between your two inputs were a million others, none of them have a plan for the conversation, it’s just one token out for whatever input.

But if you start over yet again and put in “What is the capital of France? The “ it’ll output “capital”. That’s it. You see where this is going?

Then someone uttered the words that have built and destroyed empires: “what if I automate this?” And so it was that the output was piped directly back into the input, probably using AutoHotKey. But oh no, it just kept adding one word at a time until it ran out of memory. The technology got stuck there for a while, until someone thought “how about we train it so that <DONE> is an increasingly likely output the longer the loop goes on? Then, when it eventually says <DONE>, we’ll stop pumping it back into the input and send it to the user.” Booya, a trillion dollars for everyone but them.
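The loop described above can be sketched in a few lines. This is a toy, not a real model: `next_token` is a hypothetical stand-in that looks the text up in a fixed script, and every token after "capital" is made up for illustration. What it does show is the shape of the thing: a stateless function called once per token, with the output glued back onto the input until `<DONE>` appears.

```python
PROMPT = "What is the capital of France?"
# Made-up continuation; tokens carry their own leading space for simplicity.
SCRIPT = [" The", " capital", " is", " Paris", ".", "<DONE>"]

def next_token(text):
    """Stateless stand-in for the model: one token out, based only on `text`."""
    so_far = PROMPT
    for token in SCRIPT:
        if text == so_far:
            return token
        so_far += token
    return "<DONE>"  # unrecognized input: stop

def generate(prompt, max_tokens=20):
    """Feed each output back into the input until <DONE> or the token limit."""
    text = prompt
    for _ in range(max_tokens):
        token = next_token(text)
        if token == "<DONE>":
            break
        text += token
    return text[len(prompt):]

print(generate(PROMPT))
```

Calling `next_token(PROMPT)` twice returns the same token both times; nothing is remembered between calls except what the loop writes back into the text.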

It’s truly so remarkable that it gets me stuck in an infinite philosophical loop in my own head, but seeing how it works the idea of ‘think’, ‘reason’, ‘understand’ or any of those words becomes silly. It’s amazing for entirely different reasons.

Yes, LLMs mimic a form of understanding partly through the way language embeds concepts that are preserved when embedded geometrically in vector space.
Your continued use of the word “understanding” hints at a lingering misunderstanding. They’re stateless one-shot algorithms that output a single word regardless of the input. Not even a single word, it’s a single token. It isn’t continuing a sentence or thought it had, you literally have to put it into the input again and it’ll guess at the next partial word.

By default that would be the same word every time you give the same input. The only reason it isn’t is because the fuzzy randomized selector is cranked up to max by most providers (temp + seed for randomized selection), but you can turn that back down through the API and get deterministic outputs. That’s not a party trick, that’s the default of the system. If you say the same thing it will output the same single word (token) every time.
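Roughly what the temperature knob does, as a sketch with made-up scores. The `logits` values and the `sample_token` helper are assumptions for illustration, not any provider's actual API. At temperature 0 the pick is a plain argmax, so the same input always gives the same token. Above 0 the pick is a weighted random draw, which is repeatable only if the seed is fixed.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick one token from a {token: score} dict."""
    if temperature == 0:
        # Greedy: always the highest-scoring token.
        return max(logits, key=logits.get)
    # Softmax over temperature-scaled scores (shifted by the max for stability).
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    tokens = list(scaled)
    weights = [math.exp(scaled[tok] - peak) for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

logits = {"The": 2.0, "Paris": 1.5, "It": 0.5}  # made-up scores

greedy = {sample_token(logits, 0, random.Random(seed)) for seed in range(50)}
sampled = {sample_token(logits, 1.0, random.Random(seed)) for seed in range(50)}
print(greedy, sampled)
```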

You see the aggregate of running it through the stateless algorithm 200+ times before the collection of one-by-one guessed words are sent back to you as a response. I get it, if you think that was put into the glowing orb and it shot back a long coherent response with personality then it must be doing something, but the system truly only outputs one token with zero memory. It’s stateless, meaning nothing internally changed, so there is no memory to remember it wants to complete that thought or sentence. After it outputs “the” the entire thing resets to zero and you start over.

I'm using the Aristotelian definition of my linked article. To understand a concept you have to be able to categorize it correctly. LLMs show strong evidence of this, but it is mostly due to the fact that language itself preserves categorical structure, so when embedded in geometrical space by statistical analysis, it happens to preserve Aristotelian categories.
isn't intellectual reasoning just pattern recognition + a forward causal token generation mechanism?
You can replicate an LLM:

You and a buddy are going to play “next word”, but it’s probably already known by a better name than I made up.

You start with one word, ANY word at all, and say it out loud, then your buddy says the next word in the yet unknown sentence, then it’s back to you for one word. Loop until you hit an end.

Let’s say you start with “You”. Then your buddy says the next word out loud, also whatever they want. Let’s go with “are”. Then back to you for the next word, “smarter” -> “than” -> “you” -> “think.”

Neither of you knew what you were going to say, you only knew what was just said so you picked a reasonable next word. There was no ‘thought’, only next token prediction, and yet magically the final output was coherent. If you want to really get into the LLM simulation game then have a third person provide the first full sentence, then one of you picks up the first word in the next sentence and you two continue from there. As soon as you hit a breaking point the third person injects another full sentence and you two continue the game.

With no idea what either of you are going to say and no clue about what the end result will be, no thought or reasoning at all, it won’t be long before you’re sounding super coherent while explaining thermodynamics. But one of the rounds someone’s going to mess it up, like “gluons” -> “weigh” -> “…more?…” -> “…than…(damnit Gary)…” but you must continue the game and finish the sentence, then sit back and think about how you just hallucinated an answer without thinking, reasoning, understanding, or even knowing what you were saying until it finished.

that's not how llms work. study the transformer architecture. every token is conditioned not just on the previous token, but each layer's activation generates a query over the kv cache of the previous activations, which means that each token's generation has access to any higher order analytical conclusions and observations generated in the past. information is not lost between the tokens like your thought exercise implies.
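a rough sketch of what "a query over the kv cache" means, with one head, two dimensions and made-up numbers. real models do this per layer and per head with learned projections, so the `attend` function here is only an illustration of the mechanism.

```python
import math

def attend(query, cached_keys, cached_values):
    """Scaled dot-product attention of one query over every cached position."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in cached_keys]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(cached_values[0])
    output = [sum(w * v[i] for w, v in zip(weights, cached_values)) for i in range(dim)]
    return weights, output

# Made-up keys and values left behind by three earlier tokens.
cached_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cached_values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [0.0, 4.0]  # made-up query for the token being generated

weights, output = attend(query, cached_keys, cached_values)
print(weights, output)
```

every cached position gets a nonzero weight, so the new token's computation draws on activations from all earlier tokens, not just the last one.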
“The cow goes ‘mooooo’”

“that’s not how cow work. study bovine theory. contraction of expiratory musculature elevates abdominal pressure and reduces thoracic volume, generating positive subglottal pressure…”

Obviously not. In actual thinking, we can generate an idea, evaluate it for internal consistency and consistency with our (generally much more than linguistic, i.e. may include visual imagery and other sensory representations) world models, decide this idea is bad / good, and then explore similar / different ideas. I.e. we can backtrack and form a branching tree of ideas. LLMs cannot backtrack, do not have a world model (or, to the extent they do, this world model is solely based on token patterns), and cannot evaluate consistency beyond (linguistic) semantic similarity.
There's no such thing as a "world model". That is metaphor-driven development from GOFAI, where they'd just make up a concept and assume it existed because they made it up. LLMs are capable of approximating such a thing because they are capable of approximating anything if you train them to do it.

> or, to the extent they do, this world model is solely based on token patterns

Obviously not true because of RL environments.

> There's no such thing as a "world model"

There obviously is in humans. When you visually simulate things or e.g. simulate how food will taste in your mind as you add different seasonings, you are modeling (part of) the world. This is presumably done by having associations in our brain between all the different qualia sequences and other kinds of representations in our mind. I.e. we know we do some visuospatial reasoning tasks using sequences of (imagined) images. Imagery is one aspect of our world model(s).

We know LLMs can't be doing visuospatial reasoning using imagery, because they only work with text tokens. A VLM or other multimodal might be able to do so, but an LLM can't, and so an LLM can't have a visual world model. They might in special cases be able to construct a linguistic model that lets them do some computer vision tasks, but the model will itself still only be using tokenized words.

There are all sorts of other sensory modalities and things that humans use when thinking (i.e. actual logic and reasoning, which goes beyond mere semantics and might include things like logical or other forms of consistency, e.g. consistency with a relevant mental image), and the "world model" concept is supposed, in part, to point to these things that are more than just language and tokens.

> Obviously not true because of RL environments.

Right, AI generally can have much more complex world models than LLMs. An LLM can't even handle e.g. sensor data without significant architectural and training modification (https://news.ycombinator.com/item?id=46948266), at which point, it is no longer an LLM.

> When you visually simulate things or e.g. simulate how food will taste in your mind as you add different seasonings, you are modeling (part of) the world.

Modeling something as an action is not "having a world model". A model is a consistently existing thing, but humans don't construct consistently existing models because it'd be a waste of time. You don't need to know what's in your trash in order to take the trash bags out.

> We know LLMs can't be doing visuospatial reasoning using imagery, because they only work with text tokens.

All frontier LLMs are multimodal to some degree. ChatGPT thinking uses it the most.

"LLMs cannot backtrack". This is exactly wrong. LLMs always see everything in the past. In this sense they are more efficient than Turing machines, because (assuming sufficiently large context length) every token sees ALL previous tokens. So, in principle, an LLM could write a bunch of exploratory shit, and then add a "tombstone" "token" that can selectively devalue things within a certain timeframe -- aka just the exploratory things (as judged by RoPE time), and thus "backtrack".

I put "token" in quotes because this would obviously not necessarily be an explicit token, but it would have to be learned group of tokens, for example. But who knows, if the thinking models have some weird pseudo-xml delimiters for thinking, it's not crazy to think that an LLM could shove this information in say the closer tag.

> "LLMs cannot backtrack". This is exactly wrong.

If it wasn't clear, I am talking about LLMs in use today, not ultimate capabilities. All commercial models are known (or believed) to be recursively applied transformers without e.g. backspace or "tombstone" tokens, like you are mentioning here.

But yes, absolutely LLMs might someday be able to backtrack, either literally during token generation if we allow e.g. backspace tokens (there was at least one paper that did this) or more broadly at the chain of thought level, with methods like you are mentioning.

a tombstone "token" doesn't have to be an actual token, nor does it have to be explicitly carved out in the tokenizer. it can be learned. unless you have looked into the activations of a SOTA llm you can't categorically say that one (or 80% of one, for example) doesn't exist.
But that's only true if the system is deterministic?

And in an LLM, the size of the inputs is vast and often hidden from the prompter. It is not something that you have exact control over in the way that you have exact control over the inputs that go into a calculator or into a compiler.

a system that copies its input into the output is capable of any output, no?
That would depend - is the input also capable of anything? If it’s capable of handling any input, and as you said the output will match it, then yes of course it’s capable of any output.

I’m not pulling a fast one here, I’m sure you’d chuckle if you took a moment to rethink your question. “If I had a perfect replicator that could replicate anything, does that mean it can output anything?” Well…yes. Derp-de-derp? ;)

It aligns with my point too. If you had a perfect replicator that can replicate anything, and you know that to be true, then if you weren’t getting gold bars out of it you wouldn’t say “this has nothing to do with the input.”

It doesn't align with your point

My point is that your reasoning is too reductive - completely ignoring the mechanics of the system - and you claim the system is capable of _anything_ if prompted correctly. You wouldn't say the replicator system is capable of the reasoning outlined in the article, right?