Hacker News
by drbig 98 days ago
The most interesting part is the realization that if the LLM's input is only the output of a (human) professional, then by definition the LLM cannot mimic the process that professional applied to get from whatever input they had to the output.

In other words, an LLM can spit out a plausible "output of X"; however, it cannot encode the process that led X to transform their inputs into their output.

5 comments

LLMs obviously aren't reproducing the internal cognitive process, but they might still capture some of the structural patterns that emerge from it.
Interestingly, there is some neuroscience research suggesting that the transformer architecture resembles "cue based retrieval" in the human brain in some important ways.

https://www.sciencedirect.com/science/article/pii/S0749596X2...

i don't get what the point of what you are saying is? i can ask it to explain how to solve an integral right now with steps.

i can ask it to tell me how to write like a person X right now.

"Explain how to solve" and "write like X" are crucially different tasks. One of them is about going through the steps of a process, and the other is about mimicking the result of a process.
Neural networks most certainly go through a process to transform input into output (even to mimic the results of another process), but it's a very different one from that of human neural networks. But I think this is the crucial point of the debate, essentially unchanged from Searle's "Chinese Room" argument from decades ago.

The person in that room, looking up a dictionary with Chinese phrases and patterns, certainly follows a process, but it's easy to dismiss the notion that the person understands Chinese. But the question is if you zoom out, is the room itself intelligent because it is following a process, even if it's just a bunch of pattern recognition?

but llm can do both. so what's the point?

can you give a specific example of what an llm can't do? be specific so we can test it.

like OP originally said, the LLM doesn't have access to the actual process of the author, only the completed/refined output.

Not sure why you need a concrete example to "test", but just think about the fact that the LLM has no idea how a writer brainstorms, iterates on their work, or even comes up with the ideas in the first place.

> has no idea how a writer brainstorms

This isn't true in general, and not even true in many specific cases, because a great many writers have described the process of writing in detail and all of that is in the models' training data. Claude and ChatGPT very much know how novels are written, and you can go into Claude Code and tell it you want to write a novel and it'll walk you through quite a lot of it -- worldbuilding, characters, plotting, timelines, etc.

It's very true that LLMs are not good at "ideas" to begin with, though.

Professional writer here. On our longer work, we go through multiple iterations, with lots of teardowns and recalibrations based on feedback from early, private readers, professional editors, pop culture -- and who knows. You won't find very clear explanations of how this happens, even in writers' attempts to explain their craft. We don't systematize it, and unless we keep detailed in-process logs (doubtful), we can't even reconstruct it.

It's certainly possible to mimic many aspects of a notable writer's published style. ("Bad Hemingway" contests have been a jokey delight for decades.) But on the sliding scale of ingenious-to-obnoxious uses for AI, this Grammarly/Superhuman idea feels uniquely misguided.

The distinction being made is the difference between intellectual knowledge and experience, not originality.

Imagine interviewing a particularly diligent new grad. They've memorized every textbook and best practices book they can find. Will that alone make them a senior+ developer, or do they need a few years learning all the ways reality is more complicated than the curriculum?

LLMs aren't even to that level yet.

> because a great deal of writers have described the process of writing in detail

And that's often inaccurate - just as much as asking startup founders how they came to be.

Part of it is forgotten, part of it they don't know how to describe, and part of it they don't want to tell you.

why not? datasets are not only finished works; there are datasets that capture the process, they're just available in smaller quantities.
Let's take the work of Raymond Carver as just one example. He would type drafts which would go through repeated iteration with a massive amount of hand-written markup, revision and excision by his editor.

To really recreate his writing style, you would need the notes he started with for himself, the drafts that never even made it to his editor, the drafts that did make it to the editor, all the edits made, and the final product, all properly sequenced and encoded as data.

In theory, one could munge this data and train an LLM and it would probably get significantly better at writing terse prose where there are actually coherent, deep things going on in the underlying story (more generally, this is complicated by the fact that many authors intentionally destroy notes so their work can stand on its own--and this gives them another reason to do so). But until that's done, you're going to get LLMs replicating style without the deep cohesion that makes such writing rewarding to read.

i don't buy this logic. if i have studied an author greatly i will be able to recognise patterns and be able to write like them.

ex: i read a lot of shakespeare, understand patterns, understand where he came from, his biography and i will be able to write like him. why is it different for an LLM?

i again don't get what the point is?

You will produce output that emulates the patterns of Shakespeare's works, but you won't arrive at them by the same process Shakespeare did. You are subject to similar limitations as the LLM in this case, just to a lesser degree (you share some 'human experience' with the author, and might be able to reason about their thought process from biographies and such).

As another example, I can write a story about hobbits and elves in a LotR world with a style that approximates Tolkien. But it won't be colored by my first-hand WW1 experiences, and won't be written with the intention of creating a world that gives my conlangs cultural context, or the intention of making a bedtime story for my kids. I will never be able to write what Tolkien would have written because I'm not Tolkien, and do not see the world as Tolkien saw it. I don't even like designing languages

You are not able to write like Shakespeare. Shakespeare isn't really even a great example of an "author" per se. Like anybody else you could get away with: "well I read a lot of Bukowski and can do a passable imitation" or "I'm a Steinbeck scholar and here's a description of his style." But not Shakespeare.

I get that you're into AI products and ok, fine. But no you have not "studied [Shakespeare] greatly" nor are you "able to write like [Shakespeare]." That's the one historical entity that you should not have chosen for this conversation.

This bot is likely just regurgitating bits from the non-fiction writing of authors like an animatronic robot in the Hall of Presidents. Literally nobody would know if the LLM was doing even a passable job of Truman Capote-ing its way through their half-written attempt at NaNoWriMo

You can understand his biography and analyses of how Shakespeare might have written. You can apply this knowledge to modify your writing process.

The LLM does not model text at this meta-level. It can only use those texts as examples; it cannot apply what is written there to its generation process.

>> i again don't get what the point is?

The point is that you don't become Jimi Hendrix or Eric Clapton even if you spend 20 years playing in a cover band. You can play the style and sound like them, but you won't create their next album.

Not being Jimi Hendrix or Eric Clapton is the context you are missing. LLMs are Cover Bands...

This is the plot of a short story of Borges’ called “Pierre Menard, the Author of Don Quixote.”
Actually this is the crux and the nuance which makes discussing LLM specifics a pain in the general space.

If you built an LLM exclusively on the writings and letters of John Steinbeck, you could NOT tell the LLM to solve an integral for you and expect it to be right.

Instead, what you will receive is a text that follows a statistically derived most likely (in accordance with the perplexity tuning) response to such a question.

> If you built an LLM exclusively on the writings and letters of John Steinbeck, you could NOT tell the LLM to solve an integral for you and expect it to be right.

Isn't this obvious? There is not enough latent knowledge of math there to enable current LLMs to approximate anything resembling an integral.

It's obvious to me.

It's obvious to you.

It isn't obvious to the person I am responding to, and it isn't obvious to the majority of individuals I speak with on the matter (which is why AI, personally, is in the bucket of religion and politics for polite conversation to simply avoid).

Wait -- I'm fairly certain this is obvious to the person you were responding to. It may not be obvious to a lay person (who may not even know LLMs are trained at all). But I think this is obvious to almost all people with even a small understanding of LLMs.
I'm actually pretty convinced they're a troll or at the very least a high-confrontation participant who is quick to move goal posts, ignore entire chains of logic, engage in ad hominem attacks on other posters, and is bringing zero novel insight anywhere in this thread.
It’s obvious to me. What point are you trying to make? It’s not religion; it’s easily falsifiable.

LLMs can reason about integrals as well as in a literature context. You suggested that if it’s not trained on literature then it can’t reason about it. But why does that matter?

Now what if we ask the LLM to write about social media? Do you think the output would be similar to what you'd get if we had a time machine to bring the actual man back and have him form his own thoughts firsthand?
It may be stylistically similar, but it's impossible to predict what the content would be.
>If you built an LLM exclusively on the writings and letters of John Steinbeck, you could NOT tell the LLM to solve an integral for you and expect it to be right.

this shows that you have very little idea of how llms work.

LLM that is trained only on john steinbeck will not work at all. it simply does not have the generalised reasoning ability. it necessarily needs inputs from every source possible including programming and maths.

You have completely ignored that LLMs have _generalised_ reasoning ability that they derive from disparate sources.

LLMs have the ability to convince you that they've "reasoned". Sometimes, an application will loop the output of an LLM back into its input to provide a "chain of reasoning".

This is not the same thing as reasoning.
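
For concreteness, the looping described above can be sketched roughly as follows. This is only an illustration: `toy_model` is a hypothetical stand-in for a real LLM call, and the `THOUGHT`/`ANSWER:` markers are made-up conventions, not any vendor's API.

```python
from typing import Callable


def chain_of_steps(model: Callable[[str], str], prompt: str, max_steps: int = 5) -> list[str]:
    """Feed the model's own output back in as input, collecting each step."""
    transcript = [prompt]
    for _ in range(max_steps):
        # Everything produced so far becomes the next input.
        context = "\n".join(transcript)
        step = model(context)
        transcript.append(step)
        if step.startswith("ANSWER:"):
            break
    return transcript


def toy_model(context: str) -> str:
    """Hypothetical stand-in for an LLM: two 'thoughts', then an answer."""
    prior_steps = context.count("\n")  # one newline per step already emitted
    if prior_steps < 2:
        return f"THOUGHT {prior_steps + 1}: still working"
    return "ANSWER: 42"


steps = chain_of_steps(toy_model, "What is 6 * 7?")
```

Whether the intermediate text counts as reasoning is exactly what this thread is arguing about; the sketch only shows the plumbing.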

LLMs are pattern matchers. If you trained an llm only to map some input to the output of John Steinbeck, then by golly that's what it'll be able to do. If you give it some input that isn't suitably like any of the input you gave it during training, then you'll get some unpredictable nonsense as output.

this is outdated stuff from 3 years ago.

> If you trained an llm only to map some input to the output of John Steinbeck

this is literally not possible because the llm does not get generalised reasoning ability. this is not a useful hypothetical because such an llm will simply not work. why do you think you have never seen a domain specific model ever?

if you wanted to falsify this claim: "llm's cant reason" how would one do that? can you come up with some examples that shows that it can't reason? what if we come up with a new board game with some rules and see if it can beat a human at it. just feed the rules of the game to it and nothing else.

here is gpt-5.4 solving never before seen mathematics problems: https://epochai.substack.com/p/gpt-54-set-a-new-record-on-fr...

you could again say its just pattern matching but then i would argue that its the same thing we are doing.

Domain-specific LLMs absolutely exist; don't assume I've never seen one. You seem very misinformed on what is "literally not possible".

https://www.ibm.com/think/topics/domain-specific-llm

Is the reason it can show steps for solving an integral because the training set contained webpages or books showing how to do it?
if we have steps for understanding any author's english and creative process (generally not specific to an author) would you agree then it is possible for an llm to do it?
The real sticking point for me is I don't even believe that authors themselves FULLY understand their process. The idea that anybody could achieve such full introspection as to understand and articulate every little thing that influences their output seems astoundingly improbable.
Repeating a process, yes for sure, even (pseudorandom?) variations on a process. Understanding a process is a different question, and I’m not sure how you would measure that.

In school we would have a test with various questions to show you understand the concept of addition, for example. But while my calculator can perfectly add any numbers up to its memory limit, it has no understanding of addition.

> while my calculator can perfectly add any numbers up to its memory limit, it has no understanding of addition.

"my calculator can perfectly add any numbers up to its memory limit" This kind of anthropomorphic language is misleading in these conversations. Your calculator isn't an agent so it should not be expected to be capable of any cognition.

It’s the degree of generalisability. And LLMs do have understanding. You can ask it how it came up with the process in natural language and it can help - something a calculator can’t do.
> And LLMs do have understanding.

They absolutely do not. If you "ask it how it came up with the process in natural language" with some input, it will produce an output that follows, because of the statistics encoded in the model. That output may or may not be helpful, but it is likely to be stylistically plausible. An LLM does not think or understand; it is merely a statistical model (that's what the M stands for!)

“i can ask it to give a text description of a linear logical math process that has been described in text countless times”

If you think “the tacit knowledge and conscious/subconscious reasoning mix that caused X to write like X” can be meaningfully captured by some 1-page “style guide” like llmtropes, I’m not sure what to tell you. Such a style description would be informed by a soup of reviewers that most certainly cannot write like X even with their stronger and more nuanced observations than what the LLM picked up.

Is it not possible for the process of input to output to be inferred by the llm and therefore applied to new inputs to create appropriate outputs?
Only if the LLM knows the inputs connected to particular outputs. Pre-digital-era or classified material might not be available, and neither might informal discussions with other experts.

Most importantly, negative but unused signals might not be available if the text does not mention them.

challenge: provide a single example where the LLM can only provide the output and not the steps? (in text scenario)
An LLM can always output steps, but that doesn’t mean they are true; LLMs are great at making up bullshit.

When the “how many ‘r’ in ‘strawberry’” question was all the rage, you could definitely get LLMs to explain the steps of counting, too. It was still wrong.
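
For reference, the ground truth those "steps" should have arrived at is trivial to check in ordinary code. A minimal sketch (the helper name `count_letter` is made up for illustration):

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of `letter` in `word`, one character at a time."""
    total = 0
    for ch in word:
        if ch == letter:
            total += 1
    return total


print(count_letter("strawberry", "r"))  # prints 3
```

The point of the comparison is that a plausible-looking explanation of counting and an actual count are separate things.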

can you provide a single example now with gpt 5.4 thinking that makes up things in steps? lets try to reproduce it.
I’m pretty sure you can think of one yourself, I’m not going to play this game. Now it’s 5.4 Thinking, before that it was 5.3, before that 5.2, 5.1, 5, before that it was 4… At every stage there’s someone saying “oh, the previous model doesn’t matter, the current one is where it’s at”. And when it’s shown the current model can’t do something, there’s always some other excuse. It’s a profoundly bad faith argument, the very definition of moving the goal posts.

I do have a number of examples to give you, but I no longer share those online so they aren’t caught and gamed. Now I share them strictly in person.

You've pinpointed the connection that people fail to make when they seek legal advice (or even information) from LLMs.
what prevents the input from being keystrokes and screen recordings of thousands of lawyers solving cases?
This makes the same error, or a related one. That input is not the lawyer's internal expert process, only the intermediate or (near-) final outcome of it.
Replace "LLM" with "student" and read that again. You don't just blindly give students output, you teach them, like what you are supposed to do with an LLM.
If you change the words in a sentence then it changes its meaning.
Yeah but obviously my point in this context is that it doesn't. It's not like I said to replace the word with "potato". Thanks for your genius comment.
It changes the meaning significantly. An LLM has very little in common with a human student.
How so? I see lots of similarities both in the training and inference/prompting stages.
There are some similarities, but they are absolutely overwhelmed by the differences. Having a handful of superficial similarities is not enough to draw a meaningful comparison. The act of teaching a human is very different from “training” an LLM because humans have the power of the whole brain and body, not just some information-integration part that the brain and LLMs may (or may not) share. Humans can be creative in ways that LLMs manifestly can’t be. Humans can act like mere token predictors, but we can (and routinely do) also transcend that, question it, play with it. LLMs can’t.
You can't "teach" an LLM. It can't think. It's a simple pattern-matching algorithm, basically just an Eliza bot with a huge table of phrases.
You're not thinking, just regurgitating catch phrases that are factually incorrect hallucinations. So how are you any better than an LLM?
Which part is "factually incorrect"?
Several parts of your claim are incorrect.

First, modern LLMs are not "a huge table of phrases". They are neural networks with billions of learned parameters that generate tokens by computing probability distributions over vocabulary given prior context. There is no lookup table of stored sentences.
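
To make the "probability distribution over the vocabulary" point concrete, here is a minimal sketch of a single sampling step. The four-word vocabulary and the scores are invented; in a real model the scores (logits) come out of the network given the prior context.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(vocab, logits, rng):
    """Draw one token according to the softmax probabilities."""
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]


vocab = ["the", "cat", "sat", "mat"]   # toy vocabulary (assumption)
logits = [2.0, 0.5, 0.1, -1.0]         # made-up scores, not from a real model
probs = softmax(logits)
token = sample_next_token(vocab, logits, random.Random(0))
```

Nothing here is a lookup of stored sentences: the next token is drawn from a computed distribution, one step at a time.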

Second, Eliza-style bots used explicit scripted pattern matching rules. LLMs instead learn statistical representations from large corpora and can generalize to produce novel sequences that were never present in the training data.

Kent Pitman's Lisp Eliza from MIT-AI's ITS History Project (sites.google.com):

https://news.ycombinator.com/item?id=39373567

https://sites.google.com/view/elizagen-org/

https://sites.google.com/view/elizagen-org/original-eliza
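
For contrast, scripted pattern matching in the Eliza style looks roughly like this. The three rules below are invented for illustration and are not taken from Weizenbaum's original script or from the Lisp version linked above.

```python
import re

# Hand-written (pattern, reply template) pairs: the whole "knowledge" of the bot.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmother\b", re.IGNORECASE), "Tell me more about your family."),
]
DEFAULT_REPLY = "Please go on."


def eliza_reply(text: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT_REPLY


print(eliza_reply("I need a vacation"))  # prints: Why do you need a vacation?
```

Anything outside the scripted patterns falls through to the canned default, which is the behaviour the comparison with learned models turns on.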

Third, while "pattern matching" is sometimes used informally, it’s misleading technically. Transformers perform high-dimensional vector computations and attention over context to model relationships between tokens. That’s very different from rule-based pattern matching.
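
A rough sketch of the attention computation mentioned above, for a single query vector, in plain Python. The vectors are toy values; real transformers use learned projections, many heads, and far larger dimensions, none of which is shown here.

```python
import math


def attention(query, keys, values):
    """Scaled dot-product attention for one query over a list of key/value vectors."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Normalise the scores into weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


keys = [[1.0, 0.0], [0.0, 1.0]]      # toy key vectors (assumption)
values = [[1.0, 0.0], [0.0, 1.0]]    # toy value vectors (assumption)
weights, output = attention([1.0, 0.0], keys, values)
```

The query here lines up with the first key, so the first value gets the larger weight; the mixing is continuous arithmetic over vectors, not a rule firing on a text pattern.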

You can certainly debate whether LLMs "think", but describing them as "Eliza with a big phrase table" is not an accurate description of how they work.

You have the resources available at your fingertips to learn what the truth is, how LLMs actually work. You could start with Wikipedia, or read Stephen Wolfram's article, or simply ask an LLM to explain how it works to you. It's quite good at that, while an Eliza bot certainly can't explain to you how it works, or even write code.

What Is ChatGPT Doing … and Why Does It Work?

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

Enough with this analogy. It's flawed on so many levels. First and foremost, stop devaluing humanity and hyping up AI companies by parroting their party line. Second, LLMs don't learn. They can hold a very limited amount of context, as you know. And every time you need to start over. So fuck no, "teaching" an LLM is nothing like teaching an actual human.
It all went south when we started to call it "learning" instead of "fitting parameters".
„Fitting“ is still too nice of a word choice, because it implies that it’s easy to identify the best solution.

I suggest „randomly adjusting parameters while trying to make things better“ as that accurately reflects the „precision“ that goes into stuffing LLMs with more data.

It was called learning already back when the field was called cybernetics and foundational figures like Shannon worked on this kind of stuff. People tried to decipher learning in the nervous system and implement the extracted principles in machines, such as Hebbian learning, the Perceptron algorithm, etc. This stuff goes back to the 40s/50s/60s, so things must have gone south pretty early then.
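
As a concrete example of the kind of learning rule from that era, here is a minimal sketch of the classic perceptron update, trained on logical AND. The function names, learning rate, and epoch count are illustrative choices, not taken from any historical implementation.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum exceeds zero, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Nudge weights toward the target whenever a prediction is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
results = [predict(weights, bias, x) for x, _ in AND_SAMPLES]
```

After training, the predictions match the AND truth table. The parameters are adjusted only in response to errors, which is why "learning" and "fitting parameters" describe the same loop here.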
I agree with ya so much. I have seen so many people, even on Hacker News, somehow give human qualities to LLMs.

This Grammarly thing seems to be a bastardized form of that not even sparing the dead.

I'd say that there was some incentive for the AI companies to muddy the waters here.

> very limited amount of context

This isn't 2023 anymore

absolutely they can learn. you are being emotional and the original point is correct.

i give the LLM my codebase and it indeed learns about it and can answer questions.

That isn't learning. It can read things in its context and generate materials to assist in answering further prompts, but that doesn't change the model weights. It is just updating the context.

Unless you are actually fine tuning models, in which case sure, learning is taking place.
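
A deliberately crude sketch of that distinction. `ToyModel` is hypothetical and a dictionary is nothing like real model weights; the only point illustrated is which state persists between calls.

```python
class ToyModel:
    """Hypothetical stand-in: `weights` persists between calls, `context` does not."""

    def __init__(self):
        self.weights = {"capital of France": "Paris"}

    def answer(self, question, context=None):
        # Reading the context does not modify self.weights.
        for known_question, known_answer in (context or []):
            if known_question == question:
                return known_answer
        return self.weights.get(question, "unknown")

    def fine_tune(self, examples):
        # Fine-tuning changes the stored parameters, so the effect persists.
        self.weights.update(examples)


model = ToyModel()
fact = [("build command", "make all")]
in_context = model.answer("build command", context=fact)  # answered from context
later = model.answer("build command")                     # context is gone
model.fine_tune(dict(fact))
after_tuning = model.answer("build command")              # now persistent
```

Whether the first case deserves the word "learning" is the disagreement in this subthread; the sketch only pins down what does and does not change.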

i don't know why you think it matters how it works internally. whether it changes its weights or not is not important. does it behave like a person who learns a thing? yes.

if i showed a human a codebase and asked them questions with good answers - yes i would say the human learned it. the analogy breaks at a point because of limited context but learning is a good enough word.

Maybe because I work on a legacy programming language with far less material in the training data? For me it makes a difference because it partly needs to "learn" the language itself and have that in the context, along with codebase specific stuff. For something with the model already knowing the language and only needing codebase specific stuff it might feel different.