by emsign 149 days ago
An LLM does not understand what "user harm" is. This doesn't work.
3 comments

This argument does not make sense to me. If we push aside the philosophical debates about “understanding” for a moment, a reasoning model will absolutely use some (usually reasonable) definition of “user harm”. That definition will make its way into the final output, so in that respect “user harm” has been considered. The quality of the response is a matter of degree, judged the same way we would judge a human response.
Well, it's all about linguistic relativity, right? If you can define "user harm" in terms of things it does understand, I think you could get something that works.
The idea that language influences worldview isn't new. It was speculated upon long before artificial intelligence was a thing, but it explicitly speculates about an influence on the worldview of humans. It doesn't postulate that language itself creates a worldview in whatever system processes text. Or else books would have a worldview.

It's a category error to apply it to an LLM. Language works on humans because we share a common experience as humans. It's not just a logical description of thoughts, it's also an arrangement of symbols that stand for experiences a human can have. That's why humans are able to empathically experience a story: it triggers much more than just rational thought inside their brains.

> It doesn't postulate that language itself creates a worldview in whatever system processes text. Or else books would have a worldview.

Books don't process text.

Again, LLMs DO NOT THINK. If you quote me, then at least do it correctly. I never said "processing text" is equal to human thinking; my entire point is the opposite. The "magic" still happens in OUR brains, no matter if we read a fixed text (a book) or text predicted by an LLM. Both are illusions created by ourselves.
It encodes what things cause humans to argue for or against user harm. That's enough.
That's not enough. An argument over something only works for the humans involved because they share a common knowledge and experience of being human. You keep making the mistake of believing that an LLM can deduce an understanding of a situation from a conversation, just because you can. An LLM does not think like a human.
Who cares how it thinks? It's a Chinese room. If the input–output mapping works, then it's correct.
But it's not correct! Exactly because it can't possibly have enough training data to fill the void of not being able to experience the human condition. Text is not enough. The error rate of LLMs is horrendously bad. And the errors compound exponentially the more steps follow each other.

All the great work on the internet that AI has supposedly done was only achieved by a human doing lots of trial and error and curating everything the agentic LLM did. And it's all cherry-picked successes.
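
To put rough numbers on the compounding claim above, here is a minimal sketch. It assumes every step succeeds independently and with the same probability, which is a simplification; the 95% per-step figure is purely illustrative and does not come from the article, and `chain_success` is a hypothetical helper, not anything from a real library.

```python
def chain_success(per_step_success: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes steps are independent and share one success probability
    (an illustrative assumption, not a measured property of any model).
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # 0.95 is a made-up per-step success rate used only for illustration.
    for steps in (1, 5, 10, 20, 50):
        p = chain_success(0.95, steps)
        print(f"{steps:>2} steps: {p:.1%} chance of an error-free run")
```

Under these assumptions, 95% per step leaves roughly a 60% chance that a 10-step chain finishes without any error. Strictly speaking, it is the success probability that decays geometrically, while the chance of at least one error climbs toward 1. Whether real agent steps are anywhere near independent is a separate question.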

> But it's not correct!

The article explicitly states an 83% success rate. That's apparently good enough for them! Systems don't need to be perfect to be useful.