by farleykr 490 days ago
I might be getting overly philosophical here but I'd say it's because they truly don't know anything at all (as opposed to knowing some things but not others). To be able to say "I don't know" you have to first "know" on a deeper level that there is a fundamental true or correct answer to a question and that you are disconnected from it.

Well, yes. "AI" skips over all the difficulties and contradictions of philosophy, all the challenges of working out what it means to know something, things like "justified true belief" and so on. It 'just' (!) uses a probabilistic model to emit strings of text. It's basically a super-pundit. It can predict conventional wisdom really well.

But fundamentally it's trapped on the wrong side of a glass jar. It can't kick stones like Samuel Johnson. https://en.wikipedia.org/wiki/Appeal_to_the_stone

True, no argument there. What fascinates me more is why people continue to think we can teach a chatbot how to recognize what's true and give us answers that we can't find for ourselves. At best a chatbot is going to be a tool that enables us to gain insights we didn't have before the same way a dictionary can "teach" you words you didn't know before.

I think the idea of using technology to solve life's ultimate conundrums has long since jumped the shark and veered into the area of religious belief. People are literally putting their faith in AI even if they wouldn't use religious vocabulary to label and define it as such.

I think some of this is the ultimate logical conclusion of postmodernism/deconstruction. In the early 20th century people were a lot more confident about the possibility of finding absolute truth; at the end of it this had completely dissipated into uncertainty and relativism.

The "Sokal Hoax" was a 90s experiment in which a physicist created a fake paper and submitted it to a cultural studies journal. He did not base his paper on anything he would have considered "true", but rather on a desire to make it look as much like a valid text as possible. This is a simplified version of how the LLM training/scoring process works. Nowadays everyone is having to deal with the same kind of thing done by LLM users. It's the perfect technology for non-rigorous academia.

I don't think it's overly philosophical to point out that these are large language models, not truth engines or AGI or knowledge directories. They're not using logic to reason their way to an answer. They're just predicting the next word that would sound like part of a human answer.
Fair enough. I think a lot of people are going to end up blindly trusting AI because it's right often enough. But for those who are interested in what it really means to know something, I wonder if this will push people back towards embracing the idea that there is fundamental, objective, knowable truth at the core of the universe even if we can't ever know that truth perfectly.
> They're not using logic to reason their way to an answer. They're just predicting the next word that would sound like part of a human answer.

OpenAI claims recent models are actually reasoning to some extent.

They're just outputting tokens that resemble a reasoning process. The underlying tech is still the same LLM it always has been.

I can't deny that doing it that way improves results, but any model could do the same thing if you add extra prompts to encourage the reasoning process, then use that as context for the final solution. People discovered that trick before "reasoning" models became the hot thing. It's the "Work it out step by step" trick but in a dedicated fine-tune.
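For illustration, the prompting trick described above might look roughly like the following sketch. The `generate` callable is a hypothetical stand-in for whatever model call you have, and the prompt wording is an assumption, not any vendor's actual template.

```python
from typing import Callable


def answer_with_reasoning(question: str, generate: Callable[[str], str]) -> str:
    """Two-pass prompting: ask for step-by-step reasoning first,
    then feed that reasoning back as context for the final answer.

    `generate` is a hypothetical text-in, text-out model call.
    """
    # Pass 1: encourage the model to write out its reasoning.
    reasoning_prompt = f"Question: {question}\nWork it out step by step."
    reasoning = generate(reasoning_prompt)

    # Pass 2: reuse the reasoning as context for the final solution.
    final_prompt = (
        f"Question: {question}\n"
        f"Reasoning so far:\n{reasoning}\n"
        "Based on the reasoning above, give the final answer."
    )
    return generate(final_prompt)


def fake_model(prompt: str) -> str:
    """Canned stand-in so the sketch runs without a real model."""
    if "final answer" in prompt:
        return "4"
    return "2 + 2 means adding two and two, which gives four."


result = answer_with_reasoning("What is 2 + 2?", fake_model)
```

With a real model, `fake_model` would be swapped for an actual completion call; the structure of the two passes is the only point here.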

> They're just outputting tokens that resemble a reasoning process.

Looking at one such process of emulating reasoning (I've got deepseek-70B running locally), I'm starting to wonder how that differs from actual reasoning. We "think" about something, may make errors in that thinking, look for things that don't make sense, and correct ourselves. That "think" step is still a black box.

I asked that LLM a typical question about gas exchange between containers; it made some errors and noticed some calculations that didn't make sense:

> Moles left A: ~0.0021 mol

> Moles entered B: ~0.008 mol

> But 0.0021 +0.008=0.0101 mol, which doesn't make sense because that would imply a net increase of moles in the system.

Well, that's a totally invalid calculation; it should be "-" in there. It also noticed elsewhere that those two quantities should be the same.

Eventually, after 102 minutes and 10141 tokens, involving checking answers from different angles multiple times, it output an approximately correct response.
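To make the sign error concrete, here is a small sketch of the bookkeeping using the two figures quoted from the model's output. It assumes a closed two-container system in which everything that leaves A enters B; the function name and the tolerance are made up for illustration.

```python
def net_change_in_moles(left_a: float, entered_b: float) -> float:
    """Net change in total moles for a closed two-container system.

    Gas leaving A counts as a loss and gas entering B as a gain,
    so the two terms are subtracted, not added.
    """
    return entered_b - left_a


# Figures quoted from the model's intermediate output.
left_a = 0.0021
entered_b = 0.008

naive_sum = left_a + entered_b                 # the model's invalid "+" step
net = net_change_in_moles(left_a, entered_b)   # the "-" version

# In a closed system the net change should be zero (assumed tolerance).
conserved = abs(net) < 1e-9
```

With these inputs the net change is not zero, which is the same signal that the two quantities should have matched and that at least one intermediate value was off.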

Only by conveniently redefining the word.

Instead, they predict the next tokens of a "think out loud" example, and wrap it up with a "conclusion and summary" example.

It doesn't know why this writing pattern is the semantic space it is exploring: it has simply been set up to do so in the first place.

Does it matter if it doesn't know why this particular pattern is suitable? Also, do you always ask yourself why you use that particular pattern all the time, or do you just use them?
It seems like you are implying that I don't think before I speak. Maybe that is sometimes the case, but I would venture to say, "not usually, and certainly not always."

The point I'm making here is that all of these observations are made after-the-fact. We humans see five different categories of output:

1. "I do know X" where X is indeed correct information

2. "I do know X" where X is false information or nonsense

3. "I don't know" when it really doesn't

4. "I don't know" when a slightly different prompt would lead to option #1

5. Output that is not phrased as a direct answer to a question.

The article introduced #2 as "hallucinations". I introduced #4 in my previous comment (and just now #5), and propose that all five are hallucinations.

As far as the LLM is concerned, there is only one category of output: the most likely next token. Which of the five that will be is determined by the examples present in the training corpus, which are later weighed during training.
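As a toy illustration of that point, the sketch below picks the most likely next token from a made-up probability table. The numbers and the candidate strings are invented, and greedy selection is assumed for simplicity; real systems typically sample from the distribution instead.

```python
from typing import Dict


def most_likely_next_token(probs: Dict[str, float]) -> str:
    """Greedy decoding: return the candidate with the highest probability."""
    return max(probs, key=lambda token: probs[token])


# Hypothetical distributions; the values are invented for illustration.
confident = {"Paris": 0.62, "Lyon": 0.21, "I don't know": 0.17}
uncertain = {"Paris": 0.30, "Lyon": 0.25, "I don't know": 0.45}

pick_confident = most_likely_next_token(confident)
pick_uncertain = most_likely_next_token(uncertain)
```

Nothing in the selection step distinguishes a correct answer, a wrong answer, or "I don't know"; each is just another candidate with a number attached.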

Logic is not present in the process. It is only present in the result.

> It seems like you are implying that I don't think before I speak.

I'm implying that most of the time you don't think before you think or after you think (you and I typically don't meta-think).

I'm saying that very often I (and, it looks like, a lot of people around me) don't think much before I speak. I have an internal monologue when I'm "thinking something out", but I typically don't think things through when I'm speaking with people in day-to-day conversations, only when I encounter a problem I haven't seen before and am not "trained" in solving. Maybe some people can form fully reasoned sentences in the split second before they start talking, but not me. IIRC those two modes of thinking are called slow and fast thinking.

> Logic is not present in the process. It is only present in the result.

I'm talking about that process. Have you seen the "thinking" part of current reasoning LLMs? It does indeed look like a process of using logic. After the "thinking" part, there is an "output" part that draws conclusions from the process of thinking. Recently I asked a local version of deepseek about a gas exchange problem and it thought a lot about it, making some small mistakes in logic, correcting them, and ultimately returning an approximately valid result. It even made some small errors in calculations and corrected itself by multiplying parts of numbers and adding them up to get the correct result. I've put that example online[1] if you'd like to read it; it's pretty interesting.

[1] https://pastebin.com/mXyLGCGQ

They are machines designed to produce a facsimile of knowledge. Or at least an approximation. If they refused to answer, that would be a failure by the terms of what the product aims to do.
You are actually getting overly philosophical. The reason is that one step of chatbot training is to fine-tune the base model to respond less frequently with non-answers. I have not read the article, but the answer is: because they are specifically built not to. It's like asking why so few salesmen end a call with "yeah, it seems our product is not the right solution for you".
The only person who understands that this is a style choice and not a deep limitation of the robot condition.

If a politician has non-answers for difficult questions, does that mean they aren't conscious? If a student writes crap for a test question, aiming for partial marks, were they raised wrong?

So our "knowledge" is based on perception. I know something happened because I saw it with my own two eyes. Everything else is less "knowable".
That seems to be the conspiracists' groupthink, yeah.

IRL we invented the field of science to avoid such make-believe nonsense.

Instead, we believe that those scientists did provide correct answers. Which is true about 99.9% of the time, but not 100%. There is still belief involved, because we can't really check everything for ourselves; there isn't enough time in one person's life for that.