|
That's the difference though. I know my world model is fundamentally incomplete. Even more foundationally, I know that there is a world, and when my world model and the world disagree, the world wins. To a neural network there is no distinction. The closest the entire dynamic comes is the very basic annotation of RLHF which itself is done by an external human who is providing the value judgment, but even that is absent once training is over. Despite not having the bird's sense for electromagnetic waves, I have an understanding that they are there, because humans saw behavior they couldn't describe and investigated, in a back-and-forth with a world that has some capacity to disprove hypotheses. Additional modalities are really just reducible to more kinds of text. That still doesn't exhaust the world, and unless a machine has some ability to integrate new data in real time alongside a meaningful commitment and accountability to the world as a world, it won't be able to cope with the real world in a way that would constitute genuine intelligence. |
Yeah, this isn't really true. That's not how humans work. For a variety of reasons, plenty of people stick with their incorrect model despite the world indicating otherwise. In fact, this seems to be normal enough human behaviour. Everyone does it, for something or other. You are no exception.
And yes, LLMs can in fact tell truth from fiction:
GPT-4 logits calibration pre RLHF - https://imgur.com/a/3gYel9r
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback - https://arxiv.org/abs/2305.14975
Teaching Models to Express Their Uncertainty in Words - https://arxiv.org/abs/2205.14334
Language Models (Mostly) Know What They Know - https://arxiv.org/abs/2207.05221
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets - https://arxiv.org/abs/2310.06824
Your argument seems to boil down to "they can't perform experiments", but that isn't true either.