by pjc50 1716 days ago
> A recent paper[0] has looked into building a testing dataset for language models' ability to distinguish truth from falsehood.

Isn't this a massive category error? Truth or falsehood does not reside within any symbol stream but in the interaction of that stream with observable reality. Does nobody in the AI world know Baudrillard?

Well, yeah, these models cannot interact with or observe the world to test the veracity of claims. They are language models and their target is text production. No one expects them to understand the universe.

My comment was in response to > GPT-3 is probably a better approach to knowledge processing

and the paper is relevant in that it shows the limits of current language models when it comes to logical consistency and to judging the quality of text sources. GPT-3 and other models are not trained for this and, unsurprisingly, they fail at the task. This is evidence against them being a "better approach to knowledge processing."

Even if we trained future models preferentially on the latest and most cited scientific papers, we would still have issues with conflicting claims and incorrect or fabricated results.

However, it could still be practically useful to figure out a way to include some checks or confidence estimates for the truthfulness of model training data and responses. Perhaps just training the models to answer that they don't know when the training data is too conflicted would be useful enough.

It should be at least theoretically possible for an AI to identify contradictions, incoherence and inconsistency in a set of assertions. So, not identifying falsehood per se, but assigning a fairly accurate likelihood score based solely on the internal logic of the symbol stream. In other words, a bullshit detector.
> It should be at least theoretically possible for an AI to identify contradictions, incoherence and inconsistency in a set of assertions

Not in the slightest. Likelihood of veracity is opinion; laundering it as fact to make some people feel better doesn't make it any less subjective, or any more authoritative.

I think we're not disagreeing, exactly.

As a simple example, here is a set of assertions:

* The moon is made of green cheese

* The moon is crystalline rock surrounding an iron core

It wouldn't take an AI to see that both of these can't be true, even if we weren't clear about what a moon is made of, exactly.

Some of our common understanding could contain more complicated internal contradictions that would be harder for a human to tease apart, but that an AI might be able to identify.

How would the AI know that "made of green cheese" isn't just another way of saying "crystalline rock surrounding an iron core"? To find a contradiction in a statement like "X is A. X is B" it'd first have to be intelligent enough to know when A != B. In your example, that's not as simple as 1 != 0.
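
To make that concrete, here is a rough Python sketch of a checker for the "X is A. X is B" case. The names `MUTUALLY_EXCLUSIVE` and `find_contradictions` are made up for illustration, and the exclusion table is an assumption supplied by hand. That is the whole point: the checker only finds the contradiction if something outside the symbol stream has already told it that A != B.

```python
from itertools import combinations

# Hypothetical background knowledge: pairs of predicates that cannot both hold.
# This table is hand-written; nothing in the assertions themselves provides it.
MUTUALLY_EXCLUSIVE = {
    frozenset({"made of green cheese",
               "crystalline rock surrounding an iron core"}),
}


def find_contradictions(assertions, exclusive=MUTUALLY_EXCLUSIVE):
    """Return pairs of (subject, predicate) assertions that conflict.

    Two assertions conflict only if they share a subject and their
    predicates are listed together in `exclusive`.
    """
    conflicts = []
    for (s1, p1), (s2, p2) in combinations(assertions, 2):
        if s1 == s2 and frozenset({p1, p2}) in exclusive:
            conflicts.append(((s1, p1), (s2, p2)))
    return conflicts


claims = [
    ("the moon", "made of green cheese"),
    ("the moon", "crystalline rock surrounding an iron core"),
]

# With the background table, the two claims are flagged as a conflict.
print(find_contradictions(claims))

# With no background knowledge, nothing is flagged: the two predicates
# might just be different names for the same thing.
print(find_contradictions(claims, exclusive=set()))
```

The second call returns an empty list for exactly the reason given above: without knowing that the two descriptions exclude each other, there is nothing to detect.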