Hacker News
by carlossouza 833 days ago
> We observed that all the VLMs tend to be confident while being wrong. Interestingly, we observed that even when the entropy was high, models tried to provide a nonsensical rationale, instead of acknowledging their inability to perform the task.

It looks like all current models suffer from an incurable case of the Dunning–Kruger effect.

All are at the peak of Mount Stupid.

3 comments

LLMs are trained to sound confident.

But they can also only do negation through exhaustion, known unknowns, future unknowns, etc...

That is the pain of the Entscheidungsproblem.

Even Presburger arithmetic (the natural numbers with addition and equality), which is decidable, still has at least doubly exponential worst-case time complexity to decide. That is worse than factorial time, for those who haven't dealt with it.

Add in multiplication and the theory becomes undecidable.

Even if you decided to use the DAG-like structure of transformers, causality is very, very hard.

https://arxiv.org/abs/1412.3076

LLMs only have cheap access to their model probabilities, which aren't ground truth.

So while asking for a pizza recipe could be called out as a potential joke if you add a topping that wasn't in its training set, that only works through exhaustion. It can't know when it is wrong in the general case.

That was an intentional choice with statistical learning and why it was called PAC (probably approximately correct) learning.

That was actually a cause of a great rift with the Symbolic camp in the past.

PAC learning is practically computable in far more cases and even the people who work in automated theorem proving don't try to prove no-instances in the general case.

There are lots of useful things we can do in BPP (bounded-error probabilistic polynomial time) and with random walks.

But unless there are major advancements in math and logic, transformers will have limits.

How can a neural network evaluate "confidence"?

The parameters don't store any information about what inputs were seen in the training data (vs being interpolated) or how accurate the predictions were for those specific inputs.

And even if they did, the training data was usually gathered voraciously, without much preference for quality reasoning.

I don't know for sure, but here's a plausible mechanism for how:

Multiple sub-networks detect the same pattern in different ways, and confidence is the percent of those sub-networks that fire for a particular instance.

There's a ton of overlap and redundancy with so many weights, so there are lots of ways this could work
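
To make the "percent of sub-networks that fire" idea concrete, here is a toy sketch. The detectors are hypothetical stand-ins, not anything read out of a real network, and whether trained models actually organize themselves this way is exactly the open question.

```python
from typing import Callable, Sequence

# Hypothetical stand-ins for redundant sub-networks: each "detector" looks at
# the same input in a different way and returns True when it fires.
Detector = Callable[[Sequence[float]], bool]


def agreement_confidence(detectors: Sequence[Detector], x: Sequence[float]) -> float:
    """Return the fraction of detectors that fire on input x (0.0 to 1.0)."""
    if not detectors:
        raise ValueError("need at least one detector")
    fired = sum(1 for detect in detectors if detect(x))
    return fired / len(detectors)


# Three toy detectors for "this input is large", each using a different cue.
detectors = [
    lambda x: max(x) > 0.5,           # fires if any component is large
    lambda x: sum(x) / len(x) > 0.5,  # fires if the mean is large
    lambda x: min(x) > 0.5,           # fires only if every component is large
]

print(agreement_confidence(detectors, [0.9, 0.8, 0.7]))  # all three cues agree
print(agreement_confidence(detectors, [0.9, 0.1, 0.2]))  # only the max-based cue fires
```

Note this measures agreement, not correctness: redundant detectors that share the same blind spot will agree confidently on a wrong answer.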

That’s good. Also maybe an architecture that runs the query through multiple times and then evaluates similarity of responses, then selects (or creates) the most-generated one, along with a confidence level of how many of the individual responses were aligned.
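
A minimal sketch of that run-it-several-times idea, assuming "similarity" just means exact match after trimming and lowercasing (a real system would need something fuzzier). `generate` is a hypothetical callable standing in for a sampled model call, and `make_fake_model` only exists so the example runs without a model.

```python
import itertools
from collections import Counter
from typing import Callable, Sequence, Tuple


def self_consistent_answer(
    generate: Callable[[str], str], query: str, n: int = 10
) -> Tuple[str, float]:
    """Sample n responses, return the most common one and the share that agreed."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Assumption: responses count as "aligned" if they match after normalization.
    answers = [generate(query).strip().lower() for _ in range(n)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / n


def make_fake_model(outputs: Sequence[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a sampled LLM: cycles through canned outputs."""
    canned = itertools.cycle(outputs)
    return lambda query: next(canned)


fake = make_fake_model(["Paris", "paris ", "Lyon", "Paris", "Paris"])
answer, confidence = self_consistent_answer(fake, "Capital of France?", n=5)
print(answer, confidence)
```

The confidence here is only the fraction of samples that landed on the winning answer, so a model that is consistently wrong still scores high.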
Actually you can get a very good proxy by looking at the probability distribution of the "answer" tokens. The key here is you have to be able to identify the "answer" tokens.

https://arxiv.org/abs/2402.10200
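
One way this could look in code, as a sketch: score an answer by the average gap between the most likely and second most likely token at each answer position. The data layout (one dict of token probabilities per position) and the hand-picked `answer_positions` are assumptions for illustration, and finding those positions automatically is the hard part mentioned above.

```python
from typing import Dict, Sequence


def answer_confidence(
    token_dists: Sequence[Dict[str, float]], answer_positions: Sequence[int]
) -> float:
    """Mean gap between top-1 and top-2 token probability over answer positions."""
    if not answer_positions:
        raise ValueError("need at least one answer position")
    gaps = []
    for pos in answer_positions:
        probs = sorted(token_dists[pos].values(), reverse=True)
        top1 = probs[0]
        top2 = probs[1] if len(probs) > 1 else 0.0
        gaps.append(top1 - top2)
    return sum(gaps) / len(gaps)


# Hypothetical per-position distributions for the output "The answer is 42".
token_dists = [
    {"The": 0.9, "A": 0.1},
    {"answer": 0.8, "result": 0.2},
    {"is": 0.95, "was": 0.05},
    {"42": 0.6, "41": 0.3, "40": 0.1},
]

# Only the last position carries the answer; the rest is boilerplate.
print(answer_confidence(token_dists, answer_positions=[3]))
```

Scoring only the answer positions matters because the filler tokens are nearly always high-probability and would wash out the signal.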

Phind gives me ChatGPT answers with relatively authoritative references to works on the web that (usually!) support the answer. Could it have a post-filter to fact check against the references?

I guess that is a slight variation of the sibling (@habitue's) answer; both are checks against external material.

I wonder if best resources could be catalogued as the corpus is processed, giving a document vector space to select resources for such 'sense' checking.

IIRC confidence in video is related to predicting what happens next vs what actually happens. If the two seem to correlate to the model it would give it a higher confidence ranking, which would then be used further for self-reinforced learning.

That's not how Dunning-Kruger works. There's never a point where incorrect people were more confident than correct people.

https://en.m.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effec...