| HN Mirror



	by throwawaymaths 559 days ago
	Completely misses the fact that a big part of the reason why LLMs hallucinate so much is that there's a huge innate bias towards producing more tokens over just stopping.

1 comments

TZubiri 559 days ago

The fewer tokens produced at inference, the lower the quality of the response will be.

The process of thinking for an LLM involves the use of words, which is why prompts that ask the LLM to return only the answer produce lower-quality responses.


throwawaymaths 559 days ago

We're not talking about quality, we're talking about accuracy.

In general, a model has to learn to positively say "I don't know" instead of "I don't know" being in the negative space of tokens falling into a weak distribution. The softmax selector also normalizes the token logits, so if no options are any good (all next tokens suck) it could pick randomly from a bunch of bad choices, which then locks the model into a continuation based off of that first bad choice.
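To make the normalization point concrete, here is a minimal sketch in plain Python, not taken from any particular model. The logit values are made up for illustration. It shows that softmax only sees differences between logits, so a set of uniformly poor scores still becomes a full probability distribution that sampling will draw from.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; this does not change the result.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits: one clearly preferred next token.
good = softmax([8.0, 1.0, 0.5, 0.2])

# Hypothetical logits: every candidate scores poorly, none stands out.
bad = softmax([-9.0, -9.1, -9.2, -9.3])

# The same "bad" logits shifted up by 9: softmax cannot tell the difference.
shifted = softmax([0.0, -0.1, -0.2, -0.3])

print("confident case, top probability:", max(good))
print("all-bad case, top probability:", max(bad))
print("all-bad case still sums to:", sum(bad))
```

In the all-bad case the probabilities are spread almost evenly, so a sampler picks close to at random among weak options, and nothing in the output signals that the absolute scores were low.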


TZubiri 558 days ago

Well, I am talking about quality now, as it's a tradeoff.

You can reduce token output to 0 and achieve 100% accuracy too.


ausbah 559 days ago

do you know if prompting without regard for length and then asking for a summarization of the previous output works?


TZubiri 559 days ago

It does. I think this was used in a GPT-4 version; they called it Chain of Thought.
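For anyone who wants to try the two-pass idea from the question, a rough sketch follows. It assumes you have some function that sends a prompt to a model and returns text. `generate`, `answer_then_summarize`, and the prompt wording are all hypothetical names and choices made for illustration, not any real library's API.

```python
from typing import Callable

def answer_then_summarize(question: str, generate: Callable[[str], str]) -> str:
    # Pass 1: let the model answer at whatever length it needs.
    draft = generate(
        "Answer the following question, reasoning step by step "
        "with no length limit.\n\n" + question
    )
    # Pass 2: ask for a short summary of the long answer only.
    return generate(
        "Summarize the final answer from the text below in one or two "
        "sentences.\n\n" + draft
    )

def fake_generate(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return f"[model output for a {len(prompt)}-character prompt]"

print(answer_then_summarize("Why is the sky blue?", fake_generate))
```

Swap `fake_generate` for a real model call to test it. Whether the summary keeps the quality of the long answer is something you would have to measure for your own model and task.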
