by throwawaymaths 557 days ago
We're not talking about quality, we're talking about accuracy.

In general, a model has to learn to positively say "I don't know", rather than leaving "I don't know" in the negative space where the token probabilities fall into a weak, flat distribution. Softmax also normalizes the token logits into probabilities that sum to 1, so if none of the options are any good (all next tokens suck), the sampler can still pick more or less at random from a bunch of bad choices, which then locks the model into a continuation based on that first bad choice.
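To make the normalization point concrete, here is a minimal sketch in plain Python. The logit values and the names `confident` and `all_bad` are made up for illustration and are not from any real model. It shows that softmax only sees differences between logits: a set of uniformly bad logits still comes out as a valid distribution summing to 1, with the top token at only about 0.29, so a sampler has to pick something.

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means a flatter distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical logits: one clearly preferred token vs. four equally weak ones.
confident = [9.0, 1.0, 0.5, 0.2]
all_bad = [-9.0, -9.1, -9.2, -9.3]

p_confident = softmax(confident)
p_all_bad = softmax(all_bad)

# Both are valid distributions, but the second is nearly uniform:
# the absolute "badness" of the logits is gone after normalization.
print("confident:", [round(p, 4) for p in p_confident])
print("all bad:  ", [round(p, 4) for p in p_all_bad])
print("entropy:  ", round(entropy(p_confident), 4), "vs", round(entropy(p_all_bad), 4))

# Adding a constant to every logit leaves the probabilities unchanged.
shifted = softmax([x + 100.0 for x in all_bad])
```

The entropy of the output distribution is one signal that could be used to detect the "no good options" case, but that is an assumption of this sketch rather than something the normalization itself gives you.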

1 comment

Well, I am talking about quality now, since it's a tradeoff.

You can reduce token output to 0 and achieve 100% accuracy too.