by est31 1171 days ago
FTR, people are trying to build systems that compare LLMs on how good they are at saying "I don't know" (of course, knowing the answer is still rewarded more): https://github.com/manyoso/haltt4llm
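A rough sketch of what that kind of scoring could look like. The point values (2 for a correct answer, 1 for "I don't know", 0 for a wrong answer) and the `score` function are illustrative assumptions here, not necessarily what the linked repo does:

```python
# Hypothetical scoring rule: abstaining beats being wrong,
# but a correct answer still earns the most.
IDK = "I don't know"

def score(answers, correct=2, unsure=1, wrong=0):
    """answers is a list of (model_answer, true_answer) pairs."""
    total = 0
    for given, truth in answers:
        if given == truth:
            total += correct   # model knew the answer
        elif given == IDK:
            total += unsure    # model admitted it did not know
        else:
            total += wrong     # model answered confidently and wrongly
    return total

results = [("Paris", "Paris"), (IDK, "Canberra"), ("Sydney", "Canberra")]
print(score(results))
```

Under these assumed weights, a model that guesses wrong ends up with a lower score than one that abstains on the same question.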
1 comment

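It would be cool to try feeding the previous token's confidence embedding back into this process, but that would rule out training with a triangular attention mask: each position's input would then depend on the model's output at the previous position, so the whole sequence could no longer be processed in one parallel pass.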
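A toy sketch of the problem, in plain Python. The `toy_model` function is a made-up stand-in for a transformer layer (it just averages the visible inputs), and reusing its output as the "confidence" is an assumption for illustration only:

```python
def causal_mask(n):
    """Lower-triangular mask: position i may attend to positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]

def toy_model(inputs, mask):
    """Hypothetical stand-in for a transformer layer: each output is
    the mean of the inputs visible under that row of the mask."""
    outs = []
    for row in mask:
        visible = [x for x, keep in zip(inputs, row) if keep]
        outs.append(sum(visible) / len(visible))
    return outs

tokens = [1.0, 2.0, 3.0, 4.0]

# Standard training: every input is known up front, so one masked pass
# yields the outputs for all positions at once.
parallel_out = toy_model(tokens, causal_mask(len(tokens)))

def sequential_with_feedback(tokens):
    """Input t includes the model's own output from step t-1, so the
    positions have to be computed one after another."""
    inputs, outs = [], []
    prev_conf = 0.0  # assumed: no confidence signal before the first token
    for t, tok in enumerate(tokens):
        inputs.append(tok + prev_conf)   # input depends on the last output
        row = [True] * (t + 1)           # attend to everything seen so far
        out = toy_model(inputs, [row])[0]
        outs.append(out)
        prev_conf = out                  # toy: reuse output as "confidence"
    return outs

feedback_out = sequential_with_feedback(tokens)
print(parallel_out)
print(feedback_out)
```

The first call needs only the token list and the mask. The second has to loop, because the input at step t does not exist until step t-1 has produced its output.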