Hacker News
by Eisenstein 476 days ago
How would you calculate the confidence? LLMs are notoriously bad at grading their own output.