by kuberwastaken 423 days ago
In all my testing, this honestly never happened even once. Plus, the judge model (which I've kept strictly a reasoning model) also evaluates individually before "judging" the consensus.