Hacker News
by brucethemoose2 996 days ago
This is problematic if one of the models you are comparing is in the same base family as the evaluator: the evaluator will probably favor that model, because its outputs are literally the sequences the evaluator itself would naturally emit.