by kbyatnal 240 days ago
Yeah, that can occasionally work, and it's something we've tested, but unfortunately it introduces a lot of noise and makes systematic evals difficult.