Hacker News new | ask | show | jobs
by simonw 322 days ago
Hah, yeah I'd love to know if OpenAI ran evals that were fine-grained enough to prove to themselves that putting that bit in capitals made a meaningful difference in how likely the LLM was to just provide the homework answer!