Hacker News
by archiepeach 951 days ago
This does sound like a test that is almost "set up to fail" for an LLM. If the answer is something that most people think they know but actually don't, then an LLM, which is essentially a distillation of the common view, won't pass it.