This is why it’s important to follow the studies comparing LLMs’ performance on “needle-in-a-haystack”-style tasks. LLMs tend to be pretty good at finding the one thing wrong in a large corpus of text, though it depends on the model family, the specific variant or size (Sonnet, Opus, 8B, 27B, etc.), and the size of the corpus, and there are occasional performance cliffs.