Hacker News
by tough 330 days ago
How would someone using an LLM to explore the reports find such a thing?
1 comment

This is why it’s important to follow the studies comparing LLMs’ performance on “needle-in-a-haystack” style tasks. They tend to be pretty good at finding the one thing wrong in a large corpus of text, though it depends on the LLM, its flavor (Sonnet, Opus, 8B, 27B, etc.) and the size of the corpus, and there are occasional performance cliffs where accuracy drops off sharply.
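
For anyone who hasn't seen one of these tests, here's a rough sketch of the idea in Python. It is not any particular benchmark's code: the filler text, the needle sentence, and the `ask_model` callable are all made up for illustration, and `fake_model` is a stand-in that simply ignores everything past its first 2,000 characters so the script runs without an API key.

```python
FILLER = "The quarterly report notes that operations continued as planned."
NEEDLE = "The secret audit code is 7431."  # hypothetical needle


def build_haystack(n_sentences, depth, needle=NEEDLE, filler=FILLER):
    """Return n_sentences of filler with the needle inserted at a
    relative depth in [0, 1] (0 = start of the text, 1 = end)."""
    sentences = [filler] * n_sentences
    position = round(depth * n_sentences)
    sentences.insert(position, needle)
    return " ".join(sentences)


def run_grid(ask_model, sizes, depths, expected="7431"):
    """Query the model once per (corpus size, needle depth) pair.

    ask_model is a hypothetical callable: prompt string -> answer string.
    Returns a dict mapping (size, depth) to True if the answer contains
    the expected needle value.
    """
    results = {}
    for n in sizes:
        for d in depths:
            prompt = build_haystack(n, d) + "\n\nWhat is the secret audit code?"
            results[(n, d)] = expected in ask_model(prompt)
    return results


def fake_model(prompt):
    """Stand-in for a real LLM call: only 'reads' the first 2,000 characters."""
    window = prompt[:2000]
    return "7431" if "7431" in window else "I could not find it."


if __name__ == "__main__":
    results = run_grid(fake_model, sizes=[5, 200], depths=[0.0, 0.5, 1.0])
    for (size, depth), found in sorted(results.items()):
        print(f"sentences={size:4d} depth={depth:.1f} found={found}")
```

Swap `fake_model` for a real API call and sweep more sizes and depths, and the table of hits and misses is where the cliffs show up, if the model has any.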