by bigfishrunning
180 days ago
I get that in the age of AI, you didn't want to read the data I linked; that's fine. Your ctrl-f search found a reference to 26%. However, on page thirteen, the rate is described as 0.26; I interpreted that as 26% because it's cross-referenced in the blog post that I also linked.
|
On some classes of queries, weak models will hallucinate closer to 100% of the time. One of my favorite informal benchmarks is to throw a metaphorical dart at a map and ask what's special about the smallest town nearby. That's a good trick if you want to observe genuine progress being made in the latest models.
On other tasks, typically the ones that matter, hallucination rates are approaching zero. Not quickly enough for my preference, but the direction is clear enough.