Hacker News
by
thelastparadise
637 days ago
Can someone explain simply how these benchmarks work?
What exactly is a "failure rate" and how is it computed?
1 comment
quantadev
637 days ago
They ask the AI a question about a large document (or set of docs). It either gets the answer right or wrong. They count the hits and misses, and the failure rate is then just the misses divided by the total number of questions asked.
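To make the counting concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration: the `Trial` record, the sample questions and answers, and the case-insensitive exact-match check in `is_hit` are made up, and a real benchmark may grade answers differently (for example with a rubric or a judge model).

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One benchmark question (hypothetical structure, for illustration only)."""
    question: str
    expected: str  # reference answer taken from the document
    answer: str    # what the model actually replied


def is_hit(trial: Trial) -> bool:
    # Assumed scoring rule: case-insensitive exact match after trimming whitespace.
    return trial.answer.strip().lower() == trial.expected.strip().lower()


def failure_rate(trials: list[Trial]) -> float:
    # Failure rate = misses / total questions asked.
    if not trials:
        raise ValueError("no trials to score")
    misses = sum(1 for t in trials if not is_hit(t))
    return misses / len(trials)


# Made-up sample data: two hits, two misses.
trials = [
    Trial("What year was the contract signed?", "2019", "2019"),
    Trial("Who is the named supplier?", "Acme Corp", "acme corp"),
    Trial("What is the late-payment penalty?", "5% per month", "10% per month"),
    Trial("Which city hosts the arbitration?", "Geneva", "Zurich"),
]

print(f"failure rate: {failure_rate(trials):.0%}")  # prints "failure rate: 50%"
```

With two misses out of four questions the failure rate comes out to 50%. The scoring rule is the part that varies most between benchmarks; the division at the end stays the same.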