Hacker News
by
throawayonthe
32 days ago
well there is
https://artificialanalysis.ai/evaluations/omniscience
goldenarm
32 days ago
It's a gibberish input detection benchmark, and does not measure output hallucinations.