HN Mirror



	by throawayonthe 32 days ago
	well there is https://artificialanalysis.ai/evaluations/omniscience

1 comment

It's a gibberish input detection benchmark, and does not measure output hallucinations.