| HN Mirror


	by maeil 725 days ago
	Could you explain how these two are related? That benchmark seems to be asking for very specific information inside a large body of text. For LLMs, that seems quite a different task from proving a negative. Any improvement on proving a negative would mean fewer hallucinations and would be a huge deal.