| HN Mirror

	by AlexCoventry 402 days ago
	It depends on the AI. ChatGPT's higher-tier models (o1-pro/o3/o4-mini-high) have some limited capability to detect errors in the user's thinking, and have relatively few hallucinations.

2 comments

o3 has twice the hallucinations of o1, according to their own hallucination benchmark.

I've had fun debates about things like p-zombies with Gemini 2.5 Pro.