Hacker News
by energy123 401 days ago
o3 has twice the hallucinations of o1, according to OpenAI's own hallucination benchmark