Hacker News
by herrvogel- 199 days ago
What you describe could also come down to a difference in hallucination rate [0]. Opus 4.5 has the lead here, while Gemini 3 Pro performs quite badly on this metric compared to how it does on other benchmarks.
[0] https://artificialanalysis.ai/?omniscience=omniscience-hallu...