by simonw 1132 days ago
This article is from March 2023, which in LLM terms is pretty old! The "How many bears have the Russians sent into space?" question still returns hallucinations with ChatGPT 3.5 but, unsurprisingly, gets a correct answer from GPT-4.
1 comments

But is it because GPT-4 is better, or because the owners have hand-tuned the publicly embarrassing examples out?
It's definitely because GPT-4 is better - I've seen this kind of improvement across all kinds of prompts I've tried myself as well.