by danielmarkbruce 695 days ago
Using gpt-4o?

When asked “3.11和3.8哪个大” (Chinese for “Which one is larger, 3.11 or 3.8?”), it answers 3.11 more than half of the time. I assume that’s because Python 3.11 is larger (newer) than Python 3.8. While it does get this right in English, its native language, this failure doesn’t give me much confidence in its reliability, as we don’t yet know why it works in one language but not the other.
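To make the two readings concrete, here is a minimal Python sketch. The helper names `as_decimal` and `as_version` are made up for illustration, and this only shows why the question is ambiguous, not anything about how the model arrives at its answer:

```python
def as_decimal(s: str) -> float:
    # Read the string as an ordinary decimal number.
    return float(s)


def as_version(s: str) -> tuple:
    # Read the string as a dotted version: compare each component as an integer.
    return tuple(int(part) for part in s.split("."))


a, b = "3.11", "3.8"

larger_as_decimal = max(a, b, key=as_decimal)  # 3.8 > 3.11 as decimals
larger_as_version = max(a, b, key=as_version)  # (3, 11) > (3, 8) as versions

print("decimal reading:", larger_as_decimal)
print("version reading:", larger_as_version)
```

So "3.11" is only the right answer if the numbers are treated as version strings, which is my guess at what is happening here.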
Uh, I think people have a good idea why it doesn't work as well in other languages.