by wilbur_whateley 14 days ago
My own experience: I'm working on something complex that isn't covered by the datasets these models were trained on. There I see V4 flash breaking down and hallucinating much more often than GPT/Claude. For normal, common tasks, I also don't see much of a difference.