Hacker News
by BoorishBears 82 days ago
If Gemini 2.5 Flash and GPT 5.4 perform the same for you, I'm glad.

It's not a useful finding for the rest of the world, and I sure hope non-technical people aren't being taken in by a steaming pile that implies those LLMs perform similarly (among many other ridiculous findings), but c'est la vie.

Nowadays anyone can vibecode a "benchmark" with 0 understanding of the domain, so what more should I expect?