by BoorishBears
82 days ago
If Gemini 2.5 Flash and GPT 5.4 perform the same for you, I'm glad. It's not a useful finding for the rest of the world, and I sure hope non-technical people aren't being taken in by a steaming pile that implies those LLMs perform similarly (along with many other ridiculous findings), but c'est la vie. Nowadays anyone can vibecode a "benchmark" with 0 understanding of the domain, so what more should I expect?