by code_biologist 389 days ago
I think your parent comment is citing that as an example of why LiveBench is no longer a good benchmark. That said, the new Flash is very good for what it is, and IMO after the Pro 05-06 nerfs the two models are much closer in performance for many tasks than they really should be. Pro should be, and was, way better (RIP 03-25 release). That LiveBench result may be wrong about the specific ranking, but I think it's right that Flash is in the same class of coding strength as Sonnet 3.7.

Thanks, that's very informative.

My ignorance is showing here: why is the Pro 05-06 a nerf?