Hacker News
by i_have_an_idea 11 days ago
Is it a frontier player though, or perhaps a new benchmaxxed model? People were saying similar things about Grok but it ultimately amounted to little.
1 comment

"preferred by humans over Sonnet 4.6" makes it pretty clearly not benchmaxxed though.

At least when you define benchmaxxed as "good in benchmarks but not human preference".