Hacker News
by
i_have_an_idea
11 days ago
Is it a frontier player though, or perhaps a new benchmaxxed model? People were saying similar things about Grok but it ultimately amounted to little.
1 comment
wasabi991011
11 days ago
"preferred by humans over Sonnet 4.6" makes it pretty clearly not benchmaxxed though.
At least when you define benchmaxxed as "good in benchmarks but not human preference".