Hacker News
by ben_w 383 days ago
I'd say that's too short.

> But it’s not just Claude. Theo Browne put together a new benchmark called SnitchBench, inspired by the Claude 4 System Card.

> It turns out nearly all of the models do the same thing.

1 comment

I totally agree, but I needed you to post the other half because of TL;DR…