Hacker News new | ask | show | jobs
by loherj 346 days ago
Fair point. More benchmarks are definitely good but I’m optimistic that they will show similar results.

Anecdotally, I can say that my personal experience with the model is in line with what the benchmarks claim: It’s a bit smarter than R1, a bit faster than R1, much faster than R1-0528, but not quite as smart. (Faster meaning less output tokens). For me, it’s at a sweet spot and I use it as daily driver.