Hacker News
by Workaccount2 269 days ago
It's mostly because it is so damn good with long contexts. It can stay on the ball even at 150k tokens, whereas other models really wilt around 50-75k.