Hacker News
by hodgehog11 314 days ago
Not really; it's just that our benchmarks are not good at showing how they've improved. Those who regularly try out LLMs can attest to major improvements in reliability over the past year.