| HN Mirror

	by hodgehog11 314 days ago
	Not really, it's just that our benchmarks are not good at showing how they've improved. Those who regularly try out LLMs can attest to major improvements in reliability over the past year.