| HN Mirror


	by nylonstrung 197 days ago
	My experience with DeepSeek and Kimi is quite the opposite: they are smarter than benchmarks would imply, whereas the benchmark gains seen in new OpenAI, Grok, and Claude models don't feel accompanied by a vibe improvement.