HN Mirror


	by ACCount37 105 days ago
	Yeah, that's my point. Humans are not reliable LLM evaluators. "Secret model nerfs" happen in "vibes" far more often than they do in any reality.