	by alex43578 108 days ago
	And I think human-written tests at that. If the LLM is blind to failure mode X, does it know to reliably write a test that evaluates the behavior of X?