| HN Mirror



	by saint_yossarian 307 days ago
	One thing that comes to mind: You still have to verify that the tests are exhaustive, and that the code isn't just gaming specific test scenarios. I guess fuzzing and property-based testing could mitigate this to some extent.
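A minimal sketch of the property-based idea, using only the Python standard library instead of a dedicated tool such as Hypothesis. Everything here is made up for illustration: `gamed_sort` is a hypothetical implementation fitted to three known test cases, `example_tests_pass` is the fixed suite it was fitted to, and `find_counterexample` checks two general properties (output is ordered, output is a permutation of the input) on random inputs.

```python
import random
from collections import Counter


def gamed_sort(xs):
    """Hypothetical implementation that hard-codes the known test cases."""
    known = {(3, 1, 2): [1, 2, 3], (): [], (5,): [5]}
    # Any input outside the known cases is returned unchanged (unsorted).
    return known.get(tuple(xs), list(xs))


def example_tests_pass(sort_fn):
    """The fixed example-based suite that the gamed implementation targets."""
    return (
        sort_fn([3, 1, 2]) == [1, 2, 3]
        and sort_fn([]) == []
        and sort_fn([5]) == [5]
    )


def find_counterexample(sort_fn, trials=500, seed=0):
    """Return a random input that violates the sorting properties, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 8))]
        out = sort_fn(xs)
        ordered = all(a <= b for a, b in zip(out, out[1:]))
        same_items = Counter(out) == Counter(xs)
        if not (ordered and same_items):
            return xs
    return None


if __name__ == "__main__":
    print("example tests pass:", example_tests_pass(gamed_sort))
    print("counterexample for gamed_sort:", find_counterexample(gamed_sort))
    print("counterexample for sorted:", find_counterexample(sorted))
```

The gamed version passes the example suite but fails the random property check, while the built-in `sorted` passes both. This only mitigates the problem: the properties themselves still have to be chosen well, which is the same "are the tests exhaustive" question one level up.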

1 comment

ankit219 307 days ago

Yes, we are getting there. I think the compiler is a bigger problem than unit tests, given that most verticals don't even have one. With unit tests, there would be some reward hacking, but it would be controlled at the model level + tests. (This is one of the reasons I don't believe in a transformer-based LLM as a judge for a verifier.)
