HN Mirror

	by muzani 277 days ago
	Reproduction, I suppose. I would like the same things as OP too. LLM outputs are qualitative; they can't really be scored automatically, and prompt enhancements tend to multiply the bug: a change can solve one problem but introduce a new one. It's more practical just to do it manually.