| HN Mirror

	by krackers 183 days ago
	LLMs currently seem very myopic in their planning. The benchmarks being targeted right now, such as SWE-bench, all reward short-term correctness and completeness without taking long-term refactorability into account. In fact, the two are in a sense at odds with each other: refactoring sometimes means explicitly _disobeying_ the user's prompt to "get things done" and going on a side-quest to clean things up. You could manually prompt the LLM to go and refactor things, but that requires _you_ to read the code and identify the places that seem suboptimal.