	by jimmyed 876 days ago
	> This model completes tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of “laziness” where the model doesn’t complete a task.

	How does one solve for this? By wrangling the prompt with "please don't be lazy", or are there inference tricks like running through the weights differently or multiple times?

2 comments

RLHF harder.

Maybe removing the lazy posts from the training data.
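
For what it's worth, here is a rough sketch of what filtering "lazy" examples out of a fine-tuning set could look like. Everything in it is an assumption for illustration: the marker phrases, the `is_lazy` and `filter_lazy` names, and the (prompt, completion) pair format are made up, and nothing here is known to reflect how any lab actually does it.

```python
import re

# Hypothetical placeholder phrases that often show up in truncated answers.
# This list is illustrative only; a real pipeline would need something far
# more careful (human labels, a classifier, etc.).
LAZY_PATTERNS = [
    r"rest of (the|your) code",
    r"implement(ation)? (goes )?here",
    r"\bTODO\b",
    r"left as an exercise",
    r"similar to (the )?above",
]
LAZY_RE = re.compile("|".join(LAZY_PATTERNS), re.IGNORECASE)


def is_lazy(completion: str) -> bool:
    """Return True if the completion contains a placeholder-style phrase."""
    return LAZY_RE.search(completion) is not None


def filter_lazy(examples):
    """Keep only (prompt, completion) pairs whose completion is not flagged."""
    return [(prompt, completion) for prompt, completion in examples
            if not is_lazy(completion)]


if __name__ == "__main__":
    data = [
        ("write fizzbuzz", "def f():\n    # ... rest of the code here\n"),
        ("write add", "def add(a, b):\n    return a + b\n"),
    ]
    for prompt, _ in filter_lazy(data):
        print(prompt)
```

A keyword filter like this would also throw out legitimate answers that happen to mention a TODO, so at best it is a first pass before something smarter.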