Hacker News
by jimmyed 876 days ago
> This model completes tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of “laziness” where the model doesn’t complete a task.

How does one solve for this? By wrangling the prompt with "please don't be lazy", or are there inference tricks like running through the weights differently or multiple times?

2 comments

RLHF harder.
Maybe by removing the lazy posts from the training data.
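
To make the training-data suggestion concrete, here is a minimal sketch of what filtering "lazy" examples could look like. Everything in it is an assumption for illustration: the `prompt`/`completion` record layout, the `is_lazy` and `filter_lazy` helpers, and the placeholder patterns are all hypothetical, and nothing here reflects how any model provider actually curates data.

```python
import re

# Hypothetical markers of a "lazy" completion. These patterns are illustrative
# assumptions, not a known list used by any model provider.
LAZY_PATTERNS = [
    re.compile(r"rest of (the|your) code", re.IGNORECASE),
    re.compile(r"implementation (goes )?here", re.IGNORECASE),
    re.compile(r"(#|//)\s*\.\.\."),
    re.compile(r"left as an exercise", re.IGNORECASE),
]


def is_lazy(completion: str) -> bool:
    """Return True if the completion matches any placeholder-style pattern."""
    return any(p.search(completion) for p in LAZY_PATTERNS)


def filter_lazy(examples: list[dict]) -> list[dict]:
    """Keep only examples whose 'completion' field does not look lazy."""
    return [ex for ex in examples if not is_lazy(ex["completion"])]


# Toy dataset: one complete answer, one that trails off into a placeholder.
examples = [
    {"prompt": "Write add(a, b)", "completion": "def add(a, b):\n    return a + b"},
    {"prompt": "Write a parser", "completion": "def parse(s):\n    # ... rest of the code here"},
]
kept = filter_lazy(examples)
print(len(kept))
```

A regex heuristic like this would miss plenty of lazy answers and flag some legitimate ones, so it is probably best read as a first-pass filter rather than a full fix.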