by famouswaffles 1184 days ago
You don't need humans in the loop for alignment. RLAIF is a thing, and it is used for the Anthropic models (Claude).

Is it really being used for the final model? I know they have research papers out on it, but I wasn't sure if the production models used it.
Yeah, it is.