Hacker News
by ThrowawayR2 52 days ago
> There is probably a whole testing workflow at AI companies to tweak each new model until it "looks" acceptable.

Isn't that what the RLHF phase does ( https://www.paloaltonetworks.com/cyberpedia/what-is-rlhf )?