by r0uv3n 1184 days ago
AFAIK the paper mentioned that RLHF mostly decreased the model's capabilities. It seems more likely to me that simply running normal training for longer was one of the main reasons for the increased capabilities.