Hacker News
by sinenomine 317 days ago
The NLL loss and the large-batch training regime inherently bias the model toward learning a “modal” representation of the world, and RLHF collapses entropy further, especially as it is applied at most leading labs.
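
To make the entropy point concrete, here is a toy sketch, not a claim about any lab's actual pipeline. It assumes RLHF can be stood in for by the textbook KL-regularized objective, whose optimum has the closed form pi(y) ∝ ref(y) * exp(r(y) / beta). The four-outcome distribution, the reward values, and the helper names `entropy` and `tilt` are all made up for illustration.

```python
import math

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def tilt(ref, rewards, beta):
    """Closed-form optimum of KL-regularized reward maximization:
    pi(y) is proportional to ref(y) * exp(r(y) / beta)."""
    weights = [q * math.exp(r / beta) for q, r in zip(ref, rewards)]
    z = sum(weights)
    return [w / z for w in weights]

# Hypothetical base-model distribution over four possible outputs.
ref = [0.4, 0.3, 0.2, 0.1]
# Hypothetical reward-model scores for the same four outputs.
rewards = [1.0, 0.5, 0.0, 0.2]

# Smaller beta means a weaker KL penalty, i.e. harder optimization
# against the reward.
entropies = {beta: entropy(tilt(ref, rewards, beta)) for beta in (10.0, 1.0, 0.1)}

print(f"reference: entropy={entropy(ref):.3f} nats")
for beta, h in entropies.items():
    print(f"beta={beta:>4}: entropy={h:.3f} nats")
```

In this toy setup the entropy drops as beta shrinks, with nearly all of the mass ending up on the single highest-reward output. Whether real RLHF runs behave the same way depends on the reward model and the KL coefficient actually used.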