by jimmySixDOF 941 days ago
Yes, this seems like a direction that could replace RLHF, so it's another way to scale without more bare metal. And if it's not this, it's still just a matter of time before some model optimization outperforms the raw epochs/parameters/tokens approach.