Hacker News
by posting_mess 629 days ago
A more concise version of my question is:

"How do we train large ML/AI systems to think that generating the unholy is bad without hurting their intelligence, given that we know applying some universal law (i.e. RLHF) hurts the model?"

I'm trying to promote the exact opposite of "let them eat the unholy".