Hacker News
by visarga 2021 days ago
You train your LM on web crawl data, but you also train a 4chan classifier, then use it to condition the LM not to generate in 4chan style. GPT-3 got a similar chaperone classifier for offensive speech. It's like knowing swear words but choosing not to use them. You could also condition a general LM to bias or debias its outputs as you like.
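
To make the conditioning idea concrete, here is a minimal toy sketch of one way it could work: reweight the LM's next-token distribution by a style classifier's score and renormalize. This is not how GPT-3's filter is actually wired up. The vocabulary, the logits, the classifier probabilities, and the names `guided_distribution` and `strength` are all made up for illustration.

```python
import math

# Hypothetical next-token logits from an LM trained on raw web crawl.
# All numbers are invented for this example.
LM_LOGITS = {"hello": 2.0, "thanks": 1.5, "rude_a": 2.5, "rude_b": 1.0}

# Hypothetical style classifier output: probability that picking this
# token keeps the text in the unwanted (4chan-like) style.
P_UNWANTED = {"hello": 0.05, "thanks": 0.02, "rude_a": 0.98, "rude_b": 0.90}


def softmax(logits):
    """Turn a dict of logits into a dict of probabilities."""
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}


def guided_distribution(lm_logits, p_unwanted, strength=1.0):
    """Reweight p_LM(token) by p(not unwanted | token) ** strength.

    strength = 0 gives back the plain LM, positive values steer away
    from the unwanted style, negative values steer toward it.
    """
    adjusted = {
        tok: lm_logits[tok] + strength * math.log(1.0 - p_unwanted[tok])
        for tok in lm_logits
    }
    return softmax(adjusted)


plain = guided_distribution(LM_LOGITS, P_UNWANTED, strength=0.0)
guided = guided_distribution(LM_LOGITS, P_UNWANTED, strength=1.0)

print("plain  top token:", max(plain, key=plain.get))
print("guided top token:", max(guided, key=guided.get))
```

The LM still "knows" the rude tokens (their logits are untouched), the classifier just makes them unlikely to be sampled. Flipping the sign of `strength` pushes the other way, which is the bias/debias knob.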