Hacker News
by zootreeves 1216 days ago
Shouldn’t the trainers be injecting the expected alignment behaviour into the source text during pre-training? Effectively poisoning their own dataset, but with the desired behaviour.

You could even have another LLM do this.