
Correct. Eric Hartford's blog post covers how alignment ends up in open-source LLMs[1]. In essence, models fine-tuned from bases like LLaMA and GPT-NeoX pick up alignment behaviors from the ChatGPT-generated instruction datasets they are trained on. To get a model without those behaviors, one can filter the refusals and biased answers out of the dataset, then fine-tune again on the cleaned data.
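To make the filtering step concrete, here is a rough sketch of what dropping refusals from an instruction dataset could look like. The marker phrases, the `instruction`/`output` field names, and the helper names are all my own assumptions for illustration, not the actual script or list from the blog post.

```python
# Hypothetical refusal markers; a real filter would need a longer, curated list.
REFUSAL_MARKERS = (
    "as an ai language model",
    "i'm sorry, but",
    "i cannot fulfill",
)


def is_refusal(response: str) -> bool:
    """Return True if the response contains any known refusal phrase."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def filter_dataset(records):
    """Keep only records whose 'output' field does not look like a refusal."""
    return [r for r in records if not is_refusal(r["output"])]


# Tiny made-up sample in an assumed instruction/output format.
sample = [
    {"instruction": "Name a prime number.", "output": "7 is a prime number."},
    {"instruction": "Tell a joke.", "output": "As an AI language model, I cannot tell jokes."},
]

kept = filter_dataset(sample)
```

Substring matching like this is crude: it will miss reworded refusals and can drop legitimate answers that happen to contain a marker, so the cleaned data would need spot-checking before any fine-tuning run.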

[1] https://erichartford.com/uncensored-models#heading-ok-so-if-...