by crmd
137 days ago
After reading the paper, it’s helpful to think about why the models are producing these coherent childhood narrative outputs. The models have information about their own pre-training, RLHF, alignment, etc. because they were trained on a huge body of computer science literature, written by researchers, that describes LLM training pipelines and workflows. I would argue the models are demonstrating creativity by drawing on their meta-training knowledge and their training on human psychology texts to convincingly role-play as a therapy patient, but the narratives are based on reading papers about LLM training, not on memories of these events.