|
|
|
|
|
by s314
36 days ago
|
|
> Steering seems like a circumventable kludge compared to adjusting the training data directly Correct. Steering is used in mechanistic interpretability studies to prove that your model is correct. There are other better ways to "decensor". |
|