by smokel
405 days ago
Not an expert in the field, but apparently there has been some research into this. It's called inference-time intervention [1], [2].

[1] "Steering Language Models With Activation Engineering", 2023, https://arxiv.org/abs/2308.10248

[2] "Multi-Attribute Steering of Language Models via Targeted Intervention", 2025, https://arxiv.org/pdf/2502.12446
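
For a rough idea of what this kind of intervention looks like, here is a toy sketch as I understand it from [1]: take the difference between a layer's activations on two contrasting inputs, scale it, and add it back at that layer during the forward pass. Everything below is an assumption made up for illustration (the two-layer "model", `forward`, `steering_vector`, the coefficient); it is not code from either paper and not a real LLM API.

```python
# Toy illustration only: the "model" is a list of layer functions acting on a
# hidden-state vector (plain lists of floats). All names are hypothetical.

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(c, v):
    return [c * a for a in v]

def layer0(h):
    # hypothetical layer: doubles each component
    return [2.0 * x for x in h]

def layer1(h):
    # hypothetical layer: adds 1 to each component
    return [x + 1.0 for x in h]

LAYERS = [layer0, layer1]

def activation_at(h, at_layer):
    """Return the hidden state right after layer `at_layer`."""
    for layer in LAYERS[: at_layer + 1]:
        h = layer(h)
    return h

def steering_vector(pos_input, neg_input, at_layer, coeff):
    """Scaled difference of activations for two contrasting inputs."""
    diff = sub(activation_at(pos_input, at_layer),
               activation_at(neg_input, at_layer))
    return scale(coeff, diff)

def forward(h, steer=None, at_layer=None):
    """Run all layers; optionally add `steer` to the output of `at_layer`."""
    for i, layer in enumerate(LAYERS):
        h = layer(h)
        if steer is not None and i == at_layer:
            h = add(h, steer)  # the inference-time intervention
    return h

steer = steering_vector([1.0, 0.0], [0.0, 1.0], at_layer=0, coeff=0.5)
baseline = forward([1.0, 1.0])
steered = forward([1.0, 1.0], steer=steer, at_layer=0)
print(baseline, steered)
```

The weights are never touched; only the hidden state is nudged while the model runs, which is why no retraining is needed. In a real model the same idea would presumably be applied to one transformer layer's activations via some hook mechanism.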