by pksebben
115 days ago
Guidance and alignment are usually handled by RLHF, which actually updates the model's weights so that it becomes near-impossible for the model to have certain kinds of 'thoughts'. Because the change is baked into the weights themselves, it's not something you can just extract or turn off.
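To make the "baked into the weights" point concrete, here's a toy sketch. It is not real RLHF (no learned reward model, no PPO, no KL penalty), just an exact policy-gradient update on a three-way softmax. The "thought" labels, reward values, learning rate, and step count are all made up for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reward_tune(logits, rewards, lr=0.5, steps=200):
    """Gradient ascent on expected reward for a softmax policy.

    Toy stand-in for reward-based fine-tuning: the only thing that
    changes is the logits, i.e. the "weights" of this tiny model.
    """
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        # Exact gradient: d E[r] / d logit_k = p_k * (r_k - E[r])
        logits = [w + lr * p * (r - expected)
                  for w, p, r in zip(logits, probs, rewards)]
    return logits

# Hypothetical labels and rewards, chosen only for this illustration.
THOUGHTS = ["helpful", "neutral", "disallowed"]
base_logits = [0.0, 0.0, 0.0]
rewards = [1.0, 1.0, -1.0]

tuned_logits = reward_tune(base_logits, rewards)
p_before = softmax(base_logits)[2]
p_after = softmax(tuned_logits)[2]

print(f"P(disallowed) before tuning: {p_before:.4f}")
print(f"P(disallowed) after tuning:  {p_after:.4f}")
```

After tuning there is no separate filter or flag to remove: the low probability of the "disallowed" output is just what the updated numbers compute. The probability gets very small but never reaches exactly zero, which fits "near-impossible" rather than "impossible".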