by NiloCK
47 days ago
Are the training arenas for the Activation Verbalizer and Activation Reconstructor models well described here? If they are co-trained only on activationWeights->readableText->activationWeights, without visibility into the actual stream of text that the probe-target LLM is processing, then it seems unlikely that the derived text could be both on-topic and unrelated to the "actual thoughts" in the activationWeights.