|
|
|
|
|
by famouswaffles
645 days ago
|
|
The speculation here is not about what b64 text is. It's about how the LLM has learnt to process it. Edit: Basically, For all anyone knows, it treats b64 as another language entirely and decoding it is akin in the network to translating French rather than the very simple swapping you've just described. |
|
Complexity builds upon simplicity, and the LLM will begin by noticing the direct (and repeated without variation) predictive relationship between Base64 encoded text and corresponding plain text in the training set. Having learnt this simple way to predict Base64 decoding/encoding, there is simply no mechanism whereby it could change to a more complex "like translating French" way of doing it. Once the training process has discovered that Base64 text decoding can be PERFECTLY predicted by a simple mapping, then the training error will be zero and no more changes (unnecessary complexification) will take place.