Yeah... and I'm kind of suspicious of the whole "without changing the weights" deal, because adding working context to the model, like telling it the algorithm for adding numbers really sounds like there's some model state that's getting mutated, even if it's not stored in a file called weights.dat or whatev.
I meant shocking in the sense that it makes me gape in awe, but as I wrote, it's also, simultaneously, completely unsurprising given all the new emergent capabilities we keep discovering. We're in agreement :-)
This is such a circular thing, that I feel like it is amazing to see it.
The reason LLMs use a NN is because they're trying to encode a probability function for generating the passage.
And now, you are encoding another n-gram follower exercise (i.e 1+1 = 2) on top of it :)