by Kiro
1208 days ago
When you say it's trivial to encode text in neural networks, what does that mean for LLMs? What makes it decide to encode certain texts or not? Isn't it just one big network of neurons?

The prompt I've seen for getting it to reproduce the fast inverse square root from Quake verbatim was:

  // fast inverse square root
  float Q_

When I ask ChatGPT to give me code for a fast inverse square root, it doesn't reproduce it at all but gives me an implementation that looks completely different.

So my original thought was that the prompt above, with the characteristic Quake III Q_ naming, is enough to push it into a corner where the path is reduced to just one possibility (that path being the words of the code itself), rather than it merely copy-pasting the code from an encoded version of it. I.e. it still predicts word by word, but with only one possible choice at each step. This is just my naive take on it, though, and I really want to understand.
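A toy way to picture the "only one possible choice at each step" idea is sketched below. This is only an illustration under loud assumptions: the three-entry `CORPUS` is made up (the function bodies are placeholders, not the real Quake source), the helper names are hypothetical, and a real LLM does not keep a lookup table like this but outputs a probability distribution over tokens. The sketch just counts how many distinct next characters are consistent with the prompt so far.

```python
from collections import Counter

# Hypothetical toy "training data": same comment line, different code after it.
# The bodies are placeholders, not the actual Quake III source.
CORPUS = [
    "// fast inverse square root\nfloat Q_rsqrt(float number) { /* ... */ }",
    "// fast inverse square root\nfloat inv_sqrt(float x) { return 1.0f / sqrtf(x); }",
    "// fast inverse square root\ndouble rsqrt(double x) { return pow(x, -0.5); }",
]


def next_char_counts(corpus, prefix):
    """Count the characters that follow `prefix` in every text starting with it."""
    counts = Counter()
    for text in corpus:
        if text.startswith(prefix) and len(text) > len(prefix):
            counts[text[len(prefix)]] += 1
    return counts


def continue_greedy(corpus, prompt):
    """Extend the prompt one character at a time, always taking the most common
    next character. Also record how many distinct options each step had."""
    text = prompt
    branching = []
    while True:
        counts = next_char_counts(corpus, text)
        if not counts:  # nothing in the corpus continues this text
            break
        branching.append(len(counts))
        text += counts.most_common(1)[0][0]
    return text, branching


# A generic prompt leaves more than one option open at some steps.
_, generic_branching = continue_greedy(CORPUS, "// fast")
print("generic prompt, max options at any step:", max(generic_branching))

# The distinctive "Q_" prefix leaves exactly one option at every step.
specific_text, specific_branching = continue_greedy(
    CORPUS, "// fast inverse square root\nfloat Q_"
)
print("specific prompt, options per step:", set(specific_branching))
print(specific_text)
```

In this toy setup the generation is still step by step, yet the distinctive prefix makes the output identical to the one matching entry, which is why "predicting with only one possible path" and "reproducing the text" end up looking the same from the outside.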
|
This is what people mean when they say it copy-pastes things. It doesn't literally go to the source code, press Ctrl-C and then Ctrl-V it to you; nobody believes that.
And the model does this a lot. As I said, the reason it doesn't do it all the time is that they train it not to. The Quake code example became such a big deal that they started hard-coding it not to return that, but that doesn't mean it never does this for other things, just that this particular example is now "fixed".