by broast
1076 days ago
I don't get where this sentiment comes from. I build software specifically on the concept of predictable results from LLMs being composable. Sure, the results are not deterministic in the sense that the exact same prompt returns the exact same result 100% of the time, but you can tune your prompts so that 100% of the time they give you a valid result in the result category you were seeking, with a specific probability distribution over the available choices. Prompts are functions that take concrete input and produce a probabilistic output that you can automate on. That holds especially if you only need to output one token, e.g. a number, boolean, word, or object reference. And for obvious reasons, the further out you forecast in a sequence, the less accurate you will be. As long as you don't change the underlying model, even in a massive model with billions of parameters, there are definitely mechanisms and behaviors to discover that you can reason about.
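A minimal sketch of the "prompt as a function" idea, under stated assumptions: `make_classifier`, `make_stub_model`, `estimate_distribution`, and `TEMPLATE` are hypothetical names made up for illustration, and the stub stands in for a real LLM call (it cycles through canned answers so the example runs without any API). The wrapper only accepts answers from a fixed set of choices and retries otherwise, which is one way to get a valid result category every time while still treating the output as a distribution.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Dict, Sequence

# A "model" here is anything that maps a prompt string to a completion string.
Model = Callable[[str], str]

# Hypothetical prompt template; a real one would be tuned for the model in use.
TEMPLATE = (
    "Classify the sentiment of this text as one of: {choices}.\n"
    "Text: {text}\n"
    "Answer with one word."
)


def make_classifier(model: Model, choices: Sequence[str], max_retries: int = 3):
    """Wrap a prompt as a function whose result is always one of `choices`."""
    allowed = {c.lower() for c in choices}

    def classify(text: str) -> str:
        prompt = TEMPLATE.format(choices=", ".join(choices), text=text)
        for _ in range(max_retries):
            answer = model(prompt).strip().lower()
            if answer in allowed:
                return answer  # valid category, safe to automate on
        raise ValueError(f"no valid answer after {max_retries} attempts")

    return classify


def make_stub_model() -> Model:
    """Stand-in for a real LLM call (assumption): cycles through canned outputs."""
    outputs = cycle(["positive", "Negative ", "maybe?", "positive"])
    return lambda prompt: next(outputs)


def estimate_distribution(classify, text: str, n: int) -> Dict[str, float]:
    """Call the classifier n times and return the empirical label frequencies."""
    counts = Counter(classify(text) for _ in range(n))
    return {label: count / n for label, count in counts.items()}


if __name__ == "__main__":
    classify = make_classifier(make_stub_model(), ["positive", "negative"])
    print(estimate_distribution(classify, "I love this phone", 300))
```

With a real model in place of the stub, `estimate_distribution` would give an empirical estimate of the output distribution for one fixed prompt and input. It says nothing about how the model behaves on inputs that were never sampled.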
You can't though, that's the issue. Illustrative here are tokens like "SolidGoldMagikarp", but this does happen to "normal" sequences of tokens as well.
There is no filter you can build to keep out such mistakes, any set of otherwise normal tokens could trigger the model to produce wrong output.
Because of how large these models and most prompts are, even slight changes in things like attention can cascade into extremely different results.
> there are definitely mechanisms and behaviors to discover that you can reason about.
It's faerie logic. The behaviours are mere trends and observations, not underlying truth.
The faeries reward you for offering them fruit. But offer them an apple which fell from the tree exactly 74 hours ago, down to the second, and they'll kill you. There is no way to know ahead of time which things will upset them.
The risk here is that you're fooled into believing these systems are understandable, that you know how they work, and that you'll mistakenly use them for something where wrong results have consequences. You'll stop double-checking the output, because all humans are lazy like that, and then you'll have a disaster on your hands.