by TerrifiedMouse
972 days ago
> Is this surprising? Can you point to researchers in the field being “surprised” by LLMs returning sound answers?
It's surprising because it wasn't the intent of LLMs. LLMs are just predictive models that guess the most likely next word. Having the results make sense was never a priority. Early versions, GPT1/2, all returned mostly complete nonsense. It was only with GPT3, when the model got large enough, that it started returning results that are convincing and might even make sense often enough. Even more mind boggling is the fact that randomness is part of its algorithm, i.e. temperature, and that without it the output is kind of meh.
If you took the same amount of data used for GPT3+ but scrambled its tokenization before training, THEN I would agree with you that its current behaviour is surprising, but the model was fed data that has large swaths that are literal question-and-answer constructions. Its overfitting behavior is largely why its parent company is facing so much legal backlash.
> Even more mind boggling is the fact that randomness is part of its algorithm
The randomness applies to token choice at inference time rather than to anything tunable at training time, so it fails to support the "i.e. we don’t really know what happened" sentiment. We do know: we told it to flip a coin, and it did.
> i.e. temperature, and that without it the output is kind of meh.
Both without it and with it. You can turn the temperature up and get bad results, just as you can turn it down and get bad results.
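To make the "flip a coin" point concrete, here is a minimal sketch of temperature sampling. It is an illustration only: `sample_token` and the four toy logits are made up for this comment and are not any real model's API or scores.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick an index from `logits` using temperature-scaled softmax sampling."""
    if temperature == 0:
        # Greedy decoding: no coin flip at all, always the top-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    # The "coin flip": a weighted random draw over the vocabulary.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

# Hypothetical scores for a four-token vocabulary (assumption, not real data).
logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)

greedy = [sample_token(logits, 0, rng) for _ in range(20)]    # always index 0
warm = [sample_token(logits, 1.0, rng) for _ in range(20)]    # mostly index 0, some variety
hot = [sample_token(logits, 100.0, rng) for _ in range(20)]   # close to uniform over all four
```

At temperature 0 the output is fully deterministic, and with a fixed seed the sampled runs are reproducible too. The randomness is an explicit knob applied after the model has produced its scores, not something hidden inside training.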
If adding a single additional dimension to the polynomial of the solution space turned a nondeterministic problem into a deterministic one, then yes, I would agree with you, that would be surprising.