by treyd
899 days ago
If I had to guess, single characters can be encoded as tokens, but more of the model's "bandwidth" goes to handling them, and they carry less "native" semantic meaning than tokens for concrete words. If it decides to, the model can reproduce an unknown sequence by copying over the single-letter tokens, or build a new one from them if that makes sense.
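To make the guess above concrete, here's a rough sketch of how a vocabulary with both whole-word and single-letter tokens can still cover a string it has never seen. The vocabulary, the `tokenize` name, and the greedy longest-match rule are all made up for illustration; real tokenizers (BPE and friends) work differently in the details.

```python
import string

# Hypothetical toy vocabulary: a few whole-word tokens plus every
# lowercase letter as a single-character fallback token.
VOCAB = {"the", "token", "model", "parrot"} | set(string.ascii_lowercase)


def tokenize(text: str) -> list[str]:
    """Greedy longest-match split of `text` into entries of VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until one matches.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary covers this character at all.
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens


print(tokenize("token"))  # ['token']
print(tokenize("zxqv"))   # ['z', 'x', 'q', 'v']
```

A known word comes out as one token, while the unfamiliar string falls back to one token per letter, so copying it into a response is just a matter of emitting those letter tokens again in order.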
It still baffles me why such a stochastic parrot / next-token predictor will recognize these "unseen combinations of tokens" and reuse them in its response.