by kgeist
943 days ago
Just yesterday I saw an example of a person asking GPT what "fluftable" means. The word was invented by their little daughter and they didn't know what it meant. GPT reasoned it was a portmanteau of "fluffy" and "comfortable", and it made sense because it was used in reference to a pillow. If it's just regurgitation, I'd like to know how it's able to understand novel words not found in the training data...
For words that are not in the model's vocabulary, like 'fluftable', the model uses a subword tokenization strategy. It breaks the word down into smaller known subunits (subwords or characters) and represents each subunit with its own vector. From the context in which 'fluftable' appears, and from known words that share similar subunits, the model can infer a plausible meaning for the word. This happens in the vector space where those representations live: what matters is how the subunit vectors align with or differ from the vectors of known words.
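
To make the splitting step concrete, here is a minimal sketch of it in Python. It is only an illustration: the `tokenize` helper and `TOY_VOCAB` are made up for this example, and it uses a simple greedy longest-match rule rather than the learned merge rules (e.g. byte-pair encoding) that production tokenizers typically use, so the pieces a real GPT model sees for 'fluftable' may well differ.

```python
def tokenize(word, vocab):
    """Greedy longest-match split of `word` into pieces found in `vocab`.

    Hypothetical helper for illustration; not any real library's API.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces


# Toy vocabulary, invented for this example (an assumption, not real data).
TOY_VOCAB = {"fluf", "fluff", "table", "able", "comfort", "y"}

print(tokenize("fluftable", TOY_VOCAB))    # ['fluf', 'table']
print(tokenize("fluffy", TOY_VOCAB))       # ['fluff', 'y']
print(tokenize("comfortable", TOY_VOCAB))  # ['comfort', 'able']
```

Even in this toy version the unseen word never reaches the model as one opaque unit. It arrives as pieces that overlap with pieces of known words, and each piece has its own vector.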
'As always, the most important principle for understanding LLMs is that you should resist the temptation of anthropomorphizing them.'