|
|
|
|
|
by svaha1728
946 days ago
|
|
I would read Francois Chollet's explanation of this. It's very good: https://fchollet.substack.com/p/how-i-think-about-llm-prompt... For words that are not in the model's vocabulary, like 'fluftable', the model uses a subword tokenization strategy. It breaks down the word into smaller known subunits (subwords or characters) and represents each subunit with its own vector. By understanding the context in which 'fluftable' appears and comparing it to known words with similar subunits, the model can infer a plausible meaning for the word. This is done by analyzing the vector space in which these representations exist, observing how the vectors align or differ from those of known words. 'As always, the most important principle for understanding LLMs is that you should resist the temptation of anthropomorphizing them.' |
|