Hacker News
by kgeist 943 days ago
Just yesterday I saw an example of a person asking GPT what "fluftable" means. The word was invented by their little daughter and they didn't know what it meant. GPT reasoned it was a portmanteau of "fluffy" and "comfortable", and it made sense because it was used in reference to a pillow. If it's just regurgitation, I'd like to know how it's able to understand novel words not found in the training data...
3 comments

I would read Francois Chollet's explanation of this. It's very good: https://fchollet.substack.com/p/how-i-think-about-llm-prompt...

For words that are not in the model's vocabulary, like 'fluftable', the model uses a subword tokenization strategy. It breaks down the word into smaller known subunits (subwords or characters) and represents each subunit with its own vector. By understanding the context in which 'fluftable' appears and comparing it to known words with similar subunits, the model can infer a plausible meaning for the word. This is done by analyzing the vector space in which these representations exist, observing how the vectors align or differ from those of known words.
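To make the subword idea above concrete, here is a minimal sketch of greedy longest-match tokenization. The vocabulary is entirely made up for illustration (it is not the vocabulary of GPT or any real model), and real tokenizers such as BPE learn their pieces from data rather than using a hand-written set. It only shows how an unseen word can end up sharing pieces with known words.

```python
# Hypothetical toy vocabulary of subword pieces (an assumption, not a real model's).
VOCAB = {"fl", "uff", "uf", "able", "com", "fort", "y"}


def tokenize(word, vocab):
    """Split a word into the longest known pieces, scanning left to right."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first and shrink until one is in the vocabulary.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces


def shared_pieces(a, b, vocab):
    """Return the subword pieces two words have in common."""
    return set(tokenize(a, vocab)) & set(tokenize(b, vocab))


print(tokenize("fluftable", VOCAB))                     # ['fl', 'uf', 't', 'able']
print(shared_pieces("fluftable", "fluffy", VOCAB))      # {'fl'}
print(shared_pieces("fluftable", "comfortable", VOCAB)) # {'able'}
```

In this toy setup the invented word overlaps with both "fluffy" and "comfortable" at the piece level. In a real model each piece would map to a learned vector, and the surrounding context (the pillow) would do most of the work of narrowing down the meaning.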

'As always, the most important principle for understanding LLMs is that you should resist the temptation of anthropomorphizing them.'

I'm sorry, but that's absurd. Being able to explain the precise mechanism behind reasoning would make anything sound like it's not reasoning, because of our prior experiences. If we understood human reasoning well enough to explain exactly what happens in our brain, you would conclude that we're not really reasoning because you can provide an explanation of how we're reasoning about novel, out-of-distribution data. This is "God of the gaps" for thought.
What you've written does nothing to disabuse any reasonable person of the notion that LLMs can reason; if anything, you've explained how LLMs reason, not that they cannot do it.
Isn't 'infer' another word for 'reason'?
Vector math in a 1536-dimensional space?
Because you’re not understanding what it’s regurgitating. It’s not a fact machine that regurgitates knowledge; in fact, it’s not really so good at that. It regurgitates plausible patterns of language, and combining words and such is hardly a rare pattern.
Which is also within the realm of House MD vs. a doctor, potentially even more so.

LLMs are trained on reams of text; good performance here is not unexpected.

To put it another way: would you hire ChatGPT?

For work, you need to have more than text skills.