|
|
|
|
|
by ars
750 days ago
|
|
Do LLM's parse language to understand it, or is entirely pattern matching from training data? i.e. do the programmers teach it English is it 100% from training? Because if they don't teach it English it would need to find some kind of similar pattern in existing text, and then know how to use it to modify responses, and I don't understand how it's able to do that. For example: "Always focus on the key points in my questions to determine my intent." How is it supposed to pattern match from that sentence (i.e. finding it in training data) to the key points in the question? |
|
If you take all the training examples where "focus", "key points", "intent" or other similar words and phrases were mentioned, how are these examples statistically different from otherwise similar examples where these phrases were not mentioned?
That's what LLMs learn. They don't have to understand anything because the people who originally wrote the text used for training did understand, and their understanding affected the sequence of words they wrote in response.
LLMs just pick up on the external effects (i.e the sequence of words) of peoples' understanding. That's enough to generate text that contains similar statistical differences.
It's like training a model on public transport data to predict journeys. If day of week is provided as part of the training data, it will pick up on the differences between the kinds of journeys people make on weekdays vs weekends. It doesn't have to understand what going to work or having a day off means in human society.