| > Do LLMs parse language to understand it, or is it entirely pattern matching from training data? The real answer is neither, if "understand" and "pattern match" mean what they mean to an average programmer. > For example: "Always focus on the key points in my questions to determine my intent." How is it supposed to pattern match from that sentence (i.e. finding it in training data) to the key points in the question? A Markov chain knows that certain words are more likely to appear after "key points" and outputs those words. However, an LLM is not a Markov chain. It also knows that certain word combinations are more likely to appear before and after "key points". It also knows that other word combinations are more likely to appear before and after those word combinations. It also knows that still other word combinations are... This "understanding" works recursively. (It's still quite a simplistic view of it, but much better than the "an LLM is just a very computationally expensive Markov chain" view, which you will see multiple times in this thread.) |
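To make the Markov chain half of that comparison concrete, here is a minimal sketch of a first-order (bigram) chain. The toy corpus and the helper names `train_bigram` and `most_likely_next` are made up for illustration and are not from any library. The point is only that the prediction depends on the single preceding word and nothing else, which is exactly the limitation an LLM does not have.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each word, which words follow it in the token list."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

# Hypothetical toy corpus, already split into tokens.
corpus = (
    "focus on the key points . "
    "the key points are clear . "
    "the key idea is simple"
).split()

model = train_bigram(corpus)

# The chain only looks at the previous word. Whatever came before "key"
# (a question, an instruction, a different topic) has no effect on the result.
print(most_likely_next(model, "key"))
print(most_likely_next(model, "the"))
print(most_likely_next(model, "intent"))
```

In this sketch, asking for the word after "key" gives the same answer no matter what sentence it appeared in, because the table is keyed on one word. Widening the key to two or three previous words helps a little, but it still cannot relate "key points" to a question that appears much later in the prompt.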