> Was it? I've only heard of pre-training (predict next word) and subsequent RLHF + SFT "alignment" (incl. aligning to goal of being conversational). AFAIK the NLP skills that these LLMs achieve are all emergent rather than explicitly trained.

I believe you are right about that. I did some research after reading your comment. Transformers were certainly designed for NLP, but with large enough models the abilities can emerge without the model being explicitly trained for them.

> I'm not sure we can really say the net fully understands even if it answers as if it does - it was only trained to "predict next word", which in effect means being trained to generate a human-like response.

It depends on your definition of "understand". If that requires consciousness then there is no universally agreed formal definition. Natural Language Understanding (NLU) is a subset of Natural Language Processing (NLP). If we take the word "understanding" as used in an academic and technical context then yes, they do understand quite well. In order to simply "predict the next word" they learn an abstract model of syntax, semantics, meaning, relationships, etc., from the text.

> and has no idea if the ideas it is expressing are true or not (hence all the hallucinating/bullshitting).

That is not really an issue when solving tasks that are within its context window. It is an issue for factual recall. The model is not a type of database that stores its training set verbatim. Humans have analogous problems with long-term memory recall. I can think straight within my working memory but my brain will "hallucinate" to some extent when recalling distant memories.
If you ask it a question where the training data (or the input data, i.e. the context) either didn't include the answer, or where it was not obvious how to get the right answer, that will not (unfortunately) stop it from confidently answering!
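
To make that concrete, here is a toy sketch of greedy next-token decoding. It is not how any particular LLM is implemented, and the vocabulary and logit values are made up purely for illustration. The point is only that a softmax over the vocabulary always yields a winner, so the decoding step emits a token whether the scores are sharply peaked or nearly flat.

```python
import math

def softmax(logits):
    """Turn arbitrary scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_next_token(logits, vocab):
    """Return the most probable token and its probability.

    Note there is no built-in "I don't know" outcome: some token always wins.
    """
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]

# Hypothetical four-word vocabulary (assumption, for illustration only).
vocab = ["Paris", "London", "Berlin", "Rome"]

# Made-up logits for a prompt the model has strong evidence about.
well_supported = [6.0, 1.0, 0.5, 0.2]
# Made-up logits for a prompt where it has almost nothing to go on.
unsupported = [0.3, 0.2, 0.25, 0.1]

print(greedy_next_token(well_supported, vocab))
print(greedy_next_token(unsupported, vocab))
```

Both calls return a token. The second one wins with a much lower probability, but unless something downstream inspects that probability, the generated text reads just as fluent and confident as the first.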