| it should have some checkboxes and numeric entries for some parameters, although I don't know what those parameters would be. The only params they have are technical params. You may see these in various tgwebui tabs. Nothing really breathtaking, apart from temperature (it reshapes the next-token probabilities before sampling). Is generating natural language part of what an LLM is, or is this a separate program on top of what it does? They operate directly on tokens, which are [parts of] words, more or less. Although there’s a nuance with embeddings and VAE, which would be interesting to learn more about from someone in the field (not me). That is important to understand before we can even consider whether "logical reasoning" is happening. There are formal ways to describe reasoning such as entailment. Is the LLM encoding those formal methods in data structures somehow? Apart from the matrix operations on the GPU, everything is known; there’s nothing to investigate at the tech level because there’s nothing like that there at all. At the in-matrix level it can “happen”, but this is just a meaningless stretch, as inference is basically a one-pass process, without loops or backtracking. Every token gets produced in a fixed time, so there’s no delay like the one a human makes before a comma to think about (or in parallel with) the next sentence. So if they “reason”, this is purely a similar effect imagined as a thought process, not a real thought process. But if you relax your anthropocentrism a little, questions like that start making sense, although regular things may stop making sense there as well. For example, the fixed token time paradox may be explained as “not all thinking/reasoning entities must do so in physical time, or in time at all”. But that will probably pull the rug out from under everything in the thread and lead nowhere. Maybe that’s the way.
I can't really tell what part of it is the actual technology and what part of it is sugarcoating to make a technical program more "friendly" to users by having it pretend to speak English. Most of them speak many languages, naturally (try it). But there’s an obvious lie all frontends practice. It’s the “chat” part. LLMs aren’t things that “see” your messages. They aren’t characters either. They are document continuators, and usually the document looks like this: This is a conversation between A and B. A is a helpful assistant that thinks outside the box, while being politically correct, and evasive about suicide methods and bombs. A: How can I help? B: An LLM can produce the next token, and when run in a loop it will happily generate a whole conversation, both for A and B, token by token. The trick is to just break that loop when it generates /^B:/ and allow a user to “participate” in building this strange conversation transcript. So there’s no “it” who writes replies, no “character” and no “chat”. It’s only the next token in some document, which may be a chat transcript, a movie plot draft, or a reference manual. I sometimes use LLMs in “notebook” mode, where I just write text and let it complete it, without any chat or “helpful assistant”. It’s just less efficient for some models, which benefit from special chat-like and prompt-like formatting before you get the results. But that is almost purely a technical detail. |
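To make sure I follow the "break the loop at /^B:/" idea from the quote, here is a minimal sketch of it in Python. Everything in it is made up for illustration: `make_toy_model`, `CANNED` and `continue_until_user_turn` are hypothetical names, and the "model" just replays a canned list of tokens, where a real LLM would pick each next token from probabilities computed over the whole document so far.

```python
import re

# Hypothetical canned continuation. Note that it happily writes B's line too.
CANNED = [" Paris", ".", "\n", "B:", " Thanks", "!"]

def make_toy_model(tokens):
    """Return a next_token(document) function that replays `tokens` in order."""
    it = iter(tokens)
    def next_token(document):
        # A real model would look at `document`; this stand-in ignores it.
        return next(it, "")
    return next_token

def continue_until_user_turn(document, next_token, max_tokens=100):
    """Extend the document token by token, stopping where B's line would start."""
    stop = re.compile(r"^B:", re.MULTILINE)
    generated = ""
    for _ in range(max_tokens):
        tok = next_token(document + generated)
        if tok == "":
            break  # the toy model ran out of tokens
        generated += tok
        match = stop.search(generated)
        if match:
            # Cut off "B:" and hand control back to the human user.
            generated = generated[:match.start()]
            break
    return generated

document = (
    "This is a conversation between A and B.\n"
    "A: How can I help?\n"
    "B: What is the capital of France?\n"
    "A:"
)
reply = continue_until_user_turn(document, make_toy_model(CANNED))
print(repr(reply))  # ' Paris.\n'
```

If I read the quote correctly, the frontend would then append the reply plus the user's next "B: ..." line to the document and run the same loop again, so the "chat" is just one growing text file.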
I have heard about the tokenization process before when I tried Stable Diffusion, but honestly I can't understand it. It sounds important but it also sounds like a very superficial layer whose only purpose is to remove ambiguity, the important work being done by the next layer in the process.
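As far as I can piece it together, tokenization just turns text into a list of integer IDs, and those IDs are all the model ever sees. Below is a toy sketch of that step under heavy assumptions: the `VOCAB` table and the `tokenize` function are invented for illustration, and the greedy longest-match rule is a simplification. Real tokenizers use vocabularies with tens of thousands of entries learned from data (byte-pair encoding is one common scheme), not a hand-written table.

```python
# Made-up vocabulary mapping pieces of text to integer IDs.
VOCAB = {"the": 0, " mona": 1, " lisa": 2, " is": 3, " a": 4, " paint": 5, "ing": 6}

def tokenize(text, vocab):
    """Split `text` into known pieces, always taking the longest match, and return their IDs."""
    max_len = max(len(piece) for piece in vocab)
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                ids.append(vocab[piece])
                i += length
                break
        else:
            # No piece of any length matched at position i.
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return ids

print(tokenize("the mona lisa is a painting", VOCAB))  # [0, 1, 2, 3, 4, 5, 6]
```

Note how "painting" comes out as two tokens (" paint" and "ing"), which I take to be what "[parts of] words" means in the quote above.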
I believe part of the problem I have when discussing "AI" is that it's just not clear to me what "AI" is. There is a thing called "LLM," but when we talk about LLMs, are we talking about the concept in general or merely specific applications of the concept?
For example, in SEO you often hear the term "search engines" being used as a generic descriptor, but in practice we all know it's only about Google, and nobody cares about Bing or the rest of the search engines nobody uses. Maybe they care a bit about AIs like Perplexity that are trying to replace traditional search engines, but that's about it. Similarly, if you talk about CMSs, chances are you are talking about WordPress.
Am I right to assume that when people say "LLM" they really mean just ChatGPT/Copilot, Bard/Gemini, and now DeepSeek?
Are all these chatbots just locally run versions of ChatGPT, or are they just paying for ChatGPT as a service? It's hard to imagine everyone is rolling their own "LLM", so I guess most jobs related to this field are merely about integrating with existing models rather than developing your own from scratch?
I had a feeling ChatGPT's "chat" would work like a text predictor as you said, but what I really wish I knew is whether you can say that about ALL LLMs. Because if that's true, then I don't think they are reasoning about anything. If, however, there was a way to make use of the LLM technology to tokenize formal logic, then that would be a different story. But if there is no attempt at this, then it's not the LLM doing the reasoning, it's humans who wrote the text that the LLM was trained on that did the reasoning, and the LLM is just parroting them without understanding what reasoning even is.
By the way, I find it interesting that "chat" is probably one of the most problematic applications LLMs can have. For example, if ChatGPT asked "what do you want me to autocomplete" instead of "how can I help you today", people would type "the mona lisa is" instead of "what is the mona lisa?".