| It doesn't matter how big it is, its properties don't change. E.g., it never says, "I like what you're wearing" because it likes what I'm wearing. It seems there's an entire generation of people taken in by this word, "complexity", and it's just magic sauce that gets sprinkled over ad copy for big tech. We know what it means to compute P(word|words), we know what it means that P("the sun is hot") > P("the sun is cold") ... and we know that by computing this, you aren't actually modelling the temperature of the sun. It's just so disheartening how everyone becomes so anthropomorphically credulous here... can we not even get sun worship out of tech? Is it not possible for people to understand that conditional probability structures do not model mental states? No model of conditional probabilities over text tokens, no matter how many text tokens it models, ever says, "the weather is nice in August" because it means the weather is nice in August. It has never been in an August, or in weather; nor does it have the mental states for preference or desire; nor has its text generation been caused by the August weather. This is extremely obvious: simply reflect on why the people who wrote those historical texts did so, and reflect on why an LLM generates this text, and you can see that even if an LLM produced MLK's "I Have a Dream" speech word for word, it does not have a dream. It has not suffered any oppression; nor organised any labour; nor made demands on the moral conscience of the public. This shouldn't need to be said to a crowd who can presumably understand what it means to take a distribution of text tokens and subset them. It doesn't matter how complex the weight structure of an NN is: this tells you only how compressed the conditional probability distribution is over many TBs of all of text history. |
Literally all of the pushback you're getting is because you're trivializing the choice of model architecture, claiming that it's all so obvious and simple and it's all the same thing in the end.
Yes, of course, these models have to be well-suited to run on our computers, in this case GPUs. And sure, it's an interesting perspective that maybe they work well because they are well-suited for GPUs and not because they have some deep fundamental meaning. But you can't act like everyone who doesn't agree with your perspective is just an AI hypebeast con artist.