by RugnirViking
1176 days ago
We do understand that they generate words one at a time from probabilities, though. They are transformers: the AI bit takes the entire prompt as input and returns a probability for every possible next token (a token is roughly a word or a piece of a word; the model doesn't see individual letters). A "dumb" algorithm then picks randomly from those probabilities, skewed by the "temperature" setting: low temperature favors high-probability tokens, while high temperature selects from a much wider range of tokens but leads into really weird territory fast.

It's also worth remembering that this is not just a Markov chain, as many people seem to think. It doesn't simply remember which words come next in its training set, because statistically, if you take 10 random consecutive words from a piece of text, chances are that exact sequence has never been written before. (Also, the trained model is much smaller than the dataset, so to simply remember everything it would have to be the world's best compression algorithm by orders of magnitude.) That's why we need an AI here: to learn the general rules of language so it can respond to chains of words it has never seen before.

The sense in which we "don't understand how they work" is that we don't know what the "rules of language" it has learned are.
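To make the "dumb algorithm" part concrete, here's a rough sketch of temperature sampling in Python. Everything in it is made up for illustration: the four-token vocabulary, the scores, and the function names are hypothetical, and a real model has tens of thousands of tokens. It also assumes the usual setup where the model hands back raw scores (logits) and the temperature divides those scores before they're turned into probabilities, rather than being applied to finished probabilities.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, rescaling the scores by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature, rng=random):
    """Pick one token index at random, weighted by the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores a model might assign to a tiny 4-token vocabulary.
vocab = ["cat", "dog", "fish", "xylophone"]
logits = [4.0, 3.0, 1.0, -2.0]

for t in (0.2, 1.0, 5.0):
    probs = softmax_with_temperature(logits, t)
    picked = vocab[sample_next_token(logits, t)]
    print(f"temperature={t}: probs={[round(p, 3) for p in probs]} picked={picked}")
```

At low temperature nearly all of the probability piles onto the top-scoring token, so the output is close to deterministic. At high temperature the distribution flattens and unlikely tokens like "xylophone" start getting picked, which is the "weird territory" effect.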
|
This is no more helpful in understanding AIs than knowing that human brains operate according to the laws of physics is helpful in understanding the human mind.