Probabilistic and Bayesian are not the same thing. Moreover, GPT, the deep-learning model, is not itself a probabilistic next-token chooser. You can envision many different ways to choose the next word based on GPT's output. OpenAI's API for GPT is a probabilistic word chooser paired with GPT, but GPT is the model. It generates a probability distribution over the next word, not by a Bayesian process but by something entirely different. GPT takes a vector-space representation of a sentence, projects it onto some internal space (we'll call it GPTThink), and then re-projects that onto a new vector space. Then it uses softmax to turn the resulting vector into a probability distribution. That's not a Bayesian process.
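To make the model/chooser split concrete, here's a minimal sketch. The four-word vocabulary and the raw scores are made up for illustration, and none of this is GPT's real code or OpenAI's actual sampling logic. It only shows that softmax is a deterministic function of the scores, and that the "chooser" bolted on afterwards can be deterministic (greedy) or random (sampling).

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def choose_greedy(probs):
    """Deterministic chooser: always pick the most likely token."""
    return max(range(len(probs)), key=lambda i: probs[i])

def choose_sampled(probs, rng):
    """Probabilistic chooser: draw a token index in proportion to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical model output: one raw score per vocabulary entry.
vocab = ["cat", "dog", "fish", "bird"]
logits = [2.0, 1.0, 0.1, -1.0]

probs = softmax(logits)  # same scores in, same distribution out
greedy_word = vocab[choose_greedy(probs)]
sampled_word = vocab[choose_sampled(probs, random.Random(0))]
```

Same `probs`, two different choosers. No prior is updated with evidence anywhere in here; softmax just exponentiates and normalizes the scores.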