by pcwelder
202 days ago
In RNNs and Transformers we obtain the probability distribution of the target variable directly and sample from it using methods like top-k or temperature sampling. I don't see the equivalence to MCMC. It's not like we have a complex probability function that we are trying to sample from using a chain. It's just logistic regression at each step.
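To make the "sample directly" point concrete, here is a rough sketch of temperature plus top-k sampling over one step's logits. The function name `sample_next_token` and its parameters are made up for illustration, and it uses only the Python standard library rather than any real framework's API.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Draw one token index from raw logits (hypothetical helper, not a library API)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Rank token indices by logit, highest first, and optionally keep the top_k.
    indices = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        indices = indices[:top_k]
    # Softmax over the surviving logits, scaled by temperature.
    scaled = [logits[i] / temperature for i in indices]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights itself, so no explicit division is needed.
    return rng.choices(indices, weights=weights, k=1)[0]
```

With `top_k=1` this reduces to greedy decoding; lower temperatures sharpen the distribution and higher ones flatten it. Note that each draw is exact: there is no burn-in or accept/reject step of the kind an MCMC sampler would need.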
|
Every response from an LLM is essentially the sampling of a Markov chain.
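
One way to read this claim, as a toy sketch: treat the whole token sequence so far as the state, and each generation step as a transition that depends only on that state. Everything here is assumed for illustration, including the three-token `VOCAB` and the `toy_next_token_probs` rule, which stands in for a real model's forward pass.

```python
import random

VOCAB = ["a", "b", "<eos>"]  # hypothetical toy vocabulary


def toy_next_token_probs(state):
    """Stand-in for an LLM forward pass: maps the current state to probabilities."""
    # Made-up rule: the longer the sequence, the more likely it is to end.
    p_eos = min(1.0, 0.2 * len(state))
    rest = (1.0 - p_eos) / 2
    return [rest, rest, p_eos]


def step(state, rng):
    """One Markov transition: the next state depends only on the current state."""
    token = rng.choices(VOCAB, weights=toy_next_token_probs(state), k=1)[0]
    return state + (token,)


def generate(prompt, rng, max_steps=20):
    """Run the chain from the prompt until <eos> or max_steps transitions."""
    state = tuple(prompt)
    for _ in range(max_steps):
        state = step(state, rng)
        if state[-1] == "<eos>":
            break
    return state
```

Calling `generate(["a"], random.Random(0))` walks the chain once and returns one trajectory. This shows the Markov property of autoregressive decoding; it is forward sampling of a chain, which is not the same thing as using a chain to approximate a target distribution as in MCMC.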