Hacker News
by Mithriil 87 days ago
A Bayesian network is a really general concept. It applies to any multidimensional probability distribution. It's a graph that encodes conditional independence between variables. Ish.
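
To make the "graph encodes independence" part concrete, here is a minimal sketch with a three-variable chain A -> B -> C. The variables and all the probability values are made up for illustration; nothing here comes from the paper. The missing edge between A and C is what says "C does not depend on A once B is known", and the code checks that against the joint table.

```python
from itertools import product

# Hypothetical numbers for a three-variable chain A -> B -> C (all binary).
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # p_c_given_b[b][c]

# The graph says the joint factorizes as P(a) * P(b | a) * P(c | b).
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product((0, 1), repeat=3)
}

def prob_c_given_ab(c, a, b):
    """P(C=c | A=a, B=b), computed from the joint table alone."""
    return joint[(a, b, c)] / sum(joint[(a, b, k)] for k in (0, 1))

# No edge A -> C: conditioning on A changes nothing once B is fixed.
max_gap = max(
    abs(prob_c_given_ab(c, a, b) - p_c_given_b[b][c])
    for a, b, c in product((0, 1), repeat=3)
)
total = sum(joint.values())
print(f"joint sums to {total:.6f}, largest gap {max_gap:.2e}")
```

The generality comes from the chain rule: any joint distribution can be written as a product of conditionals, which corresponds to a fully connected graph. The interesting networks are the ones where edges can be dropped.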

I have not taken the time to review the paper, but if the claim stands, it means we might have another tool in our toolbox to better understand transformers.