The Bayesian Geometry of Transformer Attention (arxiv.org)
4 points by samwillis 162 days ago
1 comment

Higher-level overview and links to the other related papers: https://medium.com/@vishalmisra/attention-is-bayesian-infere...