Hacker News
by tworats 1123 days ago
The Illustrated Transformer ( https://jalammar.github.io/illustrated-transformer/ ) and Visualizing attention ( https://towardsdatascience.com/deconstructing-bert-part-2-vi... ) are both really good resources. For a more ELI5 approach, this non-technical explainer ( https://www.parand.com/a-non-technical-explanation-of-chatgp... ) covers it at a high level.