Hacker News
by godelski 1429 days ago
If you're truly starting from scratch, these might be of more use to you. The second one focuses on vision transformers, but all the concepts still apply.

https://jalammar.github.io/illustrated-transformer/

https://medium.com/pytorch/training-compact-transformers-fro...