Hacker News
by _giorgio_ 912 days ago
> the paper that was most helpful for me was [Formal Algorithms for Transformers](https://arxiv.org/abs/2207.09238)

Interesting, but hard to read, since it uses quite unusual notation for matrix indexing and multiplication. Why???