Hacker News
by isaacfung 1158 days ago
Not the guy you asked, but these are often recommended.

https://jalammar.github.io/illustrated-transformer/

https://nlp.seas.harvard.edu/2018/04/03/attention.html

Before going and digging into these, could you also explain what the necessary background is for this stuff to be meaningful?

In spite of having done a decent amount with neural networks, I'm a bit lost as to how we suddenly got to what we're seeing now. It would be really helpful to understand the progression of things, because I stepped away from this stuff for maybe 2 years and we seem to have crossed an ocean in the intervening time.

I am the guy who was asked, and I endorse this guy's endorsements.