by ryan-duve 713 days ago
I gave a talk on using Google BERT for financial services problems at a machine learning conference in early 2019. During my preparation, this was the only resource on transformers I could find that was even remotely understandable to me.

I had a lot of trouble understanding what was going on from just the original publication[0].

[0] https://arxiv.org/abs/1706.03762

2 comments

Maybe it's easier to understand in the format of annotated code

https://nlp.seas.harvard.edu/2018/04/03/attention.html

Updated (in case above link goes away) - https://nlp.seas.harvard.edu/annotated-transformer/

Thanks for the original!

Maybe I'm dumb, but I still can't make much sense of this.