Hacker News
by lexpar 2422 days ago
Can anyone recommend a good resource on what BERT does? I gather it was trained to predict missing words in a sentence, but I don't have an intuition for why that is useful for downstream prediction, the way I do for, say, learning a word embedding.
1 comment

I appreciated this blog post: http://jalammar.github.io/illustrated-bert/
Thanks!