Hacker News
by mark_l_watson 2486 days ago
Good explanation of transformers, and the history leading up to them. I look forward to the next installment covering BERT.

As someone who spent a lot of time hand-coding solutions to anaphora resolution (pronoun coreference), I found BERT to be a small miracle. As a side comment: I love that getting training data for BERT is so cheap: take any text source, randomly mask out words, and use the masked words as the prediction targets.
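
To make the "cheap training data" point concrete, here is a rough sketch of turning raw text into a masked-word training example. It is only an illustration, not BERT's actual preprocessing: the function name `make_masked_example` and its parameters are made up for this example, it splits on whitespace instead of using WordPiece tokens, and it skips BERT's refinement of sometimes substituting a random word or leaving the chosen word unchanged. The 15% default mirrors the masking rate BERT uses.

```python
import random

MASK = "[MASK]"  # the mask token string used in BERT's vocabulary


def make_masked_example(text, mask_prob=0.15, seed=None):
    """Hypothetical helper: build one (input, targets) pair from raw text.

    Returns the token list with some positions replaced by MASK, plus a
    dict mapping each masked position to the original word to predict.
    """
    rng = random.Random(seed)
    tokens = text.split()  # simplification: whitespace split, not WordPiece
    masked = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK   # hide the word in the model input
            targets[i] = tok   # remember it as the label for this position
    return masked, targets


if __name__ == "__main__":
    sentence = "the cat sat on the mat because it was tired"
    model_input, labels = make_masked_example(sentence, mask_prob=0.3, seed=1)
    print(" ".join(model_input))
    print(labels)
```

No human labeling is involved at any point: the labels are just the words that were hidden, which is why any plain text corpus works as training data.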