by minimaxir 1842 days ago
There is a Colab that you can run to set up the model with TPUs: https://colab.research.google.com/github/kingoflolz/mesh-tra...

A few demo examples: https://twitter.com/minimaxir/status/1402468460681068544

1 comment

Interesting. It seems to get stuck on a sentence or word a lot: half the output then either repeats a slightly varied sentence or rambles about that unrelated word.
That's one thing that is very consistent across Transformer models, even GPT-3.
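
For anyone who wants to put a rough number on this kind of looping, here is a minimal sketch. It is my own illustration, not something from the Colab or the model's code: it reports the share of word n-grams in an output that already appeared earlier in the same text. The function name `repeated_ngram_fraction` and the default `n=3` are assumptions picked for the example.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    0.0 means no n-gram occurs twice; values near 1.0 mean the text is
    mostly looping over the same phrases.
    """
    # Naive whitespace tokenization (an assumption; a real tokenizer may differ).
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)


if __name__ == "__main__":
    sample = "the cat sat the cat sat the cat sat"
    print(round(repeated_ngram_fraction(sample), 3))
```

Comparing this score across sampling settings is one crude way to check whether a change actually reduces the repetition, though it says nothing about whether the text stays on topic.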