Hacker News
by
minimaxir
1842 days ago
There is a Colab that you can run to set up the model with TPUs:
https://colab.research.google.com/github/kingoflolz/mesh-tra...
A few demo examples:
https://twitter.com/minimaxir/status/1402468460681068544
Firerouge
1842 days ago
It's interesting: it seems to get stuck on a sentence or word a lot, and then half the output is either a slightly varied repetition of that sentence or rambling about that unrelated word.
minimaxir
1842 days ago
That's the one thing that's very consistent across Transformer models, even GPT-3.