Hacker News
by wholehog 552 days ago
I think the pre-trained checkpoint uses the same 20 TPU blocks as the original paper, but it probably isn't the exact same checkpoint, since the paper itself is from 2020/2021.