by sytelus 2684 days ago
CommonCrawl already has an open, petabyte-scale dataset available on AWS. Even if it didn't exist, scraping 80GB of data on AWS is trivial. I am surprised the authors considered this such a big deal. Also notice that performance is nowhere near human level. It sort of works, and it's astonishing that it does, but there is a long way to go before we have to fear weaponized text generation.
1 comment

I think the big deal is the size of the model: BERT-large is about 340M params, and this one is 1.5B. BERT was trained on a pod with 64 TPU chips, and this model requires an even larger GPU/TPU cluster. There is no way an underfunded indie researcher can train such a model.
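
To put rough numbers on the size argument, here is a back-of-the-envelope sketch of training memory. It assumes plain fp32 training with Adam (16 bytes per parameter: weights, gradients, and two optimizer moments), ignores activations entirely, and takes BERT-large at about 340M parameters, the figure reported in the BERT paper. The function name `training_memory_gb` is made up for illustration, and nothing here reflects how either model was actually trained.

```python
def training_memory_gb(n_params, bytes_per_param=16):
    """Rough memory for weights + gradients + Adam state in fp32.

    16 bytes/param = 4 (weights) + 4 (gradients) + 8 (Adam first and
    second moments). Activations are ignored, so real usage is higher.
    This is an assumed setup, not the authors' actual configuration.
    """
    return n_params * bytes_per_param / 1e9


for name, n in [("BERT-large", 340e6), ("1.5B model", 1.5e9)]:
    print(f"{name}: ~{training_memory_gb(n):.1f} GB before activations")
```

Under these assumptions the 1.5B model needs about 24 GB for parameters and optimizer state alone, versus about 5.4 GB for BERT-large, and activation memory and training time come on top of that.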