The neural net model is condensed to 800 GB.
https://www.springboard.com/blog/data-science/machine-learni...
Note that the "compression" there also includes the "intelligence" that it exhibits. You might be able to get some powerful compression of English text with a conventional compressor, but you can't ask a gzip file to come up with a joke about cats and dinosaurs.
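To make the gzip half of that comparison concrete, here is a minimal sketch using Python's standard `gzip` module. The sample sentence and repeat count are made up for illustration, and because the sample is highly repetitive it compresses far better than real English prose would.

```python
import gzip

# Hypothetical sample: one sentence repeated many times (not real-world prose).
text = ("The quick brown fox jumps over the lazy dog. " * 200).encode("utf-8")

compressed = gzip.compress(text)
restored = gzip.decompress(compressed)

# gzip is lossless: decompression returns exactly the bytes that went in.
ratio = len(text) / len(compressed)
print(f"{len(text)} bytes -> {len(compressed)} bytes (ratio {ratio:.1f}x)")
```

All you can get back out is the original bytes; there is nothing to query or prompt.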
A typical single-spaced page is about 500 words long.
At that rate, that’s 179,280,000 full pages of text.
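As a quick sanity check on that arithmetic, the two figures above can be multiplied back out to see what word count they imply. This is only a sketch: the variable names are mine, and the total is derived from the quoted numbers, not taken from any source.

```python
WORDS_PER_PAGE = 500      # page length quoted above
PAGES = 179_280_000       # page count quoted above

# Word count implied by those two figures.
implied_words = PAGES * WORDS_PER_PAGE
print(f"{implied_words:,} words")  # 89,640,000,000 words
```

So the page count corresponds to roughly 89.6 billion words at 500 words per page.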
I wonder if they excluded any duplicated text.
I’ve only worked on image classifiers and object detectors, so I was assuming these models must be trained on similarly pure datasets.