by dig6x
2179 days ago
"...600 billion parameters using automatic sharding. We demonstrate that such a giant model can efficiently be trained on 2048 TPU v3 accelerators in 4 days to achieve far superior quality for translation from 100 languages to English compared to the prior art." It does appear that at the initial, resource intensive stages of tech like NLP big tech is primed to pave the way. We saw this happen across cloud, AI more generally, storage etc. But big tech then begins focusing on making the tech accessible to industry value chains (Azure, AWS, Amazon's AI services etc.). But as the industry matures there's more room for specialized startups/companies to enter the space to capture lucrative niches - thats exactly what Snowflake did for Cloud. Definitely see this kind of scale as a step toward a more robust, mature industry if anything. Better it move forward than not. |