


	by 7e 17 days ago
	2 PB? They will not come close to training on that amount. Maybe years from now.

2 comments

sgt 17 days ago

I think they will not train on the full 2 PB, but will use it as the data lake to start from and then apply a more targeted approach.


winddude 17 days ago

If you read the article, 2 PB is available as flash storage in the data pipeline, used to dedupe, clean, normalize, etc., for training from 60 PB of raw data.


Den_VR 17 days ago

Could probably LoRA with that
