by sneak
339 days ago
> Some companies, like Meta, went even further. They didn’t just use web content; they also used pirated books to train their models. (The Unbelievable Scale of AI’s Pirated-Books Problem) If a regular person did this, they’d probably get into serious trouble. But when billion-dollar companies do it, they usually get away with it.

Someone should tell Anna’s Archive.

US criminal enforcement very much falls into the “rules for thee, but not for me” category, but invoking it here is a trope. Anyone can get away with piracy on the scale of Books3 or The Pile. The reason random people don’t make models is that the hardware and power costs are fucking astronomical, not that they can’t get away with downloading the training data.

These sorts of hot takes are just as wrong as the breathless “AGI is right around the corner” ones. AI is hugely transformative, and anyone who thinks it’s overhyped doesn’t know the SOTA. It will likely be the single biggest technological advancement of our lifetime.