by moffkalast 1098 days ago
The part that says you shouldn't take outputs from their models to build datasets for training competitor models.

Outputs from models that they trained on stolen ebooks, unpaid Reddit data, data scraped from millions of websites without credit, etc. Sort of like stealing a bike and then getting mad that it got stolen again later, because it was clearly rightfully yours.

https://i.pinimg.com/originals/d7/72/22/d77222df469b50e3b4cd...


I get your point but your analogy doesn’t quite work.
Yeah, it's more like stealing a million bikes, putting all the parts into a pile, and custom-assembling them on request.
Still not exactly right. Stealing bikes deprives owners of them, while scraping data doesn’t.
How about torrenting the entirety of the world's filmography, using that content to make clip compilations on YouTube, then issuing copyright strikes against, and demonetizing, videos that contain those clips?

In a sense, it's almost patent trolling.