Hacker News
by MrBurrito 1258 days ago
I disagree (but also, IANAL). They use the private images to train their models. That means each image is part of a data set, and that data set is subject to licensing. The image, however, is part of your private collection, so the license, aside from the wording on the privacy page, is sketchy at best. It will also collide with GDPR and CCPA rules on data processing. Data models are subject to licensing too, and using images without proper licenses “poisons the well”.

Usually the copyright on a data set is held by the person who accumulated the data, but mere compilation is not enough to bestow copyright. The selection and arrangement themselves need to be sufficiently creative or original.

Putting all your photos in a folder does create a dataset, but doesn't meet the threshold of creative input for that dataset to be copyrightable as a compilation.

Now, you do hold the copyright on all of your photos. This means that you can restrict their transmission to others. You can even transmit them to someone under a contract that bars them from engaging in a particular activity, like training an AI.

But if they proceed to train an AI using them, they've committed a breach of contract, not copyright infringement. I'm not a lawyer myself, but an SME who works with one, and I'll tell you that to grant relief for breach of contract, a judge is going to look to that contract to calculate damages. If your contract says that something is a breach but has no language that helps a judge calculate damages, you shouldn't assume you'll get any relief beyond, perhaps, an injunction.

And I'm afraid my knowledge is very US-specific.