by throwup238 667 days ago
> The court declined to dismiss copyright infringement claims against the AI companies.

That "major win" is just being allowed to proceed with the case at all. All they've done is clear the first hurdle, the one meant to kill frivolous lawsuits before they get to discovery. Their other claims were dismissed:

> Claims against the companies for breach of contract and unjust enrichment, plus violations of the Digital Millennium Copyright Act for removal of information identifying intellectual property, were dismissed. The case will move forward to discovery, where the artists could uncover information related to the way in which the AI firms harvested copyrighted materials that were then used to train large language models.


Gotta love headlines clearly altered by someone other than the writer. The lede literally says:

> Artists suing generative artificial intelligence art generators have cleared a major hurdle

Not "major win".

I'm very excited for discovery.
Didn't the Enron dataset that's now part of the Pile become public during discovery too? Some great image datasets might drop.
IANAL, but documents don't become public during discovery; they only become public if they're filed with the court (unless they're sealed). The vast majority of information dredged up during discovery remains confidential.
But things like datasets are massive, and structure is important. Do they retain them digitally with the original structure, or do they transform them into some kind of massive PDF?
If the experts are playing hardball, then transforming anything and everything into PDFs is an effective tactic.
No. See the various state rules of civil procedure concerning the presumptive and requested form of production of electronically stored information. An example is Ariz. R. Civ. P. 26.1(c)(3).
Relatedly, I worked at a company that had an information-sharing agreement with a competitor, forced on it by a standards body. One of the requirements was that documentation had to be shared.

Unfortunately, our documentation was very well formatted, with links, and searchable, making it easy to navigate. So in an act of malicious compliance, the few-thousand-page document was printed, then scanned into a low-res, JPEG-artifact-filled, crooked, but still legible set of images that were shared as a fairly useless PDF.

The dataset is already public. That's the only reason they were able to file this time-wasting lawsuit anyway.
Why do you think it is time-wasting? Is it because of the wasted time of all the artists who went to the bother of producing art that can now be approximated at the press of a button?
It's a waste of time because the majority of their claims were poorly constructed, disingenuous, and subsequently thrown out.

All this has done is incentivize AI research companies to be even more closed and opaque.

And that will get them more lawsuits until some judge has had enough of their tricks and forces something drastic.
Will the plaintiffs get relief similar to what IP holders got from Megaupload, I wonder?