Hacker News
by Animats 497 days ago
This isn't really about "AI". It's about copying summaries. Google was fined for this in France for copying news headlines into their search results, and now has to pay royalties in the EU.[1] Westlaw is a summarizing and indexing service for court case results. It's been publishing that info in book form since 1872.

Ross was trying to compete with Westlaw, but used Westlaw as an input. West's "Key Numbers" are, after a century and a half, a de-facto standard.[2] So Ross had to match that proprietary indexing system to compete. Their output had to match Westlaw's rather closely. That's the underlying problem. The court ruled that the objective was to directly compete with Westlaw, and using Westlaw's output to do that was intentional copyright infringement.

This looks like a narrow holding, not one that generally covers feeding content into AI training systems.

[1] https://apnews.com/article/google-france-news-publishers-cop...

[2] https://guides.law.stanford.edu/cases/keynumbersystem

2 comments

The case involves headnotes, not just key numbers. Your links provide examples of such headnotes, which make it very clear that a lot of human creativity and judgment is involved in authoring them. They're not purely factual information like a phonebook. Thus, the headnotes are copyrighted, and translating them to a different language doesn't negate that copyright. This looks like a slam dunk case, but it has very little to do with AI training as such: the AI was only used to create a kind of rough indexing over the translated text.

If this was only about key numbers, it might have gone the other way because the fact-like element there is considerably greater.

Case law is public domain. You can publish digitized copies of Westlaw books with the headnotes, keys, and a couple of other proprietary bits redacted. Any of their proprietary elements though, definitely including the key cites, are clearly a no-go. The headnotes not only require creativity and expertise to make, many lawyers consider them indispensable (though many other lawyers apparently throw shade at lawyers that rely on them). And since most of the rest of the book is public domain, they're one of the biggest selling points for their texts, if not the biggest. They famously defend their copyrights vigorously; the defendant surely knew what they were signing up for when they started doing this.

> which make it very clear that a lot of human creativity and judgment is involved in authoring them

What's funny is that any SOTA LLM today could definitely author them, and even LexisNexis advertises the fact: https://www.lexisnexis.com/community/insights/legal/b/produc...

TR may have intentionally chosen an easy battle to begin their legal war.

They began this case in 2020, before any of the most important models existed.