Hacker News
by traveler01 1044 days ago
For me the question is highly debatable. As far as I'm aware, AI training works by crawling content from the Internet, so they're using a product, namely NY Times articles, to train an AI, meaning they're using that content to help create a product of their own.

But in this sense, shouldn't they be suing Google as well? Google, as a search engine, also crawls the web and shows their articles in its search results, and it may even use them for its quick-answer features.

My 5 cents on this: the NY Times noticed that OpenAI has deep pockets and that they may have grounds to sue, and decided to try their luck in order to get some quick, easy money. That said, I don't know whether what OpenAI is doing with ChatGPT falls under fair use.

2 comments

News organizations in other jurisdictions have already reached settlements with Google (which has much deeper pockets than OpenAI).

But there's a fairly obvious difference between using content to index it, point to it, and generate revenue for it, and using content to generate alternative content...

If you're using their content to generate more content, doesn't it fall under fair use?
In the case of Google, merely indexing content is not considered fair use. It's a double-edged sword, as media outlets have realized: after all, Google is responsible for a large part of the success of these publications.

But in Germany, for example, Google News basically just copy-pasted articles into its service and monetized them without involving the publishers. That doesn't qualify as transformative even under Google's own rules (see the YouTube TOS and copyright enforcement system).

The trend with Google and other search engines over the past ten years has been for them to incorporate more and more content on their own pages. It's hard to remember that not so long ago Google search results pages were just lists of web pages bereft of any other content.

Today, if you Google for a song lyric, that lyric appears in Google. You get a tiny grey source link to Musixmatch or whatever but why would anyone bother to go there if you have the complete lyric right there on the page you are looking at?

More and more content real estate has appeared on search engines' pages. Answer boxes answering questions, again with a source link few people will use, and of course the large Knowledge Graph panel filled with Wikipedia content (written by volunteers and monetised by the world's richest tech companies).

The result is that it's tech companies and their platforms that make most of the money off content (YouTube is another example). They are the oil companies of today, and like the latter use all sorts of lobbying to make sure things are organised to their advantage.

For all the ingenuity and usability they offer, they behave at least in part like parasites. They should be forced to spread the wealth round a bit more.

For lyrics, why should Musixmatch get the page view anyway? The musician/songwriter owns the copyright.
How does your argument relate to the previous poster's? Two wrongs don't make a right, never mind that some of these sites actually do buy the rights to display the lyrics of artists' songs.
That depends on which jurisdiction you're talking about, how it's commercialised, how closely the content resembles the original, whether it incorporates trademarks, etc.
> But in this sense, shouldn't they be suing Google as well? Google, as a search engine, also crawls the web and shows their articles in its search results, and it may even use them for its quick-answer features.

Not sure if NYT is involved as a plaintiff, but this has happened in Europe: https://en.wikipedia.org/wiki/Ancillary_copyright_for_press_...