That's the magic of money. Download your favorite artist's discography for personal use? If the RIAA had its way (and it occasionally has), torrenting that could bankrupt you.
The AI industry - soaking up every bit of media available online for commercial purposes, often reproducing it nearly identically - has enough money and capital to influence things its way. And only its way, in case anyone was hoping this might change anything at all for the little guy.
> Download your favorite artist's discography for personal use? If the RIAA had its way (and it occasionally has), torrenting that could bankrupt you.
I don't think that there are any clear examples of cases where ONLY downloading has resulted in huge fines. All the big, bankruptcy-level fines have been for both downloading and sharing.
You mention that 'torrenting' could bankrupt you, and that is true, but the main reason for the huge fines is that you are taking part in distribution rather than just 'downloading for personal use'.
> I don't think that there are any clear examples of cases where ONLY downloading has resulted in huge fines.
They [1, and others] have been hunting and fining downloaders for over a decade now, with the only "evidence" being IP addresses connected with the torrent [2].
Given the lack of sense in treating each peer as a lost sale for damages, I think we can safely say they're only interested in making examples out of people and would absolutely go after people for only downloading if the law permitted. Thankfully it doesn't, but maybe they'll lobby for changes in that direction to try and curb future AI industry shenanigans.
You contradict yourself. There were numerous public cases where they chased people who downloaded a few MP3s just for themselves, and made them into example cases with massive fines.
If you don't understand how torrents work on a technical level, I suggest at least some shallow reading. Rights holders don't care about details: as long as you tick the box of sending a single packet to somebody, off to court with ya.
The fights over digitized media for personal (entertainment / informational) use were in the early aughts. The precedents crafted then don't immediately translate to these cases (novel transformative work from protected materials), and the new precedents have to account for the fact that universities have been training via "piracy" for ages.
(The magic of money factors in to the extent that they can afford the lawyers to remind the court that this isn't settled law yet.)
The fact that this is propping up the entire AI industry adds additional weight. When legislating or deciding court cases, some won't be willing to kill the cash cow, and some will be worried about falling behind countries that don't enforce copyright evenly. IP owners are trying to go after the AI industry, with only mixed to poor success.
Anthropic is going to trial over pirating books for training. The judge was pretty clear that even if training is fair use, the training material must be obtained legally.
These regurgitations combined with proof that a model is familiar with a work could be sufficient evidence to force discovery to determine if the work was pirated.
What's insane is copyright. How come you can own intellectual property but not pay a property tax? The ecosystem would be much healthier if, to get copyright protection, you had to declare the value of your IP (a price at which you are obligated to sell if a buyer turns up) and pay tax on it for every year you hold the IP.
> if, to get copyright protection, you had to declare the value of your IP (a price at which you are obligated to sell if a buyer turns up) and pay tax on it for every year you hold the IP
I think this would have some unpalatable consequences. Let's say an author is writing a modestly successful book series: it's not going to make them rich, but it's commercially viable and they care a lot about it for its own sake. Under this system, if the author declares a value commensurate with the (quite small) pure economic value of the IP, they have to live in fear of their right to continue working on their creation being abruptly taken away from them at any point. If they instead declare a value commensurate with the economic value + the extra value that it has to them personally, the resulting tax liability could easily tip the balance and destroy their ability to pursue their writing as a career.
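To put rough numbers on that trade-off, here is a minimal sketch. Everything in it is an assumption for illustration: the flat 2% annual rate, the dollar figures, and the helper names `annual_tax` and `can_be_bought_out` are hypothetical, since the proposal doesn't specify any of them.

```python
# Hypothetical numbers throughout: the 2% rate and all dollar figures are
# assumptions for illustration, not part of the proposal being discussed.
TAX_RATE = 0.02


def annual_tax(declared_value: float, rate: float = TAX_RATE) -> float:
    """Yearly tax owed on a self-declared IP value."""
    return declared_value * rate


def can_be_bought_out(declared_value: float, offer: float) -> bool:
    """The owner must sell to anyone who offers at least the declared value."""
    return offer >= declared_value


market_value = 50_000     # assumed pure economic value of the book series
personal_value = 500_000  # assumed price at which the author would actually let go
yearly_income = 15_000    # assumed yearly income from the series
offer = 60_000            # assumed offer from an outside buyer

for declared in (market_value, personal_value):
    tax = annual_tax(declared)
    print(
        f"declared={declared}: tax/year={tax:.0f}, "
        f"share of income={tax / yearly_income:.0%}, "
        f"forced sale at offer {offer}: {can_be_bought_out(declared, offer)}"
    )
```

With these made-up figures, declaring the low value keeps the tax small but leaves the series open to a forced sale, while declaring the high value blocks the sale at the cost of a tax that eats most of the assumed income. Different rates or valuations would shift the balance.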
You are always free to update the value before paying tax. If somebody is willing to pay more than it's worth to you, they probably have an idea of how to turn it into more economic value for society. So society should allow them to do that, for the price of the tax. What I'm proposing is about the financial rights. Individual rights, like the right to call yourself the author of a given creation, should be inalienable.
There are always some cases on the edge. The question is if saving them is worth the cost of the major players running rampant.
> What's insane is copyright. How come you can own intellectual property but not pay a property tax?
Most jurisdictions that have "property tax" apply it only to certain types of property, most commonly real estate. So it's not that weird that IP isn't taxed.
Can you imagine if we evaluated property taxes this way? Yeah, nice single family home, better hope nobody offers you the same amount you paid for it or it's back to apartment living for you and your kids.
And it seems to be because the training data is largely unofficial subtitles from movies, which often have a string like "Translated by X" at the end, where the movie is often silent while the credits roll.
Looks like they used more official sources for German - there, silence is apparently hallucinated as "Untertitelung des ZDF für funk, 2017" according to one of the comments on the issue. Which makes sense, as the public broadcasters' "Mediathek" is probably the largest freely available resource of subtitled videos in Germany. I wonder if the ZDF gave its approval for it being used for LLM training though?
I'm being made to pay for Autobahnen I barely use, finance kindergartens despite not having a child, and made to pay into public pensions with little hope of getting close to the same value out. All under threat of imprisonment, many without a way to even refuse (not that I'd want to). The only thing that sets the public broadcasting fee apart is that it's collected separately from taxes in an attempt to reduce the influence politicians have on broadcasters.
This person refers to the German television and radio fee (Rundfunkgebühren).[1] It is a state-mandated system that ensures free (as in free speech) and (relatively) neutral public broadcasting institutions. There is a constant and engaged discussion, because every household in Germany has to pay this fee. Exceptions are made only for low-income households.
A constant discussion, lately fueled by extremist parties (AfD) who feel treated unfairly by (amongst others) the public broadcasters (which has parallels to Trump's recent campaign against public broadcasters in the US).
Ah, ok, thanks for the info, TIL! "We are funk – the first public service content network that started on October 1, 2016. We create online-only content on social networks and third-party platforms, including YouTube, Instagram, Snapchat, TikTok, Spotify, Apple Music or Twitch for 14-29 year-olds." (https://presse.funk.net/das-ist-funk/, scroll down for the English version). I live in Germany, and I even watch public broadcasters regularly, but this is the first time I have heard about funk (I even initially thought it was misspelled, usually it's written with a capital F). But I'm not part of the targeted audience (not now, nor even back in 2016 when it was launched), so all good...
> We have a public service mandate, which means that we have very clear responsibilities according to the state media treaty. For us, this means that our top priority is actually reaching our target audience, namely approximately 15 million people living in Germany between the age of 14 and 29 who have internet access
It's not a binding contract for sure, but I don't think that OpenAI or any other AI scraper is their target audience.