by trimbo 636 days ago
> That's just how the internet works. Don't put something on the internet if you don't want it to be globally distributed and copied

Until now, this has been an acceptable tradeoff because there's some friction to theft. Directly cloning the work is easy, but that also means an artist can sue or DMCA. It also means the original artist's work can go more viral, which, despite the short-term downsides, can help their popularity long term.

The important difference is that imitating an artist's style with new work used to take significant time (hours or days). With an LLM, it takes milliseconds, and that model will be able to churn out the likes of your work millions of times per day, forever. That's the difference, and why the dilemma is new.

> Ultimately "AI did it" should never be allowed to be used as an excuse

With the exception of an LLM directly plagiarizing, the only way to prove it didn't is by not allowing it to train on something. LLMs are the sum of everything. We could say the same about humans, sure, we are a model trained on everything we've ever seen too. But humans aren't machines who can recreate stuff in the blink of an eye, with nearly perfect recall, at millions of qps.

3 comments

"That's just how the internet works" is nonsensical when AI is changing how the internet works.

Just because the tradeoffs of sharing on the internet worked before AI doesn't mean those tradeoffs remain workable after AI.

It's like having drones follow everyone around and publish realtime telephoto video of them because they have "no expectation of privacy" in public places.

Maybe before surveillance tech existed, there was no expectation of privacy in public places, but now that surveillance tech exists, people naturally expect that high-res video of their every move won't be collected, archived and published even if they are in public.

> Maybe before surveillance tech existed, there was no expectation of privacy in public places, but now that surveillance tech exists, people naturally expect that high-res video of their every move won't be collected, archived and published even if they are in public.

Currently, that'd be an unrealistic expectation. I'd agree that it would be nice if that weren't the case, but laws need to catch up with technology. Right now, AI doesn't change things too much, since a company that publishes something that violates copyright law is still breaking the law. It shouldn't matter whether an AI was used to create the infringing copy or not.

I'm all for new laws giving extra rights to people on top of what we already have if needed, but generally copyright law is already far too oppressive so I'd need to consider a specific proposed law and its impacts.

The topic of expectations reminds me of this article

https://spectrum.ieee.org/online-privacy

I think that "shifting baseline syndrome" is a major issue, but on the privacy side of things people don't seem to really understand where we're at currently, and they seem to be very good at lying to themselves about it.

You can find YouTube videos of people outright screaming at photographers in public, insisting that no one has a right to take a picture of them without their permission, while the entire time they're also standing under surveillance cameras.

When it's in their face they genuinely seem to care about privacy a lot, but they also know the phone in their pocket is so much more invasive in every way. They've been repeatedly told that they're tracked and recorded everywhere. They sign themselves up for it again and again. As long as they don't see it going on right in front of them in the most obvious way possible, I guess they can lie to themselves in a way that they can't when they see a man with a camera. But even though on some level they already know that the street photographer is easily the last thing that should concern them, they still get upset to the point where they're screaming in public. I really don't understand it.

There's little that any individual can realistically do about smartphone spying. There are situations where it's borderline unworkable to not have a smartphone. The importance of the phone increased greatly over the same period during which companies increased tracking or at least acknowledged that they were doing it.

Leaving that aside, people probably react that way because the corporation just wants to gather advertising data from everyone, impersonally, while the photographer is taking a direct and specific interest.

> Leaving that aside, people probably react that way because the corporation just wants to gather advertising data from everyone, impersonally,

There's nothing more personal than the collection, use, and sale of every intimate detail of your life. And corporations don't just want to gather advertising data. They want to collect as much data as they possibly can in order to use it in any and every way that might somehow benefit them, and the vast majority of the time that means using it against you. It stopped being about advertising decades ago.

That data is now used to set the prices you pay, it determines what jobs you get, it influences where you are allowed to live. Companies have multiple versions of their policies and they use that data to decide which version they will apply to you, how you will be treated by them, even how long they leave you on hold when you call them. That data is used to extract as much money from you as possible. It's used to manipulate you and to lie to you more effectively. It is used against you in courtrooms. It can get you questioned or arrested by police even if you've done nothing wrong. It gets bought by scammers, extremists, and activists looking for targets. Everyone who collects, sells, and buys your data is only looking to help themselves, so that data only ever ends up hurting you.

More and more, that data has very real-world consequences for your daily offline life. You're just almost never aware of it. A company that charges you 10% more than the last person when you buy the same item isn't going to tell you that it was because of the data they have on you; you just see the higher price tag and assume it applies to everyone. The company that doesn't hire you because of something they found in the dossier they bought from a data broker isn't going to inform you that it was a social media post from 15 years ago that made them pass you over, or the fact that you buy too much alcohol, or that you have a history of depression. They'll just ghost you.

If the data companies collect only ever determined what ads you see nobody would care, but that data is increasingly impacting your life in all kinds of ways. It never goes away. You have no ability to correct errors in the record. You aren't allowed to know who has it or why. You can't control what anyone does with it.

The guy taking pictures and video on the street probably isn't looking to spend the rest of his life using that footage against you personally, but that's exactly what the companies collecting your data are going to do with it. If and when they die, they'll sell that data to someone else before they go, and that someone else will continue using it to try to take something from you.

Yup, this is just a new-age tragedy of the commons. As soon as armies of sheep come to graze, or consume your content, the honeymoon's over.

> With an LLM, it takes milliseconds, and that model will be able to churn out the likes of your work millions of times per day, forever.

AI does cause a lot of problems in terms of scale. The good news is that if AI churns out millions of copies of your copyrighted works you're entitled to compensation for each and every copy. In addition to pushing out copies of copyrighted material, AI is also capable of writing up DMCA notices and legal paperwork.

> With the exception of an LLM directly plagiarizing, the only way to prove it didn't is by not allowing it to train on something. LLMs copy everything and nothing at the same time.

An AI's output should be held to the exact same standard as anyone else's output. If it's close enough to someone else's copyrighted work to be considered infringing then the company using that AI should be liable for copyright infringement the same way they would be if AI had never been involved. AI's ability to produce a large number of infringing works very quickly might even be what causes companies to be more careful about how they use it. Breaking the law at speeds approaching the speed of light isn't a good business model.

Outside of competing profit motives, there is no dilemma. It's that underlying motive, and its root, that will have to undergo a drastic change. AI's Pandora's box is already open and there's no closing it; all that's left is dealing with the consequences.