Hacker News
by autoexec 636 days ago
> but if you put your work out there and anything public is fair game, then it will be sampled by a computer and instantly recreated at scale.

That's just how the internet works. Don't put something on the internet if you don't want it to be globally distributed and copied.

> I personally know two artists who have sued major companies who ripped off their work for ads, and both won million-plus settlements.

Ultimately "AI did it" should never be allowed to be used as an excuse. If a company pays for a marketing guy who rips off someone's work and they can be sued for it, then a company that pays for an AI that rips off someone's work should still be able to be sued for it.

5 comments

> That's just how the internet works. Don't put something on the internet if you don't want it to be globally distributed and copied

Until now, this has been an acceptable tradeoff because there's some friction to theft. Directly cloning the work is easy, but that also means an artist can sue or DMCA. It also means the original artist's work can go more viral, which, despite the short-term downsides, can help their popularity long term.

The important difference is that imitating an artist's style with new work used to take significant time (hours or days). With an LLM, it takes milliseconds, and that model will be able to churn out the likes of your work millions of times per day, forever. That's the difference, and why the dilemma is new.

> Ultimately "AI did it" should never be allowed to be used as an excuse

With the exception of an LLM directly plagiarizing, the only way to prove it didn't is by not allowing it to train on something. LLMs are the sum of everything. We could say the same about humans, sure, we are a model trained on everything we've ever seen too. But humans aren't machines who can recreate stuff in the blink of an eye, with nearly perfect recall, at millions of qps.

"That's just how the internet works" is nonsensical when AI is changing how the internet works.

Just because the tradeoffs of sharing on the internet used to work before AI, doesn't mean those tradeoffs continue to be workable after AI.

It's like having drones follow everyone around and publish realtime telephoto video of them because they have "no expectation of privacy" in public places.

Maybe before surveillance tech existed, there was no expectation of privacy in public places, but now that surveillance tech exists, people naturally expect that high-res video of their every move won't be collected, archived and published even if they are in public.

> Maybe before surveillance tech existed, there was no expectation of privacy in public places, but now that surveillance tech exists, people naturally expect that high-res video of their every move won't be collected, archived and published even if they are in public.

Currently, that'd be an unrealistic expectation. I'd agree that it would be nice if that wasn't the case, but laws need to catch up with technology. Right now, AI doesn't change things too much since a company that publishes something that violates copyright law is still breaking the law. It shouldn't matter if an AI was used to create the infringing copy or not.

I'm all for new laws giving extra rights to people on top of what we already have if needed, but generally copyright law is already far too oppressive so I'd need to consider a specific proposed law and its impacts.

The topic of expectations reminds me of this article

https://spectrum.ieee.org/online-privacy

I think that "shifting baseline syndrome" is a major issue, but on the privacy side of things people don't seem to really understand where we're at currently, and they seem to be very good at lying to themselves about it.

You can find YouTube videos of people outright screaming at photographers in public, insisting that no one has a right to take a picture of them without their permission while the entire time they're also standing under surveillance cameras.

When it's in their face they genuinely seem to care about privacy a lot, but they also know the phone in their pocket is so much more invasive in every way. They've been repeatedly told that they're tracked and recorded everywhere. They sign themselves up for it again and again. As long as they don't see it going on right in front of them in the most obvious way possible, I guess they can lie to themselves in a way that they can't when they see a man with a camera. But even though on some level they already know that the street photographer is easily the last thing that should concern them, they still get upset to the point where they're screaming in public. I really don't understand it.

There's little that any individual can realistically do about smartphone spying. There are situations where it's borderline unworkable to not have a smartphone. The importance of the phone increased greatly over the same period during which companies increased tracking or at least acknowledged that they were doing it.

Leaving that aside, people probably react that way because the corporation just wants to gather advertising data from everyone, impersonally, while the photographer is taking a direct and specific interest.

Yup, this is just a new-age tragedy of the commons. As soon as armies of sheep come to graze, or consume your content, the honeymoon's over.

> With an LLM, it takes milliseconds, and that model will be able to churn out the likes of your work millions of times per day, forever.

AI does cause a lot of problems in terms of scale. The good news is that if AI churns out millions of copies of your copyrighted works you're entitled to compensation for each and every copy. In addition to pushing out copies of copyrighted material, AI is also capable of writing up DMCA notices and legal paperwork.

> With the exception of an LLM directly plagiarizing, the only way to prove it didn't is by not allowing it to train on something. LLMs copy everything and nothing at the same time.

An AI's output should be held to the exact same standard as anyone else's output. If it's close enough to someone else's copyrighted work to be considered infringing then the company using that AI should be liable for copyright infringement the same way they would be if AI had never been involved. AI's ability to produce a large number of infringing works very quickly might even be what causes companies to be more careful about how they use it. Breaking the law at speeds approaching the speed of light isn't a good business model.

Outside of competing profit motives, there is no dilemma. It's that underlying motive, and its root, that will have to undergo a drastic change. The Pandora that AI is is already out of the box and there's no putting it back in; only dealing with the consequences.

> That's just how the internet works. Don't put something on the internet if you don't want it to be globally distributed and copied.

You could make the same argument about paper. "That's just how photocopiers work! If you don't want your creations to be endlessly duplicated and sold, don't write them down!" Heck, you could make the same argument about leaving the house. "That's just how guns work! Don't go out in public if you don't want to take the risk of getting shot!"

But it's a bad argument every time. That something is technically possible doesn't make it morally right. It's true that a big point of technology is to increase an individual's power. But I'd say that increased power doesn't diminish our responsibility for our actions. It increases it.

> You could make the same argument about paper. "That's just how photocopiers work! If you don't want your creations to be endlessly duplicated and sold, don't write them down!"

No, the argument would be about photocopies, not paper. "That's just how photocopiers work! Don't put something into a photocopier if you don't want photocopies of it." It isn't possible for anyone to access anything on the internet without making copies of that thing. Copies are literally how the internet works.

Shooting everyone who steps outside isn't how guns work either so that also fails as an analogy.

The internet was specifically designed for the global distribution of copies. If that isn't what you want, don't publish your works there.

> That something is technically possible doesn't make it morally right.

Morality is entirely different from how the internet works, but in practice, I don't see anything immoral about making a copy of something. Morality only becomes an issue when it comes to what someone does with that copy.

> If that isn't what you want, don't publish your works there.

"Women are oppressed in Iran. Well, that's just how Iran is. Just leave it if you don't want to be oppressed"

Oh my. Yea, and whatever is some way, is that way – "it is how it is, deal with it". It's an empty statement. The topic is an ethical and political discussion in light of current technologies. It's a question of whether it should work this way. That's how all moral questions come about – by asking if something should be the way it is. And the current state of technology brings a dilemma that hasn't existed before.

And no, the internet was not designed for that. Quite obviously. Sounds like you haven't heard of private messages.

I'm very surprised this has to be stated.

> "Women are oppressed in Iran. Well, that's just how Iran is. Just leave it if you don't want to be oppressed" Yea, and whatever is some way, is that way – "it is how it is, deal with it". It's an empty statement.

No, because Iran can stop oppressing women and still exist as a functional country. Oppressing women today is "how it is". The internet on the other hand is designed to be a system for the distribution of copies. That isn't "how it is", but rather "what it is".

The internet cannot do anything except distribute copies and anything that doesn't distribute copies wouldn't be the internet.

> Sounds like you haven't heard of private messages.

Private messages are also not what is being discussed here. The comment being discussed said: "I don't agree because it creates this dilemma for creators: you need to put your work out there to get traction, but if you put your work out there and anything public is fair game, then it will be sampled by a computer and instantly recreated at scale."

"anything public". For what it's worth though, private messages are still copies.

all received messages are copies of the original. a broadcast is a copy. language itself is an incessant copying. so it’s a truism to say that of the internet. and a generality that doesn’t apply to specifics. downloading cracked software is also copying, but this genus is irrelevant to its discussion. its morality is beside its being a copy, even though it is essential that it be a copy. likewise with other data.

we don’t have rules set yet, that’s why this discussion is active, ie, not just a niggle from after a couple of beers. it’s a question of respect for the author.

yea so everybody can copy a book just like the internet and nobody is persecuted for memorizing it.

Yes, if one over-narrowly construes any analogy, it can be quickly dismissed. I suppose that's my fault for putting an analogy on the internet.

We've had copying technologies since people invented the pen. It was such an important activity that there were people who spent their whole lives copying texts.

With the rise of the printing press, copying became a significant societal concern, one so big that America's founders put copyright into the Constitution. [1] The internet did add some new wrinkles, but if anything the surprise is that most of the legal and moral thinking that predates it translated just fine to the internet age. That internet transmission happens to make temporary copies of things changed very little, and certainly not the broad principles.

I understand why Facebook and other people lining their pockets would like to claim that they are entitled to take what they want. But we don't have to believe them.

[1] https://constitution.congress.gov/browse/essay/artI-S8-C8-1/...

I don't think that Facebook should be allowed to violate copyright law, but clearly they have the same rights as you do to copy works made publicly available on the internet.

We are talking about more than the current law here. We're talking about what the law should be, based on what people see as right. And I'd add that Facebook is doing a lot more here than just quietly having a copy of something.

If the concern isn't copyright infringement what would the new law be about? What's the harm people want solved? Is it just some philosophical objection, like people not liking the idea of other people doing something they don't like? Is it fear of future potential harms that haven't been seen in real life yet?

Is it just "someone else may be using what I published publicly to make money somehow in a way that is legal and doesn't infringe on my copyrights but I still don't like it because they aren't giving me a cut?" What would the new law look like?

> You could make the same argument about paper.

Most paper doesn't come with Terms and Conditions that everything you write on it belongs to the paper company. I hate Facebook (with a fiery passion) but people gave them their data in exchange for the groundbreaking and unprecedented ability to make friends with another person (which has never been done before). It sucks, but don't use these "free" systems without understanding the sinister dynamics and incentives behind them.

People make the same arguments about the NSA. "They aren't doing anything bad with the data they're collecting about every US citizen." Well, at some point they will. Stop borrowing against future freedom for a tiny bit of convenience today.

I think you're confusing a legal point (whether a T&C really gives Facebook any particular legal right in court) with the moral question of whether or not people should just roll over for large companies because of language we all, Facebook included, know that nobody ever reads.

Even if FB's T&C made it clear they could do this (something I haven't seen proven), that at best means people would have a hard time suing as individuals. They can still get upset. They can still protest to the regulators and legislators whose job it is to keep these companies in line, and who create the legal context that gives a T&C document practical meaning.

> That's just how the internet works. Don't put something on the internet if you don't want it to be globally distributed and copied.

And if someone takes a picture of your artwork, or takes a picture of your person, and posts that to the internet without your consent? Have you given up your rights then?

My answer: Absolutely not.

What AI does is much more like the Old Masters approach of going to a museum and painting a copy of a painting by some master whose technique they wish to learn. This has always been both legal, and encouraged.

Or borrowing a thick stack of books from the library, reading them, and using that knowledge as the basis for fiction. That's a transformative work, and those are fine as well.

My take is that training AI models is a bespoke copyright situation which our laws were never designed to handle, and finding an equitable balance will take new law. But as it stands, it's both legal and encouraged for a human to access a Web site (thereby making a copy) and learn from the contents of that website.

That is, fundamentally, what happens when an LLM is trained on corpus data. The difference in scale becomes a difference in kind, but as I said, our laws at present don't really account for that, because they weren't designed to.

LLMs sometimes plagiarize, which is not ok, but most people, myself included, wouldn't consider the dilemma satisfactorily resolved if improvements in the technology meant that never happened. Outside of that, we're talking about a new kind of transformative work, and those are legal.

> This has always been both legal, and encouraged.

Not always. The copy must be easily identifiable as a copy. An exact reproduction can't have the same dimensions as the original, for example.

Drawing just a person or a detail of the picture, or redoing the picture in a different context or style, is encouraged.

Selling a full scale photo of the picture is forbidden. The copyright of famous art belongs to the museum.

The second example is better than the first, yes. I was thinking about the process more than the fact that painting a study produces a work, and a derived one at that, so more normal copyright considerations apply to the work itself.

> An exact reproduction can't have the same dimensions as the original

This is a rule, not a law, and a traditional and widespread one. Museums don't want to be involved in someone selling a forgery, so that rule is a way of making it unlikely. But the difference between "if you do this a museum will kick you out" and "this is illegal" is fairly sharp.

> The copyright of famous art belongs to the museum.

Not in a great number of cases it doesn't; most famous art is long out of copyright and belongs to the public domain. Museums will have copyright on photos of those works, and have been known to fraudulently claim that photos taken by others owe a license fee to the museum, but in the US at least this isn't true. https://www.huffpost.com/entry/museum-paintings-copyright_b_...

Nice scapegoating Anthropomorphized.

Correct analogy is like someone taking pictures of the paintings, going home and applying a photoshop filter, erasing the original signature and adding theirs.

The law already covers that very much so.

If someone takes a picture of me while I'm in public, that picture is their copyrighted work and they have every right to post that on the internet. There is no expectation of privacy in public, and Americans have very few rights against other people using photos/video of them (there are some exceptions for things like making someone into your company's spokesperson against their will).

If someone took a photo of my copyrighted work, their photo becomes their copyrighted work. They also have a right to post that picture on the internet without my consent. Every single person who takes a picture of a painting in a museum and posts it to social media is not a criminal. There are legal limitations there too however and that's fine because we have an entire legal system created to deal with that which didn't go away when AI was created.

If a company uses AI to create something that under the law violates your copyright you can still sue them.

> That's just how the internet works. Don't put something on the internet if you don't want it to be globally distributed and copied.

This is true for average people. Is it true for the wealthy? Is it true for Disney? Does our law acknowledge this truth and ensure equal justice for all?

It's 100% true for everyone. You can't access anything at disney.com without making a copy of that thing. Disney can't access anything at yourdomain.whatever without making a copy of that thing.

Whatever crimes either of you can get away with using your copies is another matter entirely. Any rights you had under the legal system you had before AI haven't gone away, neither have the disadvantages you have against the wealthy.

One of the comments you replied to was complaining that their work would be copied and used in training LLMs or other lucrative algorithms, and then you responded talking about how it's common to temporarily copy data into RAM to show a web page. Those are very different, and bringing up such technical minutiae is not helpful to the discussion.

If someone asks "how can I share my work online without it being copied?", "actually, you can't share it without people copying it into RAM" is not the answer they're looking for. That answer is too technical, too focused on minutiae, and our laws recognize that.

The point is that "copies" was never the problem. "sampled by a computer and instantly recreated at scale" is the expected outcome of publishing something publicly on the internet.

Their problem was copyright infringement and like you said, our laws recognize that problem. We have an entire legal framework for dealing with companies that publish infringing copies of copyrighted works. None of that has changed with LLMs.

If a company publishes something that violates copyright law they can be sued for it; it shouldn't matter if an AI was involved in the creation of what was published or not.

> That's just how the internet works. Don't put something on the internet if you don't want it to be globally distributed and copied.

Or we could be ethical and encourage others to be ethical.

I see you're one of the ones that wouldn't download a car.

I would share a car I had rights to, and download a car made free to me. Facebook would certainly sue me if it were their car, they should thus be held to that standard in my personal opinion.

We could make a distinction between individuals and companies doing it

Depends on the risk assessment but I'd say I'm a lot more like Robin Hood. Facebook is obviously Prince John.

Okay, but that doesn't change how the Internet works.

Encouraging people to be ethical isn't actually a real way to prevent people copying photos you put up online.

We can encourage profit-driven megacorps to be ethical? Sure, by abolishing them. Otherwise, you're just screaming into the void.

I think what I said is a prerequisite for that. There will be no structural changes without widespread cultural changes.