Hacker News
by rectang 462 days ago
> Look at the primary economic claim offered by AI companies: to end the need for a substantial portion of all jobs on the planet.

And this is why AI training is not "fair use". The AI companies seek to train models in order to compete with the authors of the content used to train the models.

A possible eventual downfall of AI is that the risk of losing a copyright infringement lawsuit is not going away. If a court determines that the AI output you've used is close enough to be considered a derivative work, it's infringement.


I've pointed this out to a few people in this space. They tend to suggest that the value in AI is so great this means we should get rid of copyright law entirely.
That value is only great if it's shared equitably with the rest of the planet.

If it's owned by a few, as it is right now, it's an existential threat to the life, liberty, and pursuit of happiness of everyone else on the planet.

We should be seriously considering what we're going to do in response to that threat if something doesn't change soon.

Yep. The "wouldn't it be great if we had robots do all the labor you are currently doing" argument only works if there is some plan to make sure that my rent gets paid other than me performing labor.
It depends on whether you're the only one out of a job. If it really is everyone, then the answer will likely be some variant of metaphorically or literally killing your landlord in favor of a different resource allocation scheme. I put these kinds of things in an "in that world I would have bigger problems" bucket.
And that's the ultimate fail of capitalist ethics - the notion that we must all work just so we can survive. Look at how many shitty and utterly useless jobs exist just so people can be employed in them to survive.

This has to change somehow.

"Machines will do everything and we'll just reap the profits" is a vision that techno-millennialists have been repeating since the beginning of the Industrial Revolution, but we haven't seen that happen anywhere.

For some strange reason, technological progress always seems to be accompanied by an increase in human labor. We're already past the 8-hours 5-days norm and things are only getting worse.

> And that's the ultimate fail of capitalist ethics - the notion that we must all work just so we can survive. Look at how many shitty and utterly useless jobs exist just so people can be employed in them to survive.

This isn't a consequence of capitalism. The notion of having to work to survive - assuming you aren't a fan of slavery - is baked into things at a much more fundamental level. And lots of people don't work, and are paid by a welfare state funded by capitalism-generated taxes.

> "Machines will do everything and we'll just reap the profits" is a vision that techno-millennialists have been repeating since the beginning of the Industrial Revolution, but we haven't seen that happen anywhere.

They were wrong, but the work is still there to do. You haven't come up with the utopian plan you're comparing this to.

> For some strange reason, technological progress always seems to be accompanied by an increase in human labor.

No, it isn't. What happens is that fewer people are needed to do a job, so the rest go find another job. No one was opening barista-staffed coffee shops on every corner at a time when 30% of the world was doing agricultural labour.

> This isn't a consequence of capitalism.

Yes, it is. The fact we have welfare isn't a refutation of that, it's proof. The welfare is a bandaid over the fundamental flaws of capitalism. A purely capitalist system is so evil, it is unthinkable. Those people currently on welfare should, in a free labor market, die and rot in the street. We, collectively, decided that's not a good idea and went against that.

That's why the labor market, and truly all our markets, are not free. Free markets suck major ass. We all know it. Six year olds have no business being in coal mines, no matter how much the invisible hand demands it.

> That value is only great if it's shared equitably with the rest of the planet.

I think this should be an axiom which should be respected by any copyright rule.

You are correct, but the real problem is that copyright needs complete reform.

Let's not forget the basis:

> [The Congress shall have Power . . . ] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

Is our current implementation of copyright promoting the progress of science and useful arts?

Or will science and the useful arts be accelerated by culling back the current cruft of copyright laws?

For example, imagine if copyright were non-transferable and did not permit exclusive licensing agreements.

The "publisher bootstrap kit + revenue sharing agreement" would become ubiquitous overnight.

Copyright isn't the problem. Over-financialization is the problem.

AI is going to implode within 2 years. Once it starts ingesting its own output as training data it is going to be at best capped at its current capability and at worst even more hallucinatory and worthless.
The mistake you make here is to forget that the training data of the original models was also _full_ of errors and biases, and yet they still produced coherent and useful output. LLM training seems to be incredibly resilient to noise in the training set.
Forget what it eats to continue improving.

Realize what it already has.

A foundational language model with no additional training is already quite powerful.

And that genie isn't going back into the bottle.

Nonsense. Some of the current best AI models were specifically trained on AI output.
That's a talking point for bros looking to exploit it as their ticket.

"The upside of my gambit is so great for the world, that I should be able to consume everyone else's resources for free. I promise to be a benevolent ruler."

"What's good for Milo Minderbinder is good for the world."
…meaning that whatever model results would have no protection, and would be free for anyone to use?
That's not how conservatism works. AI oligarchs are part of the "in" group in the "there are laws that protect but do not bind the in group, and laws that bind but do not protect the out group" summary. Anyone with a net worth less than FOTUS is part of the "out" group.
AI is worthless without training data. If all content becomes AI generated because AI outcompetes original content then there will be no data left to train on.

When Google first came out in 1998, it was amazing, spooky how good it was. Then people figured out how to game pagerank and Google's accuracy cratered.

AI is now in a similar bubble period. Throwing out all of copyright law just for the benefit of a few oligarchs would be utter foolishness. Given who is in power right now I'm sure that prospect will find a few friends, but I think the odds of it actually happening before the bubble bursts are pretty small.

Are we not past critical mass though? The velocity at which these things can outcompete human labor is astonishing. Any future human creations or original content will already have lost the battle the moment they go online and get cloned by AI.
We should, but not for those reasons.

If software and ideas become commodities and the legal ecosystem around creating captive markets disappears, then we will all be much better off.

I'm doubtful the AI companies would be happy with getting rid of laws protecting _their_ intellectual property.
What an infantile worldview.
> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

https://news.ycombinator.com/newsguidelines.html

OK. To be clear, that wasn't about the OP, but rather the alleged people promoting the abolition of copyright... which would significantly hurt open source.

The people agitating for such things are usually leeches who want everything free and do, in fact, hold an infantile worldview that doesn't consider how necessary remuneration is to whatever it is they want so badly (media pirates being another example).

Not that I haven't "pirated" media, but this is usually the result of it not being available for purchase or my already having purchased it.

There's already been an interesting ruling that a pure AI output is not, in itself, copyrightable.
I'm curious what will happen when someone modifies a single byte (or a "sufficient" number of bytes) of AI output, thereby creating a derivative work, and then claims copyright on that modified work.
> The AI companies seek to train models in order to compete with the authors of the content used to train the models.

When I read someone else’s essay I may intend to write essays like that author. When I read someone else’s code I may intend to write code like that author.

AI training is no different from any other training.

> If a court determines that the AI output you've used is close enough to be considered a derivative work, it's infringement.

Do you mean the output of the AI training process (the model), or the output of the AI model? If the former, yes, sure: if a model actually contains within it copies of data, then it’s a copy of that work.

But we should all be very wary of any argument that the ability to create a new work which is identical to a previous work is itself derivative. A painter may be able to copy van Gogh, but neither the painter’s brain nor his non-copy paintings (even those in the style of van Gogh) are copies of van Gogh’s work.

If you as an individual recognizably regurgitate the essay you read, then you have infringed. If an AI model recognizably regurgitates the essay it trained on then it has infringed. The AI argument that passing original content through an algorithm insulates the output from claims of infringement because of "fair use" is pigwash.
> If an AI model recognizably regurgitates the essay it trained on then it has infringed.

I completely agree — that’s why I explicitly wrote ‘non-copy paintings’ in my example.

> The AI argument that passing original content through an algorithm insulates the output from claims of infringement because of "fair use" is pigwash.

Sure, but the argument that training an AI on content is necessarily infringement is equally pigwash. So long as the resulting model does not contain copies, it is not infringement; and so long as it does not produce a copy, it is not infringement.

> So long as the resulting model does not contain copies, it is not infringement

That's not true.

The article specifically deals with training by scraping sites. That does necessarily involve producing a copy from the server to the machine(s) doing the scraping & training. If the TOS of the site incorporates robots.txt or otherwise denies a license for such activity, it is arguably infringement. Sourcehut's TOS for example specifically denies the use of automated tools to obtain information for profit.

I'm curious how this can be applied with the inevitable combinatorial exhaustion that will happen with musical aspects such as melody, chord progression, and rhythm.

Will it mean longer and longer clips are "fair use", or will we just stop making new content because it can't avoid copying patterns of the past?

> I'm curious how this can be applied with the inevitable combinatorial exhaustion that will happen with musical aspects such as melody, chord progression, and rhythm.

https://www.vice.com/en/article/musicians-algorithmically-ge...

They did this in 2020. The article points out that "Whether this tactic actually works in court remains to be seen" and I haven't been following along with the story, so I don't know the current status.

More germane is that there will be a smoking gun for every infringement case: whether or not the model was trained on the original. There will be no pretending that the model never heard the piece it copied.
> AI training is no different from any other training.

Yes, it is. One is done by a computer program, and one is done by a human.

I believe in the rights and liberties of human beings. I have no reason to believe in rights for silicon. You, and every other AI apologist, are never able to produce anything to back up what is largely seen as an outrageous world view.

You cannot simply jump the gun and compare AI training to human training like it's a foregone conclusion. No, it doesn't work that way. Explain why AI should have rights. Explain if AI should be considered persons. Explain what I, personally, will gain from extending rights to AI. And explain what we, collectively, will gain from it.

Outcomes matter. Things that are fine at an individual level can become socially harmful at scale.
What happens when a culture becomes overwhelmingly individualistic and becomes blind to the at-scale harms?