Hacker News
by adabyron 1258 days ago
I would like to take the devil's advocate approach & argue the opposite. Not because I believe it, but because I'm looking to learn.

If I go to your writing, I read it & learn from it. Your writing influences my future writing. We've been okay with this as long as it's not a blatant forgery.

If a computer goes to your writing, it reads it & learns from it. Your writing influences its future writing. It seems we are not okay with this, even if it isn't blatant forgery.

--

I think arguments can be made that computers are different because of their ability to handle much larger data sets & their speed in learning.

I'm not sure it's 100% fair to say a human can learn but a computer that a human uses can't learn.

I am tossing around the idea in my head that this is different because the company is re-using your material to create a product they are going to sell. I'm not sure if I believe that is so different than a human employee doing the same thing.

--

Personal rant: I'm curious how this affects original writing vs the duplicated content we currently see. It does seem original writing is rare & we're already drowning in people trying to become social media influencers in their niches by re-using content that's been copied over & over again with very little change.

7 comments

The difference is that humans have a finite work output. 50 Shades of Grey started out as Twilight fanfic, but was released as a novel years after Twilight was. Twilight isn't harmed by the existence of 50 Shades.

AI, on the other hand, can produce books all day every day, hundreds of them nonstop. The minute your book is released it can be fed into a model, and 1000 similar books can be out within days. It won't matter if yours is better; it'll be drowning in a sea of duplicates.

Monk scribe, meet Gutenberg. Same argument.

Artificial scarcity in the face of a new efficiency has by-and-large never worked in the long run.

> meet Gutenberg

We literally invented an entirely new property right out of thin air just because of the printing press; a human-made right that radically changed almost the entirety of the commercial creative industry and tech industry from that point onward, and radically changed what people are allowed to do today with content that they own, even outside of automated settings: https://en.wikipedia.org/wiki/History_of_copyright

Do you really want to compare AI to the printing press? For better or worse, the law responded pretty hard to Gutenberg.

Did copyright work?

Did it promote the betterment of knowledge in any way?

Did it kill Aaron Swartz?

Is any of that relevant? My point is that copyright was arguably the most radical legal change we ever made around creative markets; a change with wide-reaching implications for people's rights both to automate copying and around non-automated copying and derivative work. And that change was prompted pretty much entirely just by the automation of a human task that was previously widely accepted to be a natural right.

"It's just more efficient" is not a good argument for deregulation; historically it has often been an argument in the opposite direction.

And I am copyright-skeptical myself. But if your argument for avoiding AI regulation is reliant on convincing people to support copyright abolition, you are not going to convince many people to agree with you. You're basically inviting the space to be regulated; no lawmaker is going to think "this is just like copyright" is an argument against AI regulation. And most ordinary people (even in tech) are not going to agree with you, because most people like that copyright exists. Most people don't look at copyright and think, "this was a mistake."

Looking at the history of the printing press should teach us that the last time somebody made the argument, "we're just doing what humans do, but faster" around creative industries, the law responded, "great point, so we'd better ban humans from doing it too." So just understand the implications of the comparisons you're making; understand that invoking the history of copyright is not a slam-dunk dismissal of artists' concerns.

Yes.

Yes.

No.

(Yes, it protected independent creators from having their work directly monetized by others, like Disney did.)

(Yes, by providing protections for creators to profit from their own works, it motivated some number of people to write, compose, and create who might otherwise not have done so.)

(No, the government did, using copyright as a pretext.)

Note that points one and two do not suggest that copyright is useful in its current form, rather than its historical 14/28-year form.

There's no evidence that lack of copyright stops creation. We had poetry, music, and painting before we even had money. Creativity will out.

There's meaningful evidence that copyright slows knowledge sharing and evolution.

Profitable for a single creator, a problem for mankind.

That's a false equivalency. One is a comparison of methods to mass produce the same text. AI produces very similar but unique works. One enables an author to reach a wider audience, the other kills authors by drowning their works in a sea of similar low quality knock offs.
Those aren't even slightly the same and you know it.
On the contrary, in this analogy, the warehouse scale offices full of low income content mill authors are exactly the monasteries of yore.
The problem is that AI is infinitely scalable. Yes, I can get inspired by your writing, but I won't be able to replace you by churning out a huge number of articles. AI can.

Scale changes (and should change) how things are done.

So when something becomes too efficient, it should be banned in order to save existing career paths?
I commented similarly elsewhere, but we already do this with robots.txt and bot detection.

What, I shouldn't be able to visit a website if I'm too efficient and fast at visiting it? I shouldn't be able to buy tickets to an event or preorder a device if I'm faster than the person next to me?

I don't think AI should be banned, but I don't find the "it's just more efficient" argument particularly compelling because there are a ton of examples in the real world of us banning (either legally, socially, or technologically) automation purely because that automation is more efficient than a human being; everything from automated website access, to game botting, to pre-ordering, to anti-spam measures on commenting platforms. Efficiency/scalability compared to basic human ability is a very common metric for us to use to determine whether a technology is "good" or "bad".

I love that list because, in my view, each of those examples is "doing it wrong".

Even game bots, where humans and robots compete head to head directly, should be manageable with a ladder system where the bots (representing the skills of the self-force-multiplying humans behind them) end up at some level individual humans can't attain, so plain humans are left battling each other while the automating humans' proxies battle in their own echelon.

This would clarify a few things, such as why an automator's hourly rate probably should be much higher than a piecework toil rate.

I've argued in the past that humans do have a right to automate[0]. And I've commented plenty of times on HN that the games industry responds to cheating in the wrong ways and that botting in games is symptomatic of design flaws more than it is a technical problem that should be handled with invasive anticheat.

But I also recognize that both of those positions are extreme minority opinions.

There are lots of things we do that could be handled differently, but unfortunately the current structures we've built in society don't handle them differently. Automation is one of those areas. We don't really have a good way of handling automated attacks without targeting automation, even though there arguably are ways we could do so. And for the average person on the street and for the average person on HN, efficiency is an extremely good argument for regulating access.

And while I lean in the opposite direction, I also understand and sympathize with the practical realities that lead people to that opinion. It's all well and good for me to tell people to get rid of captchas, but I don't have a similarly simple system to hand them today that will help prevent automated attacks.

----

I'll add onto that point that when we talk about efficiency of botting, scraping, scalping, etc... nobody says, "tough luck, get with the times." The arguments against captchas and human tests and invasive software argue that we can address the problems without that invasive stuff. Nobody argues that the problems don't exist.

So that's another difference I see with concerns around AI training on copyrighted material. Nobody responds to public ladders in a video game being swarmed with bots by shrugging; they try to offer solutions. In contrast, people do respond to concerns about AI flooding public galleries and overwhelming moderators or cloning existing artists by shrugging and saying it's not a problem. That feels a little inconsistent to me.

[0]: https://anewdigitalmanifesto.com/#right-to-delegate

Fully in sync.
In this case it's only the replication that got more efficient. The artist is not automated out of a job, only out of the fruits of his labour.
I'm totally chill with humans reading and stealing from each other. I'm not ok with corporations automating and capitalizing on it.
So - with me as an individual and not a corporate entity - you're OK if I generate AI derived from your creative output?
If you run the algorithm entirely on the pile of fat in your skull, YES!
But that's very different to what you said previously:

> I'm totally chill with humans reading and stealing from each other. I'm not ok with corporations automating and capitalizing on it.

Where do you draw the line? I've used AI in very subtle ways in creative projects. Would you object to 1% use of AI? 10%?

Or are you saying "no training at all no matter what the end-use will be"?

There isn't a hard boundary between "somebody messing around, learning, and making art" and "capital holders stealing from everyone", but my real problem is the latter state of affairs. AI isn't really relevant to the morality of the situation; it's just a means to an end. A situation where they use massive volumes of labor to the same ends would have similar issues (ignoring the obvious human rights violations). The only problem AI has is that it makes abuse more efficient.
I don't think "fair" is a good word here. I'm happy to work from the basic principle that I want society to be set up to make people's lives better, and I don't care about AIs, and I really don't care about AIs designed by multi-billion dollar companies.

Unfortunately, I think this isn't a battle that can be won, as those multi-billion dollar companies can just buy laws that say they can slurp up everyone's information and train on it (such laws are currently being introduced in several countries).

I think the difference is speed and throughput. When a human digests writing or artwork, even to produce derivative work, it's on the scale of hours, if not days. In the time that a human can produce 1 replica, an AI can produce thousands of works and deliver them to an audience of millions. And through the sheer network effect of the internet, it'll be impossible to remove any of these works, as they'll be replicated and reposted over and over again.
For me the biggest difference is the fact that the AI can be commercialized without the need for any acknowledgement. It could not exist without the "inspiration", as you call it. But at the same time its throughput allows for massive gains for the person owning the model. So I would argue robots.txt is the wrong analogy. We rather need a licensing system for content: Apache, MIT, GNU etc. for blog entries, pictures, videos etc.
If people want to argue that AI learns like a human, then creative rights should belong to the AI model. If people want to argue that AI is a tool, then they should take those into consideration with their training data. You can't have your cake and eat it too.

That's completely orthogonal to copyright terms being way too long though.

What do you think about people deliberately making models based on an artist's artwork (SamDoesArt) in order to copy their work and antagonize them? They've explicitly said that is their motivation too.