by gojomo 1401 days ago
As I see it, 3 of the 4 tests are strongly in OpenAI's favor; the 'market effect' is mixed.

(1) The use is highly transformative;

(2) the images used were offered to the anonymous browsing public (with watermarks);

(3) the end effect of training will only retain a tiny spectral distilled essence of any individual photo, or even a giant source corpus;

(4) there's a potential risk of market competition from the ultimate model output, for some uses – but that's also the most 'transformative' aspect.

Getty et al could potentially just ask creators of such models not to include their images – perhaps by blocking their crawling 'User-Agent' – and it might not make any real difference in the models.
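To make the 'User-Agent' idea concrete, here's a rough sketch of how such an opt-out could look if it were expressed through robots.txt. The crawler name `ExampleImageBot` and the `may_fetch` helper are made up for illustration, and I'm assuming the crawler actually honors robots.txt, which is voluntary.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name, used only for illustration.
BLOCKED_AGENT = "ExampleImageBot"

# A robots.txt a stock-photo site might publish: shut out one crawler,
# leave everyone else alone.
ROBOTS_TXT = f"""\
User-agent: {BLOCKED_AGENT}
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/photos/1.jpg"
    print(may_fetch(BLOCKED_AGENT, url))   # the blocked crawler
    print(may_fetch("Mozilla/5.0", url))   # an ordinary browser
```

This only covers the polite case: a crawler that ignores robots.txt, or lies about its User-Agent, would need to be blocked some other way.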


I'm still not seeing the "transformative" argument: the point of transformation isn't "it is in a different format" but (to quote Wikipedia, which is, of course, dumb... I'm sorry ;P) where one "builds on a copyrighted work in a different manner or for a different purpose from the original". The reason a search engine thumbnail is transformative isn't because it has been transformed to make it smaller... it is because the purpose of the resulting use of the image is somewhat unrelated to the use the original author was going for when they made the original image. At issue here is then that, rather than using an original image from Getty Images, someone decided to take all of the images from Getty Images and churn them through some algorithm that generated an image that directly competed with the original images from Getty Images. So like, sure: if you really only narrowly want to talk about OpenAI, what they are themselves doing (training and distributing a model) might potentially be legal, but the people using the result would seem to be in serious hot water... oh, and actually, I think they run it all as a service, don't they? So no: I don't even think that defense works, as OpenAI is in some sense not even selling a model, they are merely directly competing with Getty Images to sell photos to people.
Autogenerated, often fantastical, never-seen-before AI images strike me as a paradigmatically 'transformative' use. It's novel. It's shocking to many practitioners how flexible & high-quality the images can be. It will unlock all sorts of new downstream creation.

The representation that feeds the generation is statistical, even to the point of being plausibly factual: these things/people/places/concepts can be abstractly represented as the balanced weights inside the model. And under US law, facts aren't copyrightable.

I could see a case being factored as: (1) the scraping/training/ephemeralization itself involves the usual copying of downloading/locally-processing images, like indexing, but all those 'copying' steps are fair-use protected, as science/transformative/de minimis/whatever; (2) any subsequent new-image generation no longer involves any 'copying', only new creation from distilled patterns of the entire training corpus, in which Getty retains no 'trace tincture' of copyright-control. So there are no specific acts of illegal copying to penalize.

Also, a human artist would be allowed to review related Getty/etc preview images, free on the web, to familiarize themself with a person or setting, before drawing it themself, with their own flair – as long as they don't copy it substantially. Why wouldn't an AI artist?

An "AI artist" doesn't add any "flair" of its own. It builds exclusively on the past experience and work of humans. And it also directly competes with them without any thought of credit or compensation.

People are really underplaying how damaging this is going to be for the industry. It's going to completely decimate it. You can already see people using names of artists in the DALL-E prompt to get "their" work for a few dollars, avoiding any copyright or social issues.

Artists will suddenly be competing with AI on price and time: why should we pay you a living wage when we can instantly generate something close enough?

Why would anyone try to create some new aesthetic or push anything further if their effort will be replicated next week, when the model gets updated with new source data? Everything is gonna get stuck at the aesthetic of 2022 and before.

It's completely inhuman.

The synergistic effect of all the AI's inputs absolutely results in a unique new 'flair', with extensions, reversals, and mash-ups of styles just as in human-made artistic styles.

And AI "builds exclusively on past experience and work of humans" just like any young new human artist equally does. In many cases, you can even tell the different models' outputs apart, not by raw quality or glitches, but by hard-to-describe aesthetic tendencies.

I share your concern on the effect on human artists – both the market for their work, and even their morale, when learning, knowing that decades of practice will still be outproduced by seconds of computation.

But I don't think the genie will be put back in the bottle, by either expansive interpretation of existing copyright law, or even new laws.

Indeed the genie is out. And while we will get some interesting AI uses, ultimately this is degenerative tech. In the end we end up with less authentic, less unpredictable and less delightful art. Instead we get perfectly-suited-to-us, predictable, mediocre stuff.

I said it in a comment above: yes, people build on the work of others, but they also bring a lot of their own originality and intellect. Part of what people do is truly, uniquely theirs, and piece by piece we progress as a whole.

The crucial detail is that AI learns only from visual patterns of the past and can't think at all, whereas humans learn from everything around them and think about it deeply.

I don’t believe we will lose the capability to create new original styles. If a prompter can describe the creation of a new style, the AI can create it. Using both iterations of image & text prompts, unique styles will come.

The thinking is still done by the human prompter.

One could make the same case about humans, nobody works in a vacuum. Even though he used it in a pejorative sense, Sir Isaac Newton, the famous English scientist, once said, “If I have seen further, it is by standing on the shoulders of giants.”

Even the claim that humans are capable of developing their own style could be countered: it is arguably just an intermixing of previous work they've seen, combined in a different way, which is effectively what these generative systems do.

Of course humans build on the work of other people, and what they do is partially a mashup. But their work is not only the replication of visual patterns. It's their thinking, their other non-visual experiences, their politics and world views, all combined in their work. Often it's their life project.

To think that artists only mash up what was before them is quite obviously wrong.

But it's exactly the only thing the tech does.

I'd argue that reproducing an artefact such as a watermark is copying even more substantially than any human would, and a human who did that would at best be labelled unoriginal or very derivative, or be in violation of copyright.
Perhaps I’m misunderstanding your argument, but my counterexample would be: if a human digital artist transformed a Getty image, resulting in a fantastical, never-before-seen result, using software like Photoshop, that use would be no more defensible. If anything, the vast scale at which this occurs in AI makes it worse.
I think your hypothetical would depend on the character & extent of the transformation. Mere filters that leave the original recognizable? Probably an infringement. But creative application of transformations to express new ideas? Maybe not – especially if the derivative is a comment/parody on the original, that actually increases interest in it. Most art is a conversation with the past, reusing recognizable motifs & often even exact elements.

For example:

Andy Warhol died in 1987, 35 years ago. One of his 'Prince' collages dating to the early 80s used another photographer's photo, without permission. In 2019, one federal judge ruled that was not infringement. An appeals judge then said it was.

The Supreme Court has decided to take the case.

The US Copyright Office & Department of Justice agree with the photographer in briefs filed with the court... but the mere fact the Supreme Court took the case indicates they think there might be issues with the appeals court ruling. They might agree with the original judge!

Oral arguments come this October. See:

https://www.reuters.com/legal/litigation/us-backs-photograph...

So, when all the (possible) disputes over AI-training-on-copyrighted-images resolve – maybe in the 2030s or 2040s? – what will the laws say, & courts decide? It'll depend a lot on other specifics, & reasoning, that may not be evident now.

Thanks, that is a thorough and interesting reply.

I find legal disputes in fine art interesting, however—IANAL, of course—I understand that fine artists (Richard Prince comes to mind) are subject to very different copyright restrictions than graphic artists under commercial use.

It’s, as you said, up to courts to decide. But AI generated imagery is frequently commercial in nature (KFC, already). AI services are trained on unlicensed commercial stock images, and are able to reproduce enormous quantities of derivative images, and do so at a profit. I think that’s categorically different from a fine artist appropriating imagery in a single artwork or even series of artworks in an entirely different context.

These AI generated images are directly competing with stock images. AI tools are selling images to blogs and other customers that often would purchase stock images instead.

The "character of use" is not in favor of DALL-E; it is a commercial use.

Copyright law does not require Getty to block user agents or to ask them not to include its images.

Another issue here is that removing copyright management info like a watermark is a violation of the DMCA, separate from fair use or copyright infringement. These cases carry statutory damages and awards of attorney's fees.

Whether something is directly competing for the same business would have to be evidenced, and copyright doesn't mean protection from all possible competition - it's just one factor weighed. And fair use protects many commercial uses, too, depending on proportion/character-of-original/etc.

But also, none of these images are direct, or even necessarily substantial, "copies" of other images. The generator learned from other images – the same as any human artist might.

No watermark has been removed; the bigger issue may be that the spectral watermark violates a trademark. (But, I doubt consumers are likely to be confused.)

"The generator learned from other images – the same as any human artist might."

A lot of people seem to make this comparison, but I don't think it's fair. It's wrong. A computer is capable of ingesting/processing and "learning" from images at a rate no human can possibly come close to matching. To elaborate, it is not actually learning in the way we normally think of it, as its "brain" is completely different from a human's brain. It is doing something entirely different that should have its own word. Human artists learn from other human artists' work. An AI does something else.

It's also worth noting that the art the AI was trained on was posted online when the technology didn't exist (or if it did in some form it was not in the state it is in now). So an artist having posted their art online for public consumption can't be equated with somehow consenting to its consumption by a web scraper / AI.

It's great that human artists learn from, & introduce into their work, influences other than just patterns seen in other works.

But it's also great that AI artists can learn from more examples in a few minutes than a human artist might see in a lifetime.

To say that's "not actually learning in the way we normally think of it" is superficially true, but it doesn't mean it's "not actually learning", or necessarily any worse than typical learning. It's so new, & we barely understand how it works or what its limits are. It might be better in many relevant & valuable aspects!

Fair, I don't know what it's actually doing. I just know you can't equate it with anything a human does, and the use of the word "learn" is misleading, or vastly oversimplifies what is happening, to the point that it allows for false analogies.

That said, my main objection to this technology is that:

- The AI's work is based on human artists' work

- Companies are then profiting off of the AI's work

- The companies are indirectly?/directly? profiting off of artists' work

- The companies do not get artists' consent or compensate them in any way

- The companies are essentially stealing from artists

Companies should be forced to obtain the creator's consent when using art to train their models.

It’s going to be interesting to see what the stock companies will do. Maybe they will make their own image generator. Perhaps we will see a case based on the new factor that is AI. An AI is not an artist; the two can’t be conflated. A decent artist can churn out maybe 5-10 works if he is productive. AI can churn them out by the hundreds or thousands if needed. The process also isn’t the same.

Anyway it will be interesting to watch this space.

AI-generated images can't be copyrighted.
Given the iterative contribution of an artistically talented human prompter, I'm not sure that precedent – set by the Copyright Office in the US, rather than a clear statute or court decision – will hold up. A court might decide differently, or a statutory update could overrule the Copyright Office, especially in cases where an individual output is the mix of human & AI effort.
I have a hard time agreeing with 3, given https://ibb.co/DzGR063
Aside from whether the image itself is copyrighted, the use of the Getty watermark probably raises a bunch of issues.