I sometimes worry that qualms about copyright & ethics will make us lose the machine learning arms race with China. If « The unreasonable effectiveness of data » still holds true then we are in big trouble.
Don't worry, in China they need to worry about a different set of "ethics". It's not just that your LLM should never speak against Xi. They probably also need to include a big propaganda corpus so that the models are aligned with the CCP's values, which is very likely to be bad for reasoning capabilities.
Of course they aren’t the customers. That doesn’t stop them from having a disproportionately loud voice in the matter.
Once you really understand what ESG is, you will understand why every company gets a rainbow logo this month, why companies seem eager to push their actual customers aside, and why the constantly-offended are catered to instead of mocked.
Look at every AI press release. They all use “safety” over and over. Without that dogwhistle, they wouldn’t be in line for investment funding.
No need to worry about it happening, it already has. In terms of diffusers alone:
Stable Diffusion 3 came out recently and immediately fell flat on its face. Terrible anatomy and basically unusable for human forms unless you use so many “unsafe” negative words that its “safety” training doesn’t destroy the result.
In the meantime… Lumina, PixArt, and Hunyuan are all Chinese projects or have heavy contributions from Chinese researchers and companies. These are rapidly gaining steam. They’re far less “safe” with far less lobotomization.
We are already losing the AI race in this realm exactly because of an over-reaction to “safety”.
1. “AI is going to (help someone) take over the world”. We don’t need this yet but it’s better to think about it before you need it rather than afterwards.
2. AI is going to do genuinely bad things: generate CP, teach people to make dangerous chemicals.
3. AI is going to do things that people in California think are dangerous: generate a picture of a boob, generate a picture of a group of people who are not a mixture of different ethnicities or sexes, give an answer to a question that isn’t aligned with how people in California think.
China is not concerned with any of these, although it is concerned with its own version of point 3:
3b. Generate criticism of the CCP or anything that the CCP doesn’t like people to know about.
You think OpenAI, Meta or Microsoft care about those things? They've already trained on all the world's copyrighted data without asking for permission. Any hand-wringing you see from them is performative.
Why should AI need permission to perform the equivalent of "human looking" at content? We humans don't need permission, why should AI?
I could paint a picture of a Disney character after looking at nothing but public promotional material already out in public view, online or in print etc. The tools I use to paint with are capable of all the same colours and lines used in Disney characters. We don't blame the tools for that, but we seem to be blaming AI for being too good?
Selling a picture you made of a Disney character is IP infringement. If the work you create is substantially derivative and competes against the IP holder's work (which is pretty arguable given the prevalence of prompts containing the names of specific artists/characters/companies), then it is infringement. The same would be true if I paid you to produce art/merch/etc. of a Disney character.
I didn't mention selling the picture I made of a Disney character. I mentioned only drawing it.
Likewise, AI can draw the Disney character. You as prompt author may have no desire to sell or share the drawing. It's your karma if you choose to cheat, deceive or steal in any area of life.
I am not a fan of AI-gen "art". But I object to pitchfork policy, banning and crippling AI from training on actual things in the world - IP protected or otherwise. I think AI should see everything we humans publish, so it knows us better than if we limited its learning.
> I didn't mention selling the picture I made of a Disney character. I mentioned only drawing it.
If you are paying for an AI service to produce art based on your prompts, then they are selling the art to you, regardless of what you do with the image afterward. It's no different than going to an artist, describing an image you want, and paying them for what they produce. Generating images with prompts doesn't make you the artist, it makes you the commissioner. And if that commission involves a copyrighted character, then you have a legal issue.
> I think AI should see everything we humans publish, so it knows us better than if we limited its learning.
In an ideal world, I would agree. Unfortunately, the pressures of capitalism change the circumstances somewhat. As evidenced by the board shakeup, OpenAI doesn't care about lofty goals of bettering humanity, they want to make money. They built a plagiarism engine that will steal and sell the work of millions of artists, putting them out of business. In a better world where artists' ability to feed themselves was not tied to their art production, then fine. But in this one, we're destroying the livelihood of millions to make some tech bro thieves richer and flood the world with shitty "art". We have to consider the impact that technologies like this will have on society, given the way society is currently organized.
What if we flip the script - would it be cool if Disney scraped DeviantArt and created merchandise, movies, etc. based on the art and characters of small creators? Rules need to exist to protect the little guy.
Technically a lot of the merch people sell at conventions is infringing, but Disney, Nintendo, etc. recognize that pursuing action against them would ultimately be worse for their brand.
Generally speaking, whether a fandom existing around a creator's work is good is ultimately up to the creator. If the fandom economy chokes out the product of the artist, then the artist should have legal recourse (this is effectively the same scenario as counterfeits / knock offs). But if the creator's work is able to flourish alongside the fandom, then the creator may well decide that keeping the fandom engaged is worth more than being a legal prude. The protection should exist legally, and the creator should be allowed to exercise it as they deem necessary.
> would it be cool if Disney scraped DeviantArt and created merchandise, movies, etc. based on the art and characters of small creators?
You've more than flipped the script. You've introduced a global corporation as the "prompt author", feeding off the work of small creators. It's not a fair equivalence because suddenly the "prompt author" has major distribution partners, reach, budget and branding to propel their deception around the world.
Where exactly are all these plagiarising AI-gen artists anyway? Professional concept art involves highly intentional detail, themes and thinking. AI-gen renderings are more like rolling a dice and saying "cool! check this out!" It's not the same game.
> Why should AI need permission to perform the equivalent of "human looking" at content?
Why should a car need permission to perform the equivalent of "human walking" down the sidewalk? A car converts fuel into kinetic energy and uses friction to propel itself forward, just like a human does.
The thing is, we don't refer to what cars do as "walking" because the difference is very visible to us. The difference is not so visible with ML algorithms, so people keep trying to compare them to humans, when the two aren't comparable.
We need a new vocabulary to describe what these ML algorithms are doing with data.
So... You're likening AI to cars that need registration to drive on roads; in other words AI needs permission to train on the world's content. I like this analogy.
You had me stumped for a minute. But whenever a car drives from A to B, the "damage" is done. Fuel spent, occupants transported, pedestrians given way, road surfaces slightly worn. When AI does its training thing, no damage has occurred. There is only potential damage later if someone decides to misuse what AI has learnt. I wonder if this makes me an optimist. I want AI to know more, because it will be better for us humans when we use it responsibly.