The criticism is exasperation at those who feel entitled to make money off others' copyrighted materials, regardless of what title they might hold. ChatGPT, Copilot, and Stable Diffusion are algorithms; they are not entitled to learn from any material they want just because people anthropomorphize them and make baseless claims like "they learn just like people do!"
Opt-in would hugely limit the amount of material - possibly to the point where AI research becomes unviable for many applications.
This would impact pure research and other use cases which are likely to be of benefit to society as a whole.
Copyright was created with clear legal limitations (although those limitations are often eroded by corporate interests).
The "natural" state of man is without copyright and its imposition isn't a moral right - it's a legal trade-off that should carefully weigh up cost vs benefit.
> Opt-in would hugely limit the amount of material - possibly to the point where AI research becomes unviable for many applications.
Not necessarily a bad thing. If your tools require widespread breaking of existing laws, your tools are broken.
> it's a legal trade-off that should carefully weigh up cost vs benefit.
Cost: Loss of millions of jobs which are suddenly invalidated by the loss of copyright. Less content overall being created.
Benefit: AI can freely consume what content is left.
I think the cost is far too high.
EDIT: I'd also point out that the "natural" state of man is no laws at all - ownership of goods is enforced with only your strength of arms. I really don't want to live in that kind of world.
Copyright (see the copious evidence that training is not respecting copyrights or licenses across the latest commercial machine learning algorithms).
The same law you brought up? Laws which are written into national treaties?
Regardless of your beliefs about copyright and how it should be changed (or abolished), it is not just the law of the land, it is the law of the world.
They are indeed caught up in the AI hype and are hardly contributing anything other than trampling over the copyright of digital artists and driving everything to zero.
There is nothing fair use about training on copyrighted images, code, or music, having the model output contain 90% of the original training content verbatim, and calling it 'AI'.
I'd think it's more comparable to compression. No one would argue that a JPEG or MP3 file can't contain a reasonable representation of the original just because it is smaller.
Are you criticising the entire field of study - or just some uses of it?