Yeah but it’s the same issue. Open source licenses (just like other laws) weren’t designed for the age of LLMs. I’m sure most people don’t care, but I bet a lot of maintainers don’t want their code fed to LLMs!
Intellectual property as a concept wasn't designed for the age of LLMs. You have to add a bunch of exceptions to copyright (fair use, first sale) to keep it from immediately leading to scenarios that don't make any intuitive sense. LLMs explode these issues because now you can mechanically manipulate ideas, and this brings to light new contradictions that intellectual property causes.
I agree that commercially operated LLMs undermine the entire idea of IP, but it is one of the problems with them, not with the concept of intellectual property, which is an approximation of what has been organically part of human society motivating innovation since forever: benefits of being an author and degree of ownership over intangible ideas. When societies were smaller and local, it just worked out and you would earn respect and status if you came up with something cool, whereas in a bigger and more global society that relies on the rule of law rather than informal enforcement, legal protections are needed to keep things working sort of the same way.
I doubt anyone would consider it a problem if large-scale commercial LLM operators were required to respect licenses and negotiate appropriate usage terms. Okay, maybe with one exception: their investors and shareholders.
> IP is an approximation of what has been organically part of human society and drove innovation since forever: benefits of being an author and degree of ownership over intangible ideas.
It is not! It's a very recent invention. Especially its application to creative works contradicts thousands of years of the development of human culture. Consider folk songs.
> I doubt anyone would consider it a problem if large-scale commercial LLM operators were required to respect licenses and negotiate appropriate usage terms. Okay, maybe with one exception: their investors and shareholders.
And the issue I'm gesturing at is that you run into different contradicting conclusions about how LLMs should interact with copyright depending on exactly what line of logic you follow, so the courts will never be able to resolve how it should work. These issues can only be conclusively resolved by writing new laws to decide how it's going to work, but that will eventually only make the contradictions worse and complicate the hoops that people will have to jump through as the technology evolves in new ways.
> Especially its application to creative works contradicts thousands of years of the development of human culture. Consider folk songs.
First, let’s note that creative work includes a lot more than just arts (crucially, invention).
In music, by your logic you may disagree with recognising song composition as IP, but you have to agree that being able to earn royalties from businesses playing your performance (even if it is a cover) serves as a proxy for people coming to listen and express their appreciation to a performer back when audio recording didn’t exist.
Also, let’s distinguish IP in general from its current legal implementation, such as protections lasting longer than the author’s life. It should be noted that complexity in art has also grown since then, but it may or may not (I have no strong opinion here) make sense to grant the author posthumous protections.
> you run into different contradicting conclusions about how LLMs should interact with copyright depending on exactly what line of logic you follow, so the courts will never be able to resolve how it should work.
The courts can identify which uses of LLMs follow the spirit of the IP framework, encouraging innovation and creativity. As it is, current commercial LLMs slowly erode it, creating a feeling of “nothing belongs to anyone in particular, so why bother putting in the hard work”, profiting a minority of individuals while harming society over the longer term. It is not difficult to see how applying copyright as is could put an end to this, ensuring authors have control over their work, with the only consequence being slightly worse bottom lines at a handful of corporations with market caps the size of countries.
Yes. But here we are, people ignoring all the theft that has happened. People generating images from stolen art and calling themselves artists. People using it to program and calling themselves programmers. Also, it seems to me that so many people just absolutely ignore all the security-related issues that come with coding agents. It's truly a dystopia. But we are on Hacker News, so obviously people will glaze about "AI" on here.
Maybe we should get upset about people using cameras to take pictures of art on the same principles. And what about that Andy Warhol guy, what a pretender!
… so I hope you can see why I don’t actually agree with your comment about who’s allowed to be an artist, and not just dismiss me as a glazer.
Who is taking pictures of art and calling themselves an artist for that? People are generating images from stolen art and creating businesses off of that. People are faking being an artist on social media. But I shouldn't be surprised that people with no actual talent defend all of this.
Intellectual property theft? If gp’s referring to the Books3 shadow library not having been legally bought, it’s not realistically more than 197k books worth less than $10MM. And let’s not forget intellectual property rights only exist “To promote the Progress of Science and useful Arts.”
There's certainly some debate to be had about ingesting a book about vampires and then writing a book about vampires.
But I think programming is much more "how to use the building blocks" and mathematics than ingesting narratives and themes. More like ingesting a dictionary and thesaurus and then writing a book about vampires.
In my experience AI coding is not going to spew out a derivative of another project unless your objective is actually to build a derivative of that software. If your code doesn't do the same or look the same it doesn't really meet the criteria to be a derivative of someone else's.
I mostly use Cursor for writing test suites in Jest with TypeScript. These are so specific to my work that I don't think it's possible they've infringed on someone else's.
> unless your objective is actually to build a derivative of that software
The objective of a for-profit corporation may well be to build a derivative of free software, benefitting from the work volunteer engineers put into it. Previously, if that software was GPL, this would require a clean-room reimplementation of a massive codebase. With LLM laundering, such companies could just as well claim “we got this from Copilot” and they would be right (note that they don’t need to have used an LLM: the mere legality of these license-laundering LLMs means you can simply copy the code from a GPL codebase and claim that an LLM output it, which is hard to disprove given its non-deterministic nature). This goes contrary to the promise of copyleft licenses that volunteer contributors' work will remain to benefit the public and could not be expropriated this way, which led to the OSS explosion in the first place.