by lindenksv1
1318 days ago
Kate Downing here. This is an excellent question. So, just like YouTube, GitHub would likely argue that they are protected by the DMCA and that so long as they comply with DMCA takedown requests, they are not liable for copyright infringement (direct or indirect) for third-party content posted to GitHub by people other than the copyright owners. Remember that the DMCA effectively shifts that due diligence you speak of away from providers of online services and onto copyright holders themselves. Without the DMCA, many businesses that rely on user-generated content just wouldn't exist because that due diligence isn't possible at scale - it's often not even possible for individual pieces of content because the publication of any copyrighted work can be very obscure and because in the US you can hold a copyright without formally registering it.

In practice, I think the entire open source world knows that people post each other's open source code on GitHub. Even projects that have very purposefully chosen to primarily use other services or self-host their source code are well aware that their code gets mirrored on GitHub and/or included in other people's repos on GitHub. Up until now, I don't think this has been controversial and I don't think GitHub gets a lot of takedown requests for this practice. I think most developers see this as a feature, not a bug.

Copilot might make people rethink whether or not they want to start sending takedown requests, but that'll be a tough call for a lot of people because withholding code from GitHub to avoid its usage in Copilot also effectively means making their code less easily available to the rest of the world. It may be very disruptive to other projects that include the copyright owner's code in their own projects.
If my code were uploaded to GitHub, I would send a DMCA takedown because of Copilot, but it wouldn't matter because the code is already in the model. So the DMCA does not help here.
The only way it would help is if I could DMCA the entire model and force them to retrain without my code. As it stands, this lawsuit is the only way for GitHub to be reined in; I don't have the resources to do so on my own.
IANAL.
Also, about high impact: suppose Copilot has 1 million users who each use it on average 10 times a day, 5 days a week. That is 10 million uses per day. You claim that less than 1% of uses of Copilot would result in copyright violation. Let's assume 0.1%. How many times would copyright violation happen per day? It would happen 10,000 times per day, five days a week, or 50,000 times per week.
It would take a mere twenty weeks (less than six months) to reach a million violations.
That seems impactful.
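
For anyone who wants to check the arithmetic above or plug in different guesses, here is a rough sketch. Every input is one of the hypothetical figures from this comment (1 million users, 10 uses a day, 5 days a week, a 0.1% violation rate), not measured data, and the function name is made up for illustration.

```python
from fractions import Fraction


def violation_estimate(users, uses_per_day, violation_rate, days_per_week, target):
    """Back-of-the-envelope estimate; all inputs are assumptions, not real data.

    Returns (violations per day, violations per week, weeks to reach target).
    """
    daily_uses = users * uses_per_day
    per_day = daily_uses * violation_rate
    per_week = per_day * days_per_week
    weeks_to_target = target / per_week
    return per_day, per_week, weeks_to_target


# Hypothetical figures from the comment above. Fraction keeps the math exact.
per_day, per_week, weeks = violation_estimate(
    users=1_000_000,
    uses_per_day=10,
    violation_rate=Fraction(1, 1000),  # 0.1%, assumed
    days_per_week=5,
    target=1_000_000,
)

print(f"{per_day} per day, {per_week} per week, {weeks} weeks to reach 1,000,000")
```

Changing the assumed rate scales everything linearly: at 0.01% it would take 200 weeks instead of 20.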