by vlthr 1452 days ago
I have a really hard time understanding what future world this article (and other detractors) is arguing for and why that world is made better by taking their arguments seriously.

On the practical level I agree with the part advising caution to those who might end up embedding an identifiably licensed snippet in their codebase via Copilot. I also agree that Copilot users plagiarizing significant chunks of GPL code for profit is immoral. This needs to be prevented.

I also share the frustration stemming from big companies leveraging their disproportionate access to data and resources for profit, given that the greatest value of these models is precisely the open source code they are trained on.

Ultimately though, what I care about is the potential for building better tools. LLMs potentially offer paths towards genuinely new forms of human-machine interaction, and I don’t want that exploration to be suffocated by legalism.

1 comments

If one believes in the rule of law, surely it can't be true that Microsoft (and others) should get to enjoy all the benefits of intellectual-property laws (when convenient) and at other times steamroll them (when convenient).

Wealthy corporations are never going to be "suffocated by legalism" — they can afford to do their research privately. (And many still do.) The issue here is that Copilot is being foisted into the agora seemingly without sufficient (or maybe any) scrutiny of its legal consequences.

More broadly, we are seeing a norm emerging where there is so much hype chasing AI that these wealthy corporations (see also: Tesla, of course) have a huge incentive to push their experiments into the public sphere prematurely, simply to assert their primacy.

BTW this technique of front-running regulatory scrutiny can still backfire. If the initial public experience with an emerging technology is sufficiently bad, it can poison the acceptance forever. IOW, you can be suffocated by your own hubris faster than any external legalism.

I’m definitely not worried for Microsoft or the other big tech companies developing Copilot-like products. To the extent that legal blowback focuses on issues that are both impactful and solvable (e.g. plagiarizing non-trivial snippets), they should be held to a high standard. Your point about the risk of poisoning the public’s acceptance of these technologies also resonates with me.

What worries me the most is the effect the public backlash towards these big companies can have on smaller actors that could enter this space in the near future. In the past we’ve seen open source projects like GPT-J come together to fund and reproduce closed models, and if we’re not careful to be nuanced in our criticism of big-tech frontrunners we might end up poisoning the waters enough to deter small actors without a dedicated legal team.

Copyright law is ultimately designed around humans as the only kind of actor. In an ideal world we would sit down and think about the way non-human learners should fit into this system and the balance of tradeoffs we want those laws to aim for. I hope that happens someday, but until then I hope we can cultivate a world where small actors are able to experiment with these technologies without fear of legal action.

That’s why it bothers me to see people arguing that language models should be thought of like human programmers making derivative works, even suggesting that we should require attribution for all generated outputs (i.e. the entire training set, always). That helps nobody, except of course big companies with infinite manpower.