| HN Mirror

by joe_the_user 1817 days ago

Besides that, I cannot understand why the overall idea, using open-source projects to train an ML model that generates code, would ever be a problem. We human beings learn the same way the model does: we read others' code, books, articles, design patterns... and it becomes part of us.

It's an interesting question.

1) When a human being reads code or a CS textbook, we think of them as extracting general principles from the code and so not having to repeat that particular code again. In contrast, what GPT-3 and Copilot seem to do is just extract sequences of little snippets, something that apparently requires them to regurgitate the text they've been trained on. That seems rather permanently dependent on the training corpus.

2) Human beings have a natural urge, a natural ethos, to help people learn. It's understandable. The thing is, when suddenly you're not talking about people but machines, the reason for this urge can easily vanish. Even if GitHub was extracting knowledge from the code, I wouldn't have a reason to help them do so, since that knowledge would be entirely their private property. They expect to charge people whatever they judge the going rate to be, so why should anyone help them without similar compensation? That this is being done by "OpenAI", a company which went from open-nonprofit to closed-for-profit in a matter of a few years, should accentuate this point. We're nowhere near a system that could digest all the knowledge of humankind. But if we got there, one might argue the result should belong to humankind rather than to one genius entrepreneur. And having the result belong to one genius entrepreneur has some clear downsides.