by bryanrasmussen 1819 days ago
>What happens when it verbatim outputs a significant amount of code that was originally MIT, or BSD, or GPL licensed without the appropriate attribution

You would sue.

And then GitHub would argue that their algorithms did not spit out the code verbatim by copying, but rather generated code that looked exactly like the other code based on learning from millions of codebases.

And then there would be lots of lawyers.

And then a judge would have to decide.

2 comments

And the judge would really not care about the "we did not copy it, we made an algorithm that created the exact code" technicality. It's their job to see through such things and consider the case at a higher level.

So the judge would look at two pages of exactly the same code and then decide whether the "not really copied" part is big enough to be considered an original work or not. If it is big enough, it is a copyright violation. Nobody cares that you used an algorithm in between: you took the original as an input and ended up with exactly the same thing as an output. Copyright violation, case closed...

But it would still potentially have used code not licensed for commercial use as a data set for a commercial product, which is problematic.

GitHub really needs to clarify which code was allowed for inclusion here. Until then we can only speculate and enumerate potential scenarios.