by mtlynch
1132 days ago
> Open source code on GitHub might be thought of as “open and freely accessible” but it is not. It’s possible for any person to access and download one single repo from GitHub. It’s not possible for a person to download all repos from Github or a percentage of all repos, they will hit limitations and restrictions when trying to download too many repos. (Unless there’s some special archives or mechanisms I am not aware of).

There actually is a convenient archive for accessing GitHub-hosted code in bulk. All GitHub source code is available for bulk analysis in Google BigQuery.

https://cloud.google.com/blog/topics/public-datasets/github-...

I still don't support GitHub training Copilot on other people's code without permission, but this particular part of OP's argument is incorrect.
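
For anyone who wants to try it, here is a rough sketch of what a bulk query might look like from Python. It assumes the public dataset is `bigquery-public-data.github_repos` and that its `licenses` table has `repo_name` and `license` columns; check the schema in the BigQuery console before relying on either. The helper names are made up for illustration, and actually running the query needs the `google-cloud-bigquery` package plus a GCP project with credentials. As far as I know, the dataset covers repositories with a detected open source license, not literally every repo.

```python
# Sketch only: the table and column names below are assumptions about the
# public dataset's schema and should be verified before use.
LICENSES_TABLE = "bigquery-public-data.github_repos.licenses"


def build_license_query(limit: int = 10) -> str:
    """Return SQL that counts repositories per license (hypothetical helper)."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT license, COUNT(*) AS repo_count\n"
        f"FROM `{LICENSES_TABLE}`\n"
        "GROUP BY license\n"
        "ORDER BY repo_count DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_license_query(limit: int = 10):
    """Run the query; requires google-cloud-bigquery and GCP credentials."""
    # Imported lazily so the query builder works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    rows = client.query(build_license_query(limit)).result()
    return [(row["license"], row["repo_count"]) for row in rows]


if __name__ == "__main__":
    # Print the SQL rather than running it, since running needs credentials.
    print(build_license_query())
```

The point is that one aggregate query touches every repo in the dataset at once, which is exactly the kind of bulk access that cloning repos one at a time would get rate-limited on.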