by atonse
2170 days ago
This is so awesome, but what surprises me most is that all the public source code on GitHub totals only 21 TB. I forget that they fundamentally host text, not video etc. I somehow thought it would be petabytes. The private repos might add up to more than that, but those have historically been paid.
Even a naive deduplication might yield some very interesting results.
Reminds me of a time I caught someone using someone else's code in an interview and passing it off as their own. (Using it was fine; it was the claim that it was theirs that bugged me.)
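
On the naive deduplication point: one rough way to try it on a local checkout is to hash every file's bytes and group identical digests. This is only a sketch under my own assumptions. `find_duplicates` is a made-up helper name, it catches byte-for-byte copies only (no near-duplicates, no renamed variables), and it says nothing about how GitHub or the archive actually stores data.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under `root` whose contents are byte-for-byte identical.

    Hypothetical helper for illustration: returns a dict mapping a SHA-256
    hex digest to the list of paths sharing that digest, keeping only
    digests that occur more than once.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Reads the whole file into memory; fine for source files,
            # a chunked read would be safer for large blobs.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates("some/checkout")` (the path is a placeholder) gives back only the groups with two or more identical files, which is roughly where vendored libraries and copy-pasted snippets would show up.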