Shoot dude, the engineering organization I mentor/teach at a high school has ~75 internal repos.
Robot source code; satellite ground station hardware; satellite ground station software; visualization; satellite hardware; satellite software; nuttx + its submodules for 2 different projects; linux kernel fork; circuitpython fork; raspberry pico tools fork; embedded programming/debugging tools; my lecture notes; my automated grading tooling; etc etc etc. That's just me + ~35 students in classes.
Pretty easy to see how when you have scale you can get to a few thousand.
It's normal that a dev has *access* to all the code.
But did he clone all the repos into his machine? I doubt it. So, the hacker extracted all the 3800 repos using the employee's machine as a gateway? I doubt it as well, I'm sure they would have detected this huge amount of data much earlier than transferring all of it?
> The real question is why github has 3800 internal repos.
They’ve been developing git and GitHub for over a decade. It really isn’t surprising they have made thousands of internally available repos. They probably have hundreds just for running automated tests alone.
Depends how it's set up. Many companies add an IP address check so if you don't come via their VPN (or are not in the office) the connection will be rejected before any auth is asked.
So you'd need to authenticate for the VPN, which often has 2nd factor.
Robot source code; satellite ground station hardware; satellite ground station software; visualization; satellite hardware; satellite software; nuttx + its submodules for 2 different projects; linux kernel fork; circuitpython fork; raspberry pico tools fork; embedded programming/debugging tools; my lecture notes; my automated grading tooling; etc etc etc. That's just me + ~35 students in classes.
Pretty easy to see how when you have scale you can get to a few thousand.