Hacker News
by scifibestfi 1241 days ago
That's huge. What takes up most of that 40 GB+?
3 comments

1. It is basically a regional big tech do-it-all company.

2. Git repos tend to get rather large over time, and Yandex has been around since before Git existed.

There is no Git history in the leak, just a lot of code, test data, occasional binaries, ML training data, and a few pre-trained models.
Yandex tends to store third-party code in its repository. For example, you can find the Spark sources in it. So some part of the 40 GB+ of code wasn't written by Yandex.
The frontend archive takes 18 GB; everything else is more like 1-2 GB.