Hacker News
by Svoka 409 days ago
Thanks!

I wonder where this discrepancy comes from

2 comments

Probably under-indexing of non-English sources by these crawlers.

Would be interesting if Yandex opened some data sets!

And lots of people write on the web using English as a second language, which both reduces the presence of their native language and increases the presence of English.
Yep, not a native English speaker here, and yet my online footprint is mostly English due to software pushing me to learn it.
My guess is that reference counting at depth=1 only captures non-$LANG content whose text parts don't matter much, e.g. photo galleries.