French is my mother tongue, but I quickly learned during my studies that using English keywords in my STEM-related searches simply leads to better (and more abundant) results.
A few weeks or months ago, however, while I was trying to solve an issue with a colleague who searches using French keywords, I noticed that some of the websites on the first page of the Google results looked off.
In short, they were machine-translated versions of Stack Overflow threads, and they appeared in most searches using French keywords.
Those websites had occasionally appeared in my searches with English keywords too, but most of the time I never bothered opening them. Now I notice them every time.
Some examples:
When searching for "wget set http proxy" on Google, the fourth result leads me to qastack.fr and the ninth to it-swarm-fr.com. Both are websites featuring scraped and machine-translated threads from Stack Overflow.
When searching deliberately in French for "Eclipse CDT stdout ne s'affiche pas" ("Eclipse CDT stdout not displayed [in console]"), the first result leads me to askcodez.com and the fourth one to qastack.fr (askcodez is the same kind of site as the other two).
I have not stumbled upon GitHub clones yet, however.
One huge help here is uBlacklist, which has filter lists for search engine results. Of course, the Chrome version will be crippled more as Google feels the knife in its revenue artery, so FF is advised!
I read that you can achieve similar results with uBlock Origin, but I ended up using uBlacklist because it lets you block websites on the go and supports multiple search engines. It works perfectly; one more reason to use FF, I guess.
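For anyone curious, the basic idea of a domain blocklist for search results is easy to sketch. This is only an illustration in Python, not how uBlacklist or uBlock Origin are actually implemented: the function names `is_blocked` and `filter_results` are made up, the example URLs are hypothetical, and the only domains listed are the ones named in this thread.

```python
from urllib.parse import urlsplit

# Domains mentioned in this thread (assumption: blocking by host name is enough).
BLOCKED_DOMAINS = frozenset({"qastack.fr", "it-swarm-fr.com", "askcodez.com"})


def is_blocked(url, blocked=BLOCKED_DOMAINS):
    """Return True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocked)


def filter_results(urls, blocked=BLOCKED_DOMAINS):
    """Keep only the result URLs whose host is not on the blocklist."""
    return [url for url in urls if not is_blocked(url, blocked)]


if __name__ == "__main__":
    # Hypothetical result URLs, for illustration only.
    results = [
        "https://stackoverflow.com/questions/1",
        "https://qastack.fr/programming/1/example",
        "https://www.askcodez.com/example.html",
    ]
    for url in filter_results(results):
        print(url)
```

Matching on the host name (rather than a plain substring of the URL) avoids accidentally hiding unrelated sites whose address merely contains a blocked name.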
I don't have an example search, although I'll try to remember to update this comment the next time it happens. On average I come across these things at least once a day, but it depends on what I'm working on. It tends to happen when searching for more obscure bugs: there is a GitHub issue, but for whatever reason it isn't ranked highly on Google, while these spam sites are.
GitMemory is probably the most well-known example; it's just a thin layer over the GitHub API with a completely garbage UI, yet it often ranks higher than GitHub itself.