Good question. I am a Russian speaker - ran the same searches on google.ru and got much better results (than google.se). Were Google.ru's results better than Yandex? Probably not.
Perhaps the problem isn't quite as fundamental (non-English NLP sucks) but has more to do with secondary Locale handling in specific countries?
Interesting observation on US-centricity (is that a word?) nonetheless. Many major US corps derive > 50% of their revenue from outside the US. Not sure if GOOG is one of those.
You raise a very interesting point. Just because a lot of American companies draw a large amount of income from abroad does not mean they respect the market or adjust the product to that market. In fact, US imperialist policy (and that is simply what it is; don't take me for a rebel commie, but a country that has been at offensive war for most of its history cannot be called anything but imperialist) has ensured that other countries are force-fed American products, right after we liberize them, as in liberate for immediate colonisation.
Yeah, hardly scientific, but it's at least anecdotal evidence that there's some room for improvement. Like you I wonder how different the results are with google.ru though. I live in Ireland, but sometimes do queries in Dutch. Google.nl often gives different and much better results for those than google.ie.
Also slightly off-topic: I appreciate that English isn't the author's first language, but, wow, that was painful to read. I feel his point might come across a bit stronger without sentences like
"Or perhaps a painkiller should be administered to dispense the suffering of google as we strike with the mercykill?"
or
"Perhaps it is time for google to crash down on the matress of bitter disappointment as well, a la black swan."
We don't know what the settings of his computer are (the fact that the UI is in English makes me guess he has the "hl=en" setting). He should add "&hl=ru" to the URL to make sure he is searching the Russian version of Google.
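For anyone who wants to try this without hand-editing the address bar, here is a minimal sketch of building such a URL. Only the `hl` parameter comes from the comment above; the host, the `/search` path, the `q` parameter, and the helper name `google_search_url` are assumptions for illustration, and nothing in this thread establishes how much `hl` by itself changes the ranking.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def google_search_url(query, hl="ru", host="www.google.com"):
    """Build a search URL with an explicit interface-language parameter.

    `hl` is the parameter mentioned in the comment; the host, path and `q`
    are assumed here, not confirmed by the thread.
    """
    params = {"q": query, "hl": hl}
    # urlencode percent-encodes non-ASCII text (e.g. Cyrillic) as UTF-8.
    return f"https://{host}/search?" + urlencode(params)


if __name__ == "__main__":
    url = google_search_url("библиотека")
    print(url)
    # Round-trip check: the query string decodes back to the original values.
    print(parse_qs(urlparse(url).query))
```

Calling it with `hl="en"` and `hl="ru"` for the same query gives two URLs to compare side by side, which is the comparison the article skipped.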
That's the point of the article: there is no such problem in Germanic languages and the many other European languages that the Google algo has been tweaked for. Search on Google in English and Swedish, without any URL string shenanigans, and you get good results for both.
Why should it matter? Both English and Swedish work well in the "Swedish" Google search box in Firefox. That's the point: the Russian algo, and the redirection to it, is simply not there yet.
Great article about language as the last man standing against globalism. If you ever have to manage the translation of a piece of software into a different language, you will find two kinds:
One in Western characters, where you can get a feeling for the translator's work, and one in different characters like Russian, Japanese, Chinese, or Arabic, where you don't get the ghost of an idea what is written there.
For search engines this is much harder: if you don't employ a sufficient number of native speakers for these languages, you will never get sufficient results. All other employees can't help you out. Translation software is useless if you don't know how to weight the possible results from the cultural canon.
A Cyrillic reader expects to find the Russian State Library on top when looking for a library, not the Library of Congress.
Gabe Newell had an interesting take on this. [1] He was mainly talking about piracy and why it was so prevalent in Russia, but his point was that American companies are generally pretty terrible at localization. The reason Russians pirated as much as they did was that the crackers made much better localized versions than did the game companies themselves.
!!!! Exactly!
> if you don't employ a sufficient number of native speakers for these languages, you will never get sufficient results. All other employees can't help you out. Translation software is useless if you don't know how to weight the possible results from the cultural canon.
Most of all, Google has been unable to find information in my blog (it is in Russian, and it is a LiveJournal one) for a long time, while Yandex still can (Yandex Blogs).
Google search is quite forgetful. After a certain amount of time you just cannot find something you need.
Well, for Yandex, LiveJournal is as important as .gov domains are for Google.
Livejournal.com is very important for Russians on the internet and completely irrelevant for everyone else.
It's where American teenage goths write diaries about fat issues for us Yankees. In Russia, a lot of important people blog there, with thousands of readers and followers.
Well, comparing a search for a Russian word in the US version of Google with the same search in Yandex, these are not unexpected results... He should compare to the Russian version of Google.
> Europeans have invented PHP, Linux, Python, C++, Ruby on Rails...
Sure, but Rasmus Lerdorf, Linus Torvalds, Guido van Rossum, Bjarne Stroustrup and David Heinemeier Hansson all now live in the US which IMHO says a lot.
I think, despite all the article's flaws, that point you mention is very valid: Google is a very American product, with the good-enough-for-us-means-good-enough-for-the-rest-of-the-world attitude.
I can't help but think using the Swedish Google to search for Russian is hardly a perfect example of science here.
Why isn't he using google.ru? I have no idea if it would return different results, but his current article doesn't prove as much as he thinks it does.
(Although, Library of Congress? Seriously?)