by pierrefar 3463 days ago
(Former Googler here: I worked on this topic and now help businesses with this kind of question.)

The short answer is to focus on what is best for your users and what they expect, but in some cases there are a couple of things you need to do for best indexing.

To answer your last question first, if your server can respond to a URL and browsers fetch it successfully, Googlebot and indexing will be fine. Want to use UTF-8 for fully localized URLs? Go for it. Want to stick to ASCII? Sure. What's best for your users is the answer.

Trends across languages is a very odd request. Take the string of characters "chat". In English it's used in many different ways: as a verb, as a noun, etc. The same string in French means "cat". You really wouldn't want to look at trends for this string by combining English and French.

Now your biggest question: a search engine tries to identify the language of the pages in its index, and also the language the searcher is using. To illustrate:

1. The "chat" example is perfect for this, and the same goes for any number of other queries.

2. Take someone searching from Switzerland. Wouldn't it be better to show them pages in their preferred language, be it German, French, or Italian?

3. Take someone looking for a specific business. If that business has pages for its UK, Australia, and USA subsidiaries, it would be great to show searchers the right country page if they're searching from the UK, Australia, or USA, even if all are using English queries and these localized pages are also all in English. For that, you'll need hreflang annotations, which also work across languages within the same country (Switzerland) or across languages regardless of country (global pages in English and Spanish).
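To make point 3 concrete, here is a minimal sketch of generating hreflang annotations as `<link rel="alternate">` tags for a page's `<head>`. The `hreflang_links` helper and the example.com URLs are made up for illustration; the language-region codes and the `x-default` fallback are the standard hreflang values.

```python
from html import escape


def hreflang_links(alternates, default_url=None):
    """Build one <link rel="alternate"> tag per language/region variant.

    alternates maps an hreflang code (e.g. "en-GB") to that variant's URL.
    default_url, if given, is emitted as the "x-default" fallback.
    """
    tags = []
    for code, url in alternates.items():
        tags.append(
            f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        )
    if default_url is not None:
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(default_url)}" />'
        )
    return tags


# Hypothetical URLs: same language (English), three different countries.
ENGLISH_PAGES = {
    "en-GB": "https://example.com/uk/",
    "en-AU": "https://example.com/au/",
    "en-US": "https://example.com/us/",
}

if __name__ == "__main__":
    for tag in hreflang_links(ENGLISH_PAGES, default_url="https://example.com/"):
        print(tag)
```

The same helper would cover the Swiss case with codes such as `de-CH`, `fr-CH`, and `it-CH`. Each variant is expected to carry the full set of tags, including one pointing at itself.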

2 comments

> Trends across languages is a very odd request

Again, I think I am failing to convey my request properly.

Take a look at this: https://www.google.co.in/trends/topcharts [0]. This is the list of top keywords being searched in India. And they are all in English!

I would like to see a similar list for keywords that were actually typed out in Hindi (or any other language I choose to target). In essence, keyword trends in Hindi. This isn't so much about search volumes/keyword analysis/other-marketing-terms as it is about simply trying to get to know the target audience and their demands.

With the internet so dominated by English, most people who frequently use the internet have picked up enough English to get their jobs done. A lot of them search for Hindi terms in English and are decently satisfied with the results (or else they ask their neighbors/relatives for help).

So then, who are the users still looking up terms in Hindi, and what are they searching for? Content discovery, at least for Hindi, is a big pain in the ass, and most of the regional content is click-bait/spammy [barring maybe the regional newspaper websites].

So for anyone trying to address this pain, knowing the current demand will help them target this niche and serve it better. It will:

a) Encourage these users to keep using their native language and not switch to English, thus increasing demand for more native content.

b) Encourage others to join the internet and look things up in their native language. Even the ones who are currently using broken English might choose to switch to the language they are more comfortable with.

c) Allow publishers to grow along with the crowd and keep improving their services, further expanding markets.

0 - Even though I am taking the example of India/Hindi here, I believe the situation would be true for a lot of other country/language combinations. Please correct me if I am wrong here.

Append the hl=hi query string parameter to the page URL to get it in Hindi (hi is the language code for Hindi). Example:

https://www.google.com/trends/home/all/IN?hl=hi
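A small sketch of doing the same thing programmatically, using only Python's standard library. The `with_language` helper is a made-up name for illustration. It only sets the hl parameter, i.e. the interface language of the page being requested.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url, lang):
    """Return url with its hl query parameter set to lang.

    Any existing hl value is replaced; other query parameters are kept.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "hl"
    ]
    query.append(("hl", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_language("https://www.google.com/trends/home/all/IN", "hi"))
```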

This page is simply translating the earlier page to Hindi, am I right?

Is there any way to get the (top/currently trending) search terms typed out in Hindi?

E.g., see [0]. While Flipkart is a frequently searched term on Google, फ्लीपकारट (the same word in Hindi) is hardly ever searched. Similarly, the search frequencies of नौकरी, naukri, and job are very different (all three mean "job": the first is pure Hindi, and the second is a phonetic transliteration of the first that could be called Hinglish).

Similarly, the English terms currently trending in India will be very different from the ones that were looked up in Hindi. And AFAIK, there's no way to see popular keywords filtered by the input language used.

0 - https://www.google.com/trends/explore?geo=IN&q=%E0%A4%AB%E0%...
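For anyone puzzled by the %E0%A4%AB... in that link: it is just Devanagari text, UTF-8 encoded and then percent-encoded. A rough sketch of building such an explore link for a single term, assuming the geo and q parameters behave as in the link above (`explore_url` is a made-up helper, and the truncated link itself is left as posted). This compares individual terms only; it does not filter trends by input language.

```python
from urllib.parse import quote, unquote, urlencode


def explore_url(term, geo="IN"):
    """Build a Trends explore link for one search term (parameters assumed)."""
    return "https://www.google.com/trends/explore?" + urlencode({"geo": geo, "q": term})


if __name__ == "__main__":
    # The first Devanagari letter of the Flipkart example, percent-encoded.
    print(quote("फ"))
    # Decoding goes the other way.
    print(unquote("%E0%A4%AB"))
    # A full link for one of the terms mentioned above.
    print(explore_url("नौकरी"))
```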

Idk, India is a weird example because the country is largely bilingual.
The Norwegian version of that page shows Norwegian queries, so it seems that Google has localized editions.

https://www.google.no/trends/topcharts

Nope, it is much worse than that. Google will almost never show you the locale you choose, only what they think you need.

When I visit your link I get a Norwegian page showing me trends for Danish stuff: "top Danske mænd", "Danske politikere".

Google is terrible in this regard. It is impossible to look for things outside the sphere Google assigns you.

You generally can't link to a Google site and expect the viewer to see the same content you do.

Do it with a private session. No cookie. No bubble.

It helps, but I'm still in my IP bubble.

If I have an online store in multiple languages, how does the spider index the different languages? Users can change languages by toggling them in a dropdown in the UI.