by adamcanady
3326 days ago
I work on Google Search. Having one search engine per country doesn't seem like the correct approach to the problem.

> Every country has a mammoth collection of valid results for your query.

Having seen the corpus of content available in each language, I can say this is categorically false. Consider Wikipedia, a fairly ubiquitous information source on the web that answers tons of searches. English documents: 5.4M. Romanian documents: 376k.

Perhaps the solution to the OP's woes is more tools for filtering. This post conflates language and country. Search typically returns results and search features in the query language. Today's solution, where filtering is done through query refinement and query operators, seems to cover a lot of use cases already.

Further, with an integrated product, ML models can learn behaviors specific to certain locales where those behaviors differ from region to region, and balance them with more universal behaviors that apply to more than one region.
Or to put it another way, 376k Wikipedia documents in Romanian is about 50% more than the number of articles in the last print version of Encyclopedia Britannica. The dismissal of their significance may express a worldview bubble that is endemic to Google and its business model.
Just because 376k Romanian documents is not enough to train a data center in how to sell Chia Pets to Bucharestians doesn't mean that it is not a significant repository of information for actual human beings.