Hacker News
by disabled 1857 days ago
> trained on 75+ languages, can transfer knowledge between languages

There is zero possibility that Google accomplished proper "language transfer" with the vast majority of Silicon Valley programmers being native English speakers.

In some languages, if you accidentally get a single syllable wrong in a sentence, you can end up saying something entirely different--and extremely embarrassing. This is the case with many Slavic languages.

This is a memorable "classic" [1]:

> "Tony Henry belted out a version of the Croat[ian] [national] anthem before the 80,000 crowd, but made a blunder at the end. He should have sung 'Mila kuda si planina' (which roughly means 'You know my dear how we love your mountains'). But he instead sang 'Mila kura si planina' which can be interpreted as 'My dear, my penis is a mountain'."

Many languages are much more grammatically complex than English, and also carry an unbelievable amount of implicit contextual information in their grammatical morphology. Slavic languages tend to be this way. The Slavic language that I speak, Croatian, tends to be very clean, direct, and concise, while being extremely complicated grammatically. We also have a lot of different words for the same thing in Croatian, which, in combination with the complicated grammar, makes it a very expressive language. English, however, can be more expressive in the sense that it allows for more figurative language, such as idioms.

[1] BBC: Anthem gaffe 'lifted Croatia': http://news.bbc.co.uk/sport2/hi/football/7109058.stm

3 comments

Modern NLP architectures do not explicitly model language structure. Even in English, the model isn't directly told anything about how words work. So the native language of the human authors of the model is (in principle) irrelevant to how effective the system is.
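
To make that concrete, here is a rough sketch of my own, not how Google's system actually tokenizes. It assumes a hypothetical byte-level vocabulary, whereas real systems typically learn a subword vocabulary from data. The helper name `to_byte_ids` is made up for illustration. The point is only that English and Croatian text go through the same code path and end up as integers in the same ID space, with no grammar rules written in by anyone:

```python
def to_byte_ids(text: str) -> list[int]:
    """Map text to integer IDs using a hypothetical byte-level vocabulary (0-255)."""
    return list(text.encode("utf-8"))


# Both sentences are taken from the anthem example quoted earlier in the thread.
english = "You know my dear how we love your mountains"
croatian = "Mila kuda si planina"

for sentence in (english, croatian):
    ids = to_byte_ids(sentence)
    # Same function, same ID space: nothing here is specific to either language.
    print(len(ids), ids[:8])
```

Whatever the model ends up knowing about Croatian morphology has to be learned from the training text, not from the people who wrote the code.
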

> There is zero possibility that Google accomplished proper "language transfer" with the vast majority of Silicon Valley programmers being native English speakers.

This speaks to ignorance of who Google employs. A ton of the engineers there are immigrants. When I was on Google Photos in MTV, I'd estimate it was about evenly split between native, English-first speakers and people who were either non-native English speakers or grew up with two languages simultaneously (children of first-gen immigrants in the US).

Silicon Valley has a huge amount of cultural and ethnic diversity, so I don't know why you would make this mistake.

> There is zero possibility that Google accomplished proper "language transfer" with the vast majority of Silicon Valley programmers being native English speakers.

I don't know the people who worked on this project, but you do realise that Google employs swaths of programmers who are not native English speakers?