by not_a_terrorist 2885 days ago
For quality translation of complex, non-repetitive ideas, a human is, and always will be, required.

I have been in the translation business field for about 20 years now, so I have seen the rise of desktop tools, then server-based solutions, and lastly, of course, cloud-based tools. While each generation is better than the previous, the increments are decreasing.

We are based in Canada, where we receive around 300,000 new immigrants per year. That's 6 million over those 20 years. Guess what? Those people are hitting the job market, and now, their children too.

Consequence: the overall quality of English texts is noticeably decreasing, from ALL of our customers. I notice the names, and they are not your typical English, French, German, Italian or Ukrainian names (traditional settler groups in Canada). We do more and more of what is called "transcreation", a fancy way to say complete re-interpretation of a text. We basically extract the main ideas, and re-write the whole thing, because basing the translation on the provided source text always yields disappointing results.

I really don't see how machines could distill the ideas out of a text and reinterpret them in a nice way.

I can see AI work for short, simple, well-written texts. Heck, even long manuals with repetitive blocks of text from manual to manual. But with creative, complex text: still a looooong way baby!

14 comments

Technical solutions are only a workaround, not a proper solution. They address a problem that has little reason to exist anyway.

I think it's about time humanity decides to stop the ego trip and declare English as Earth's official language.

No need to kill other languages; all peoples are free to use them as much as they want.

But really, all __new__ documents, media, displays and pieces of information should be translated into English as well. And English should be mandatory at school, as well as a requirement for any administrative position.

I'm French. My country has a VERY strong view on protecting the local language. But promoting your culture doesn't have to be in contradiction with reuniting humanity.

Peace, democracy, exchange, cooperation, archiving, education: they are too hard to do in hundreds of languages. It's a waste of resources, and a hindrance to the most important challenges of humanity.

Esperanto never won, written Chinese is way too complicated and English is already everywhere.

Before trying to share the same money, or abolish borders, live in harmony or reach any ideal at all, we gotta take the big rocks off the road. Not being able to understand your neighbor is a terrible curse for our species, and easier to solve than war or famine. Actually it could be part of the solution to them.

And since those things take a long time, we better start now.

While having a trade language is quite a handy thing, for the reasons you enumerate, if you believe even a little bit in some degree of Linguistic Relativity, then there are significant consequences to reducing the general discourse, and the general thought-model, to a single homogeneous standard. It's not far off from attempting to solve race relationships (and gosh I'm not trying to be flippantly controversial here, I really do promise) by asserting that everyone should just be white. Yeah it'll address a lot of problems and reduce a lot of friction, but diversity does seem to be a strength of humanity's, not a weakness. Even if it does mean that the day to day is never as smooth as it could be if we were all just a little more the same.
I'm not worrying about killing diversity.

Languages are not static: English would evolve local features and variations, incorporate things from other languages, and adapt to specialized needs and creativity.

Some alternative languages would survive anyway. Some new ones may develop.

How do you think we got so many languages in the first place?

The important thing is that all people have a root base they can use to communicate: an official one that is formalized, used systematically and taught globally. Not that we seek to eradicate what goes beyond the root.

The important thing is that all can fill in the same administrative forms, understand each other's laws, debate ideas with equal strength, share books, consume remote news, discover other countries, etc.

I agree with you, but I think this makes more sense to someone in Europe where being multilingual is the norm. In the US, a lot of people only speak English which leads to at least two objections to this idea. First, and this is the objection I grew up with, many people see it as unfair to require everyone in the world to speak "our" language. And second, this is the one that concerns me more these days, is that being monolingual makes it harder to understand your neighbors even if they've learned the same language as you as a second language. In the US, we're extremely intolerant of people who have varying dialects or accents in English. In addition, if you don't speak a second language it's hard to be aware of which figures of speech and other idioms you may be using that make your own communication less clear.

Ultimately, I think we'll end up with English being the common language of humanity. We're already more than half way to that point in terms of geographical regions where one can communicate using English. And though it's not a perfect language, it's pretty well suited to wide scale usage since it's got a large vocabulary and relatively simple grammar. But I don't think having English as a universal language will be without problems.

> First, and this is the objection I grew up with, many people see it as unfair to require everyone in the world to speak "our" language.

Yes, it's unfair. But we have more important problems. If humanity ends up picking Chinese, I'll spend 10 years learning it. I don't care. A common language trumps those issues.

> And second, this is the one that concerns me more these days, is that being monolingual makes it harder to understand your neighbors even if they've learned the same language as you as a second language.

You mean harder than the current situation, where most people can't understand each other at all?

> Ultimately, I think we'll end up with English being the common language of humanity.

If we don't make it so, it's not certain to happen.

It's also too much of a slow process, and not enough of a formal one.

All those debates, all those arguments always ignore the big picture for some smaller local issues. Sometimes you gotta bite the bullet and move on.

I'd say pick anything, I really couldn't care less. Klingon if you want.

But English is just the most likely candidate to succeed, being used in business, diplomacy and tech.

What do you mean by "reuniting"? Was humanity ever more connected than it is now?
Réunir in French means to unite multiple groups. It's different from the English reunite which means unite again.
Thanks.

Even when speaking the same language, understanding each other can be hard. One more reason for better English support in the human operating system.

before Babel, perhaps
Ironically, the only things complicated about Chinese are its tonal pronunciation and its writing system. Its grammar and vocabulary are markedly easier than English.

Additionally, English has some horrific consonant/vowel clusters and minimal pairs.

The alphabet is a phenomenal invention. I mean the alphabet in the large sense, be it the Latin, Greek, Russian alphabet or any alphabet, abjad, abugida or syllabary. The Chinese writing system is a notable exception (along with Japanese Kanji and some others). The only complicated things about Chinese are its pronunciation and writing system. So half of it is complicated, if a language can be characterized at least by phonology, writing system, grammar and vocabulary. So Chinese is difficult. Not that any language is easy. There will never be an agreement for the whole world to speak Chinese.
The alphabet, and punctuation.

Omitting punctuation, or using some meta language for it (e.g., in Thai, repeating a word can mean "!"), makes reading some text extra difficult.

I'm struggling to think of minimal pairs that are particularly difficult. Maybe those involving liquids, for East Asian learners?

French's minimal pairs involving [u] vs. [y] are terrible. Beaucoup vs. beau cul, au dessous vs. au dessus, etc.

French, all in all, is a terrible language for ease of communication.

Don't get me wrong. It's a great language to write a novel in. Make an argument. Describe a country side.

But it's slow, convoluted and error prone.

All languages have weird stuff. Even English, while being incredibly easy, has many quirks, like in the joke: "Yes, English can be weird. It can be understood through tough thorough thought, though."

But French is more about learning exceptions among a few rules.

Agreed but you can't change any of this.
> I'm French. My country has a VERY strong view on protecting the local language. [...] Peace, democracy, exchange, cooperation, archiving, education: they are too hard to do in hundreds of languages.

I can't find where I originally read this, but wasn't the French obsession with standardizing their language largely a response to the pre-Revolutionary period, when France was a giant patchwork of divergent dialects and you'd be hard-pressed to understand someone from the next village over?

Yes, which funnily made Richelieu take measures to federate them all...
> I think it's about time humanity decides to stop the ego trip and declare English as Earth's official language.

Will the explanation be in English? This wouldn't work, of course. Would it be in a local language? That would be a self-defeating paradox. Not constructive at all.

Edit: I mean language has to be self-referential. And not just understandable, but understanding. How about you learn Spanish, Japanese or Arabic ... All of them, for inclusion's sake.

This never will happen. People will never agree to upending their culture and language, even in supposed interests of humanity. [Whether this would actually work is arguable].

> You can preserve culture and language, while simultaneously forcing everyone to learn English.

Nope, because culture and language are deeply intertwined. Over time, people would use their native languages less and less, and then entire cultural swathes of knowledge will be lost.

Next, no one will ever agree on one language. Not English, not Chinese, nor any made up language. Especially not an existing real language, for any number of reasons.

There are also concepts in different languages that are difficult to translate or grasp in other languages. Translation isn't a 1:1 rote task.

> I think it's about time humanity decides to stop the ego trip and declare English as Earth's official language.

There is no "ego trip" going on here. The only "ego trip" is assuming that we can simply force everyone, unilaterally, to speak X language.

> Peace, democracy, exchange, cooperation, archiving, education: they are too hard to do in hundreds of languages. It's a waste of resources, and a hindrance to the most important challenges of humanity.

[Citation needed that this is better than forcing 1 language on humanity, which will almost certainly only happen with supreme military force, aka wars.]

> written Chinese is way too complicated

Subjective.

---

Translation is a problem that we have to deal with, but it's better than trying to force one language. People and societies cannot be engineered with a hand-wavy solution of "oh, just 1 universal language".

This is a very stereotypical hacker news viewpoint of blithely trying to "engineer" life and humanity, as if it were so simple.

> This never will happen. People will never agree to upending their culture and language, even in supposed interests of humanity. [Whether this would actually work is arguable].

This worked for one country, so there are chances it works at a bigger scale.

However, I fail to see arguments serious enough to back up your "never", which is a pretty big word to leave unargued for somebody using "citation needed".

> Nope, because culture and language are deeply intertwined. Over time, people would use their native languages less and less, and then entire cultural swathes of knowledge will be lost.

That's not killing, that's letting die. Do you miss Latin? We are doing all right without it. But we can still read it if we need to.

The difference between killing and letting die is that people will stop using their languages over decades without feeling robbed of them, because they could still use them.

> Next, no one will ever agree on one language. Not English, not Chinese, nor any made up language. Especially not an existing real language, for any number of reasons.

Again with the huge, absolute assertions, without backup.

> There are also concepts in different languages that are difficult to translate or grasp in other languages. Translation isn't a 1:1 rote task.

Yes. Things are imperfect. We will lose things in the process. Guess what is also imperfect? Communicating at the scale of 7 billion people with different cultures, beliefs and needs.

> There is no "ego trip" going on here. The only "ego trip" is assuming that we can simply force everyone, unilaterally, to speak X language.

Absolute statements and a lack of arguments usually stem from strong emotional reactions more than from logic. So my guess is there is some ego there.

>> Peace, democracy, exchange, cooperation, archiving, education: they are too hard to do in hundreds of languages. It's a waste of resources, and a hindrance to the most important challenges of humanity.

> [Citation needed that this is better than forcing 1 language on humanity, which will almost certainly only happen with supreme military force, aka wars.]

Well, take 5 people who each speak 2 languages but share only one of them with any other member. And take 5 who all speak the same language. Put each team in a room and make them work on a project. Check which team finishes the task fastest.

>> written Chinese is way too complicated

> Subjective.

The fact that it takes 5 years for a Chinese speaker to be able to write English and 10 for an English speaker to learn Chinese is not subjective. Again, that's funny coming from somebody who is all emotional about this.

And I get it. I get that languages are an emotionally charged topic. But incomprehension is a problem hard enough when we speak the same language: see this very thread.

> This is a very stereotypical hacker news viewpoint of blithely trying to "engineer" life and humanity, as if it were so simple.

I don't know where you got from me that it was simple. Also, thinking I'm talking about engineering and not politics and sociology "is a very stereotypical hacker news viewpoint".

What citations do you want me to provide? This is purely theoretical discussion. Do I need to cite that after millennia, we still have different languages?

Do I really need to dig up some academic paper to acknowledge that humans find it hard to agree on standards? That doesn't even include the geopolitical implications of this, as if China would ever agree to make English the One Language that all government and business runs on, etc.

If you make a theoretical conjecture, I don't need to provide academic papers to provide a rebuttal. Please, treat academic papers with rigor, not as a fallback for when someone challenges you on, again, a theoretical conjecture. I also don't need to provide papers for basic human intercourse.

> This worked for one country, so there are chances it works at a bigger scale.

No, you cannot extrapolate based on one country. Human beings are irrational and proud. Again, look at it from a geopolitical view.

> Yes. Things are imperfect. We will lose things in the process. Guess what is also imperfect? Communicating at the scale of 7 billion people with different cultures, beliefs and needs.

Yes, different cultures, beliefs, and needs. All of which would be lost by -unilaterally- forcing one language, since reaching agreement won't happen. Nations are still figuring out how to solve their own issues, so why should a Korean person care to be forced to learn some random language? That already happened when Japan occupied Korea and forced Koreans to learn Japanese- why don't you read some history and tell me exactly how much Koreans liked that. [This also goes back to my previous statement about military domination being the only real way of forcing a language change.]

> The fact that it takes 5 years for a Chinese speaker to be able to write English and 10 for an English speaker to learn Chinese is not subjective.

[Citation needed again]. Of course English writing is easier to learn, it has a phonetic alphabet... However Chinese has much more simplified grammar than English. There is no subjectively "better" language, unless you specifically mean in 1 single aspect, maybe. But languages don't exist in vacuums, so this point is moot. (5/10 years is way off, also. This is anecdotal evidence as well, and years vary by each individual person.)

Discussing the merits of Chinese or any other language is really another discussion, but Chinese people do just fine.

> Again, that's funny coming from somebody who is all emotional about this.

No, this is coming from somebody responding to a shortsighted conjecture.

> Also, thinking I'm talking about engineering and not politics and sociology "is a very stereotypical hacker news viewpoint".

No, I don't think that you're talking about engineering. I'm specifically pointing out that you are treating a human and cultural issue from an engineering perspective, as if it's merely something that can be "fixed". It's a myopic viewpoint, because that's simply not how humans work.

Also in the industry and I have a few counterpoints.

1) That the general level of Canadian English is reducing due to immigrants is ludicrous. You have a significant burden of proof for statements like that.

2) Transcreation is usually understood to mean the translation of creative/marketing texts where cultural and local norms are taken into account (think slogans, taglines etc.). It does not typically mean that you re-write the source text, adding new ideas and concepts. All you are doing in that case is fixing a bad source text. Sometimes we get blamed for translating a bad source accurately, so it's a fine line. But it isn't transcreation.

3) "Creative, complex text" does not make up the majority of a commercial translation operation's workload. Literary translation is a fraction of a fraction of the global translation demand. AI/neural net machine translation works for many, many commercial texts. I'd go so far as to say it works for the majority in the major language pairs.

Essentially it is recombining known good, reliable human translations in grammatically sound ways, even using idiom correctly where appropriate. The quality of the output depends on the quality of the input, and as such machine translation can be limited by the corpus it draws from. For example, it draws on swathes of freely available legal and regulatory texts from the EU. These texts can use 'globalised language' at times, avoiding anything tricky to translate into the other languages that they are to be presented in, but that edges into a side issue of the globalisation of English/languages in general.

Overall, I hear a lot of this from the industry, "oh, it'll never work well enough to replace my job", but that's how you get blindsided. Accepting that it works well for major language pairs at this point is a good first step to accepting a changing industry.

That the industry is growing, and fast, may be down to any and all of a) increased awareness of translation's ROI, b) increased awareness of translation itself through the ever-present dream of ubiquitous, perfect babelfish-like translation and c) globalisation maturing into its final form. There's plenty of work for translators and LSPs for a good while yet, but spinning the line about how bad MT is does not help further the cause.

1) It's my impression, friend, I am not trying to propose a "New theory of bad English". I tell it like I experience it. We have customers in Vancouver and in some suburbs of Toronto from which some of the source texts are quite "interesting" to decipher (since you like political correctness, I will leave it at that). That type of text was nonexistent 20 years ago.

Maybe I am simply living in a giant selection bias bubble, who knows. And by the way, I am not saying that multi-generation Canadian citizens are better at English writing. Please let's not start a discussion about generalising group attributes, it never ends well, and you know it.

2) Well, I did not want to bore our readers, but your description of transcreation is excellent! Bottom line: we do much, much more text reinterpretation and transcreation these days (and our in-country customers love it!). Again, maybe it's just a selection bias bubble, simply a reflection of the direction our business has evolved.

3) Again, you are right. I will even go farther: the majority of customers do not give a rat's ass about text quality! However, I am in a business where quality and precision of the info is paramount, and AI, with its "best fit" or "most probable" translation, does not cut it.

4) We are on the same page, I also can see several use cases for AI translation. Your examples are excellent.

In conclusion: I am a translator who also happens to hold 3 STEM degrees/diplomas, so you can be sure I jumped on the AI bandwagon, and I keep following what's new from up close. But as far as I can see, it can only be some kind of tool for certain use cases. It's almost as if it were going to be a separate field of its own. yo!

I'm curious as to why automated translation seems to be so bad for Japanese<->English.

Is Japanese really such an outlier?

English <-> Chinese seems more reasonable. Translation between European languages I would guess is an almost solved problem, but how about between language groups in general?

It could be cultural, in that Japanese communication style can often be understated, with indirect implications pointing at a meaning without directly specifying it. That can vary greatly depending on the context and subject matter, though.
Ouch you tell me!! A couple of our customers use high tech products manufactured in China and Japan, for which the manuals were badly translated from the native language to English, and then we have to mop it up from English to French, for example. It is very labor intensive to decipher what the intent is and then to translate it into direct French sentences, in the style most readers are expecting. Thanks for pointing it out! Again, I doubt a machine could do that kind of work.
As a general rule it probably depends on what you're trying to translate. If it matches the corpus well, you'll do better. This is simplified, of course, as NMT is a bit of a black-box. But we know the GIGO concept still applies.

Perhaps Japanese has a smaller corpus to draw from. Perhaps less work has been done in that pair. Perhaps you're just not translating the right kind of text. It's hard to say.

Comparing Japanese to Chinese is not really apples to apples. Likewise for European languages. Sharing a writing system in any form (Latin/Roman or Chinese) doesn't really come into it.

> I really don't see how machines could distill the ideas out of a text and reinterpret them in a nice way.

Well, machines are already doing it with paintings: https://arxiv.org/pdf/1508.06576v2.pdf. There's no technical reason for not being able to achieve this sort of 'machine creativity' with texts. It's just a harder version of the same problem.

> I can see AI work for short, simple, well-written texts. Heck, even long manuals with repetitive blocks of text from manual to manual. But with creative, complex text: still a looooong way baby!

I (carefully, waiting for experts' feedback) disagree. ML works on data and algorithms: give a deep RNN a great lot of human made translations, time/cpu to produce a good model, and I think you would be astonished by the results.

Data's complexity is of little hindrance here, because the algorithm does not try to understand the data (as we humans like to think we're doing when reading), it's just trying to infer a solution based on learned behaviors. And it's 'just' ML; AI will come when your computer tells you that this specific text makes it sad, for example.

I think "good model" is the hard part here and very far from being achieved.
> While each generation is better than the previous, the increments are decreasing.

Diminishing returns. Getting 90% of the way there is easier than getting that last 10%. Because once you take care of all of the easy problems like simple word substitution and some of the more troublesome problems like basic syntax, you start finding yourself facing the really hard problems. Stuff like idioms, or words that carry multiple connotations in one language but lack them in another.

And also there's the issue that we currently don't have the skill to articulate what's missing. As a species. We haven't advanced our knowledge of our own thought processes to explain exactly how a certain bit of translated text just doesn't work. We know it doesn't, we know it should be like "this" instead, but we don't actually know why in a way we can describe to others.

But I think that's one of the things about AI tools, by developing them we get to learn about ourselves.

I remember when Google announced their recent updated translation engine and Swedish and Finnish translations got significantly worse despite the marketing messages. I don't see progress, rather regress, in the computer translation area at all but services with human translators are becoming more usable thanks to the Internet and I think that area will see progress in the future.
Speaking of Finnish: I think Finnish and Estonian both have a problem that can't even be solved. They don't have gendered pronouns, but proper English sentences require that information, which means that translations into English are weird-sounding, to say the least.
Yes, Finnish seems to be especially hard to translate. Google and Bing often produce English that is close to gibberish.
I wonder if this is getting easier as certain genres of English writing transition to using non-gendered pronouns.
The problem can sometimes be solved by inferring pronoun gender from the surrounding context.
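To make the "infer from context" idea concrete, here is a deliberately naive sketch. It assumes the surrounding context is already available in English and simply counts gendered cue words; the cue lists and the `infer_pronoun` name are made up for illustration, and a real MT system would rely on something far richer (coreference resolution, for instance). When the context gives no signal, it falls back to singular "they".

```python
import re

# Hypothetical cue lists, far from exhaustive; they exist only for this sketch.
FEMININE_CUES = {"mother", "sister", "daughter", "woman", "queen", "she", "her"}
MASCULINE_CUES = {"father", "brother", "son", "man", "king", "he", "him", "his"}


def infer_pronoun(context):
    """Guess an English pronoun for a gender-neutral source pronoun.

    Counts gendered cue words in the surrounding context and returns
    "she", "he", or "they" when the context is silent or ambiguous.
    """
    words = re.findall(r"[a-z]+", context.lower())
    feminine = sum(word in FEMININE_CUES for word in words)
    masculine = sum(word in MASCULINE_CUES for word in words)
    if feminine > masculine:
        return "she"
    if masculine > feminine:
        return "he"
    return "they"


print(infer_pronoun("My sister called yesterday."))
print(infer_pronoun("The doctor arrived late."))
```

The second call shows the "sometimes" part: nothing in the context says anything about gender, so the sketch can only hedge.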
> I have seen the rise of desktop tools, then server-based solutions, and lastly, of course, cloud-based tools. While each generation is better than the previous, the increments are decreasing.

Could you share what tools you are currently using?

I think OP is referring to the widely used CAT (computer-aided translation) tools that save every sentence you translate into a translation memory (XML file, known as TMX, usually) to be auto-inserted when those sentences (segments) re-occur. It can be a huge timesaver. Sometimes identical segments need different translations, so they still need checking.
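For anyone who hasn't seen one, the core of a translation memory can be sketched in a few lines. This is a toy under stated assumptions, not how any particular CAT tool works: the `TranslationMemory` class, the 0.75 threshold and the use of Python's `difflib` for fuzzy matching are all my own choices for illustration.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: source segments mapped to stored translations."""

    def __init__(self):
        self.segments = {}  # source segment -> stored translation

    def add(self, source, target):
        self.segments[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (translation, score) for the best match, or (None, 0.0)."""
        # Exact match: the segment can be auto-inserted (but still checked).
        if source in self.segments:
            return self.segments[source], 1.0
        # Otherwise look for the most similar stored segment (a "fuzzy match").
        best_target, best_score = None, 0.0
        for stored_source, stored_target in self.segments.items():
            score = SequenceMatcher(None, source, stored_source).ratio()
            if score > best_score:
                best_target, best_score = stored_target, score
        if best_score >= threshold:
            return best_target, best_score
        return None, 0.0


tm = TranslationMemory()
tm.add("Press the power button.", "Appuyez sur le bouton d'alimentation.")

exact = tm.lookup("Press the power button.")
fuzzy = tm.lookup("Press the power button twice.")
miss = tm.lookup("Clean the filter monthly.")
print(exact, fuzzy, miss)
```

The fuzzy hit is a suggestion for the translator to adapt, not a finished translation, which is why those segments still need checking.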

Most of these CAT tools integrate machine translation at some level, either through their own engines or APIs. This is traditionally to save the translator time on the simple segments (numbers and their formatting, lists of countries, place names etc.) but can also be good for avoiding multiple dictionary lookups in unfamiliar fields. Obviously professional translators should avoid working in fields they are unfamiliar with, but there are always new terms and technologies to contend with.

https://en.wikipedia.org/wiki/Computer-assisted_translation

As for the incremental improvements decreasing, that's a tricky statement to make with AI/MT creeping ever more into the software translators use daily. Might be more representative of OP not exploring all new features in the software they use.

Excellent answer, except your last paragraph!

I have used pretty much any and all cutting edge tools on the market. If anything, in some instances they create even more proofreading/reviewing work.

That being said, as I explained to another commenter above, I foresee several useful use-cases for MT (and there already are some!). I see them as mostly a separate field from human translation work, in parallel, with little overlap.

Thx for the pointer to TMX. I need to do some computer-aided translation, and was about to roll my own file format for parallel strings, but it's always better to choose something existing. Seems there are quite a few tools that support it too...
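For the parallel-strings case, a minimal TMX round trip might look like the sketch below, using only Python's standard `xml.etree.ElementTree`. The header values (`example-script`, `plaintext` and so on) are placeholders I picked, and the `build_tmx` / `read_tmx` helpers are hypothetical names; check the TMX specification before relying on the exact attribute set.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the xml:lang attribute with the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="fr"):
    """Serialize (source, target) string pairs as a minimal TMX document."""
    tmx = ET.Element("tmx", version="1.4")
    # Header values are placeholders chosen for this example.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def read_tmx(tmx_text):
    """Parse TMX text into a list of {language: segment} dictionaries."""
    root = ET.fromstring(tmx_text)
    return [
        {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")}
        for tu in root.iter("tu")
    ]


tmx_text = build_tmx([("Hello", "Bonjour"), ("Thank you", "Merci")])
units = read_tmx(tmx_text)
print(units)
```

Each `tu` holds one `tuv` per language, so adding a third language is just another `tuv` inside the same unit.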
Translation comes in a quality spectrum. On one end, you have simple semantic/word translators. As long as the original text is written simply, a user will be able to understand >50% of the meaning, whatever can be understood without syntax.

On the other end of the spectrum, translation is rewriting while trying to maintain the original tone, intent and other subtle aspects of meaning.

This is essentially writing and authorship, and not far from a Turing test.

In between, there's a lot of usefulness. It's very feasible to communicate via IM and a free translation service. It's possible to headline-read a paper.

I agree with your overall comment, but could you elaborate on the link between

> [Immigrants] are hitting the job market, and now, their children too.

and

> the overall quality of English texts is noticeably decreasing, from ALL of our customers

?

I am not trying to start a culture war or to pity the general English skills of recent immigrants and their family, far from it. They do their best, and we can only congratulate them for their efforts. I am not here to spew out racist generalisations, let's all stay very calm.

What I am saying is that we are seeing more and more locally produced Engrish, Chinaglish, RussianGlish, Penjabi-glish or who-knows-what-glish! And when I look at the name of the submitter of those texts, it's usually a typical recent immigrant name. But OF COURSE, we have Wilsons and Taylors who write in horrendous English too.

I also am not saying all immigrants are idiots and cannot write in English, some are truly excellent, just as much as your typical "native".

What I am saying is that, with volume, the occurrence of bad English is increasing. That's it. I can see how a very unorthodox English style could be a challenge for AI.

> hitting the job market, and now, their children too.

The immigrants are in play.

> the overall quality of English texts is noticeably decreasing

and they are changing the landscape beyond just economy.

> the overall quality of English texts is noticeably decreasing, from ALL of our customers.

Is it possible to qualitatively measure the quality of a text?

Yes, there are several grading systems. We don't use them. However, when I have to re-read a simple sentence or paragraph 4-5 times, and perhaps read the whole text or even search about a product in order to understand what they are trying to say, then it's a pretty clear indication the text is muddy!
Yeah, I think a pretty good job can be done with appropriate criteria. This is essentially what grading systems try to achieve - I think the IB program has wrestled a lot with this idea.
> For quality translation of complex, non-repetitive ideas, a human is, and always will be, required.

Bookmarked. Let's check back in twenty years.

Because - as a linguist, not a translator - I strongly disagree.

The quality of written text could be getting worse because the general population reads a lot of misspelled chat, SMS and text messages instead of correct English from proofread, quality sources such as books.

My understanding is that some translators use machine translation for bulk translation, and that professional human translators then use that output as a base for their own translation.

> For quality translation of complex, non-repetitive ideas, a human is, and always will be, required.

This is almost certainly not true.

What kind of document benefits from "transcreation"?
I used to work for a Language Service Provider (LSP). Typically it's used in marketing material, because a directly-translated slogan or message might not 'work' in another language or culture. It might just sound bad or clunky, or it could even be offensive in certain contexts.

As others have said, the resulting text should convey the same meaning in the same tone, but needn't be word-for-word correct.

A good example of this going wrong is described here, with the "spunk" debacle: http://www.todayifoundout.com/index.php/2016/05/time-coca-co...
I've also translated and yeah, it means going from scratch and becoming not only the translator but the writer of the text. You show the client, "is this what you meant?" And THEN you translate THAT.
I assume texts created by desktop tools, then server-based solutions, and lastly, of course, cloud-based tools.
Arguably impossible for AI, since sufficiently complex language can be Turing complete, and thus general understanding requires solving the halting problem.
There’s something in the human brain that can’t be replicated?
No, there is no reason to think that the human brain is capable of computing anything that isn't Church-Turing computable. Luckily no translation actually requires a "general solution" of the problem formally.
How do you know this is true?
Texts have finite length.
Can you expand the missing premise?

  1. Texts have finite length.
  2. ____
  3. Therefore, the halting problem does not impact machine translation.
Maybe there are examples of texts that can't be translated even by humans? Language Log has discussions about so-called words that can't be translated regularly and generally rejects that concept, but maybe whole texts are more difficult. The I-Ching has several translations that don't resemble each other very much and it's common knowledge that a translation can be either accurate or beautiful but rarely both.
If the human mind is non-physical, then it might be a halting oracle.
If the moon is made of cheese, then 1=2. What does non-physical even mean?
It is hard to say what the non-physical is. However, it can be detected. For example, if we could, say, solve the halting problem, then whatever part of us is doing the solving cannot be physical, since everything physical can be computed by a Turing machine.

So, in general the non-physical is defined negatively. The physical is defined by physical limitations, i.e. the laws of physics and computation, and thus anything that surpasses those limitations is non-physical.

There is no algorithm for checking whether a system can solve the halting problem or not; the reduction to the halting problem is quite simple. But that's beside the point.

If you found something that could solve the halting problem and you're lucky enough to be able to prove it for this special case, that would just mean that the assumption that everything physical can be computed by a Turing machine is wrong. It works the same way with anything else that we think of as "impossible". If you observed an apple falling towards the sky that doesn't mean that the apple is non-physical, it just points to a serious flaw in our understanding of gravity. Experiments trump all physical theories.

That makes the term 'physical' vacuous.

Technically, we already recognize the world is non-physical. Materialists defined materialism as things bumping into each other. Now we know there are things like field effects, attraction, and action at a distance. So there is a non-bumping substrate that the bumping things exist within. Thus, strict physicalism is already known to be false.

Once we have created human-level AI, changes in language translation will likely be a minor detail against the background of other changes in the world.
Your question sounds like you assume we have a perfect understanding of the brain. At best, our understanding is still very much peripheral. We have no idea how consciousness forms, even if we roughly know where it is located.
No, I didn’t assume we have a perfect understanding of the human brain, simply that nothing magical is happening and that we will eventually be able to better understand it.
Why don't you think there is something magical happening in the brain?
If solving the halting problem were necessary to understand language, humans wouldn't be able to understand language either.

I think that the confusion here is that you're conflating "comprehension" with "execution". You can write out a set of instructions in English that describe a non-halting algorithm, but I don't have to determine whether the program you've described halts to understand what the algorithm does. And I certainly wouldn't need to determine that in order to translate the instructions into Spanish.
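
The comprehension-versus-execution distinction can be sketched in code. The snippet below rewrites the human-readable string inside a program that never halts, without ever running that program. The one-entry `PHRASES` table is a hypothetical stand-in for a real translator, and `ast.unparse` needs Python 3.9 or later.

```python
import ast

# Hypothetical phrase table standing in for a real translation step.
PHRASES = {"still running": "sigue corriendo"}

# A program that never halts. It is only parsed here, never executed.
SOURCE = '''
while True:
    print("still running")
'''


class TranslateStrings(ast.NodeTransformer):
    """Replace known string literals with their translated counterparts."""

    def visit_Constant(self, node):
        if isinstance(node.value, str) and node.value in PHRASES:
            translated = ast.Constant(value=PHRASES[node.value])
            return ast.copy_location(translated, node)
        return node


def translate_source(source):
    # Parse the source, rewrite its strings, and turn the tree back into code.
    tree = TranslateStrings().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(translate_source(SOURCE))
```

The translated program still loops forever; whether it halts never had to be decided in order to translate it.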

You assume the human mind is only as powerful as a Turing machine, which is just conjecture.

In general, understanding what a program does requires knowing whether it halts. For example, what does this program do when given input another_program?

  # halts() is a hypothetical oracle; no computable implementation exists.
  if halts(another_program):
      print("program halts.")
  else:
      print("program does not halt.")
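
For context, the `halts` call in the snippet above cannot be implemented in general, and the standard diagonal argument can be sketched as follows. The names `make_paradox` and `always_no` are illustrative assumptions: given any candidate `halts`, one can build a program that does the opposite of whatever the candidate predicts about it.

```python
def make_paradox(halts):
    """Build a program that contradicts the given candidate halts()."""

    def paradox():
        if halts(paradox):
            # The candidate said "halts", so loop forever instead.
            while True:
                pass
        # The candidate said "does not halt", so halt immediately.
        return "halted"

    return paradox


def always_no(program):
    # One (wrong) candidate oracle: claims that no program ever halts.
    return False


paradox = make_paradox(always_no)
# always_no predicted that paradox never halts, yet it returns right away.
print(always_no(paradox), paradox())
```

Only the "does not halt" branch is safe to run; a candidate that answers True for `paradox` would send it into the infinite loop, which is wrong in the other direction.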
>You assume the human mind is only as powerful as a Turing machine, which is just conjecture.

Given the lack of any proposals for how, even in theory, hypercomputation might be achieved in a physical system, and the lack of any empirical indication that humans are capable of hypercomputation, I would say it's much more than a conjecture. At the very least, it's an incredibly likely conjecture.

> For example, what does this program do when given input another_program?

It prints "program halts." if another_program halts, and prints "program does not halt." otherwise. I understand exactly what this program does, even though I can't tell you its output for an arbitrary input. And if I want to translate the above sentence into Spanish, I don't have to come anywhere near solving the halting problem to do so.

Consider the following program which does not involve the halting problem:

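    function (targetDigest) {
      // Try every integer in increasing order until one hashes to the target.
      i = new BigInt(0);
      while(true) {
        if (sha1(i.toBytes()) == targetDigest) {
          print(i.toBinary());
          return;  // stop at the first (shortest) match
        }
        i = i.plus(1);
      }
    }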
What does this program do when given targetDigest? It prints out the shortest bit string whose SHA-1 is that targetDigest. I obviously can't tell you what specific answer it will output for a given input, but it would be absurd to say that I don't understand the program.
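
A bounded Python sketch of the same search, for readers who want to run it. The byte encoding in `int_to_bytes` and the `max_tries` cap are assumptions added here so the example terminates; the pseudocode above specifies neither.

```python
import hashlib


def int_to_bytes(i):
    # Assumed encoding: minimal big-endian bytes, with 0 encoded as b"\x00".
    return i.to_bytes(max(1, (i.bit_length() + 7) // 8), "big")


def find_preimage(target_digest, max_tries):
    """Return the smallest i < max_tries whose SHA-1 hex digest matches."""
    for i in range(max_tries):
        if hashlib.sha1(int_to_bytes(i)).hexdigest() == target_digest:
            return i
    # Give up instead of looping forever like the unbounded version.
    return None


# Build a target with a known answer so the search succeeds quickly.
target = hashlib.sha1(int_to_bytes(300)).hexdigest()
print(find_preimage(target, 1000))
```

For an arbitrary digest the unbounded search is hopeless in practice, which is the point being made: what the program does is clear even though its output cannot be predicted.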

But even if you can come up with sentences that require one to solve the halting problem to translate (and I'm still not convinced such sentences exist), it's another thing entirely to claim that those sentences come up in practice. This is a red herring when it comes to the feasibility of machine translation, because a good translator doesn't need to be able to cope with every valid sentence. It merely needs to cope with the set of sentences that humans actually care about.

My point was that, given a specific another_program, you cannot say exactly what the program will do. On the other hand, every program is some kind of Turing computation, so you could argue that by knowing what a Turing machine is you understand every program. However, that is not so useful.

If we knew everything that comes up in practice, translation and natural language processing would be a solved problem. Without this knowledge, we need to look at the general picture, which is not solvable. So there is no automated way to understand all language; we always have to rely on a human in the loop to provide domain knowledge.

No, again, you’re conflating comprehension and execution. It is very clear that you don’t have to execute a set of instructions in order to translate them into another language.

So unless you can give a specific example of a sentence that requires one to solve the halting problem to translate, I don’t think you’re saying anything meaningful. And if you can come up with such an example which people would actually use in practice, I’ll eat my hat.