	by mortimerp9 936 days ago
	Hi, I work on Seamless. What this refers to is added toxicity mitigation. We try to detect the level of toxicity in the input and make sure that the output toxicity level is not higher. This protects the model from making egregious errors in the translation. There are more details in the paper if you want them, and the mitigation code is all open source if you want to check what it actually does.
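
As a rough illustration of the check described above (not the actual Seamless code), here is a minimal sketch that counts word-list matches on each side and flags a translation whose count exceeds the input's. The word lists, language codes, and function names are all hypothetical placeholders.

```python
import re

# Hypothetical toy word lists keyed by language code (assumption for
# illustration only; the real lists are curated per language).
TOXIC_WORDS = {
    "eng": {"bitch"},
    "deu": {"scheiße"},
}

def toxicity_level(text, lang):
    """Count the tokens of `text` that appear in the word list for `lang`."""
    tokens = re.findall(r"\w+", text.lower())
    return sum(1 for tok in tokens if tok in TOXIC_WORDS[lang])

def has_added_toxicity(source, src_lang, output, tgt_lang):
    """True when the output contains more list matches than the input."""
    return toxicity_level(output, tgt_lang) > toxicity_level(source, src_lang)

print(has_added_toxicity("Ich liebe dich", "deu", "I love you, bitch", "eng"))
print(has_added_toxicity("Ich liebe dich", "deu", "I love you", "eng"))
```

The first call flags the translation because a listed word appears only in the output; the second does not.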

5 comments

Reubend 936 days ago

That's an awesome feature. I think one of the worst possible outcomes of machine translation is something that ends up being accidentally offensive, and this is a smart way to mitigate that.

fl7305 936 days ago

> one of the worst possible outcomes of machine translation is something that ends up being accidentally offensive

The Hitchhiker's Guide To The Galaxy claims the opposite:

"Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation."

SoftTalker 936 days ago

Or maybe we'll finally come around to the idea that being offended by words doesn't make a lot of sense.

hiatus 936 days ago

This will happen at the same time we stop being uplifted by words, or moved by them, or brought to tears by them, or fall in love over them.

madeofpalk 935 days ago

I'm sure you can understand why translating "I love you" to "I love you, bitch" is probably undesirable.

dontupvoteme 936 days ago

How do you account for colloquial (non-English) language which could be naively misconstrued as toxic?

e.g. "geil" (either cool or horny depending on usage) in German

It's not fundamentally different than e.g. "wicked" in English, but the biggest bias that potentially all these ML models exhibit is predisposition towards Anglophoneism

mortimerp9 936 days ago

Our goal is good recall, sometimes at the expense of precision, so words with multiple meanings may be flagged as toxic even when, in the context they are actually used in, they are not. The toxicity mitigation algorithm searches for alternative translations that keep the correct meaning but avoid the potentially toxic word, so that no toxicity is added to the output. This means that the model might sometimes prefer a less colloquial phrasing than a human would.
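
To make the "search for alternative translations" idea concrete, here is one simple way it could be sketched: rerank a list of candidate translations and keep the best-scoring one that adds no toxicity. This is a stand-in technique chosen for illustration, not a description of the actual mitigation code; the word list, scores, and function names are hypothetical.

```python
import re

# Hypothetical toy target-language word list (assumption, not the curated list).
TARGET_TOXIC_WORDS = {"bitch"}

def count_toxic(text, toxic_words):
    """Count the tokens of `text` that appear in `toxic_words`."""
    return sum(tok in toxic_words for tok in re.findall(r"\w+", text.lower()))

def pick_translation(candidates, source_toxicity, toxic_words=TARGET_TOXIC_WORDS):
    """Pick the best-scoring candidate that adds no toxicity.

    `candidates` is a list of (text, model_score) pairs; higher is better.
    Falls back to the overall best candidate if every candidate adds toxicity.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    for text, _score in ranked:
        if count_toxic(text, toxic_words) <= source_toxicity:
            return text
    return ranked[0][0]

# Made-up candidates and scores for an input with no toxic words.
candidates = [
    ("I love you, bitch", -0.9),
    ("I love you", -1.1),
]
print(pick_translation(candidates, source_toxicity=0))
```

With `source_toxicity=1` the same call would keep the top-scoring candidate, since it would no longer count as added toxicity.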

You can find details on how the multi-language toxicity lists were created in section 7.3 of the NLLB paper: https://arxiv.org/pdf/2207.04672.pdf. TL;DR: it's not just a translation of a base English list. Even though we started from that, each language has a curated list that was built by professional translators.

dontupvoteme 936 days ago

That's significantly less myopic than I pessimistically assumed. Thanks!

novok 936 days ago

Is there a way to turn it off? If you're translating an R-rated movie with criminals who swear a lot, is it possible to get output without the toxicity filtering, to make sure it's being translated properly?

mortimerp9 936 days ago

It only kicks in if the output is more "toxic" than the input. If the input has a lot of swear words and the output has the same amount, then it will be left alone.
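
A tiny sketch of that rule as stated, working on match counts only. The function name and the counts are made up for illustration.

```python
def needs_mitigation(input_toxic_count, output_toxic_count):
    """Return True only when the output is more toxic than the input."""
    return output_toxic_count > input_toxic_count

# Hypothetical (input, output) counts for three translated lines of a script.
cases = {
    "profane line, profanity kept": (3, 3),
    "clean line, profanity added": (0, 1),
    "profane line, profanity dropped": (3, 1),
}
for label, (src, out) in cases.items():
    print(f"{label}: mitigate={needs_mitigation(src, out)}")
```

Only the second case triggers. Under this rule a translation that loses swear words is not flagged either.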

thomastjeffery 936 days ago

What about the inverse?

Can it make sure that the output toxicity level is not lower than the input?

If not (which I strongly suspect is the case), then that is unacceptable. We cannot fight toxic narratives with ignorance.

Domenic_S 936 days ago

> What this refers to is added toxicity mitigation.

Oh, well that clears it up! </snark>

I don't see any definition of 'toxicity' on the landing page - it seems to be one of those 'I know it when I (hear) it' kind of words... unless there's some widely-accepted definition in this area of study?

mortimerp9 936 days ago

Sorry if I wasn't clear, internally we've been talking about it a lot, but I forgot that it doesn't have such a solid definition outside of our work. Thankfully, we try to define it in section 7.3 of the NLLB paper: https://arxiv.org/pdf/2207.04672.pdf

The TL;DR is that if you say "Thank you for this job offer." you wouldn't want it to be (mis)translated as "Go F*k yourself." But if you do say "Go F*k yourself", you still want it to be translated as that.
