by shadowgovt 2216 days ago
"Who at Google decided to censor American comments on American videos hosted in America by an American platform that is already banned in China?"

Probably no individual. There are enough Chinese pro-nationalists using YouTube to generate a noticeable signal if they all, whether independently out of political conviction or as an organized brigade, decide to start flagging posts. Once the flagging begins, the relative rarity of the characters in question combined with the flagging signal would generate a Bayesian prior that comments containing the word tend to get flagged, and the system would preemptively start killing those comments.

This is one of the ways to train an automatic moderation system that is capable of discovering novel words the community decides are swears, and brigading is a known pathology that those systems are susceptible to.
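To make that concrete, here's a rough sketch of how a per-token flag rate could end up condemning a rare word. This is purely illustrative and assumes a very naive model; the function names, the smoothing constants, and the 0.5 threshold are all made up, and nothing here is known to reflect how YouTube's system actually works.

```python
from collections import Counter

def token_flag_rates(comments, prior_flagged=1.0, prior_ok=20.0):
    """Estimate a smoothed P(flagged | token) from (text, was_flagged) pairs.

    prior_flagged / prior_ok are hypothetical pseudo-counts that pull
    tokens with little data toward a low flag rate.
    """
    seen = Counter()
    flagged = Counter()
    for text, was_flagged in comments:
        # Count each token at most once per comment.
        for token in set(text.lower().split()):
            seen[token] += 1
            if was_flagged:
                flagged[token] += 1
    return {
        token: (flagged[token] + prior_flagged)
        / (seen[token] + prior_flagged + prior_ok)
        for token in seen
    }

def would_auto_remove(text, rates, threshold=0.5):
    """Naive rule: remove if any token's learned flag rate exceeds the threshold."""
    return any(rates.get(token, 0.0) > threshold
               for token in set(text.lower().split()))

# Toy data: a common word that is occasionally flagged, and a rare word
# that a brigade flags almost every time it appears.
history = (
    [("great video", False)] * 970
    + [("great video", True)] * 30
    + [("rareword", True)] * 55
    + [("rareword", False)] * 5
)
rates = token_flag_rates(history)
print(would_auto_remove("great video", rates))
print(would_auto_remove("this is rareword", rates))
```

The point is that it only takes a few dozen flags to dominate the statistics for a word that barely shows up otherwise, while the same number of flags on a common word disappears into the noise.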

4 comments

While I think this is exactly what's happening, it demonstrates that we need more human-in-the-loop interaction to verify the decisions of the machine learning algorithms. I don't think anyone who works in the space would disagree that this method would be highly susceptible to trolling. I mean, look what happened to Tay[0]. You have to have some mechanism where humans are checking on how the system is learning.
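As a sketch of what I mean by a human check (hypothetical names and thresholds throughout, not a description of any real moderation pipeline): anything the model newly learns to dislike gets held for review instead of being enforced automatically.

```python
from dataclasses import dataclass

@dataclass
class LearnedToken:
    token: str
    flag_rate: float              # learned estimate of P(flagged | token)
    human_approved: bool = False  # set only after a reviewer signs off

def decide(entry: LearnedToken, rate_threshold: float = 0.5) -> str:
    """Return the action for a comment containing this token."""
    if entry.flag_rate <= rate_threshold:
        return "allow"
    if entry.human_approved:
        return "auto_remove"
    # The model thinks the token is bad, but nobody has checked why.
    return "hold_for_review"

print(decide(LearnedToken("fine_word", 0.03)))
print(decide(LearnedToken("brigaded_word", 0.70)))
print(decide(LearnedToken("confirmed_swear", 0.90, human_approved=True)))
```

This costs reviewer time, but it means a brigade can at most fill the review queue rather than directly change what gets removed.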

The big question is: was this a recent effort to flag these phrases or was it a gradual thing? If it is the former, I think it is easy to forgive Google as things move fast. If it is the latter I think it brings questions about fundamental methodologies.

I am being intentionally ambiguous about what is being classified because there are similar complaints about other subjects so I want to generalize.

[0] https://www.theverge.com/2016/3/24/11297050/tay-microsoft-ch...

> There are enough Chinese pro-nationalists using YouTube

And they aren’t any more nationalistic than Americans. Reporting this slogan is the equivalent of, say, an American reporting “Drumpf is Hitler” or “Hussein Obama”. These are all dumb slogans which are spammed just to get a political reaction. Different people get offended about different things, that’s just something you have to deal with in a big community.

If there are so many Chinese people using YouTube their signal would behave no differently than people from other nationalities, no?
It's not so much a function of how many Chinese people are using it as how many instances of the word being posted result in a comment being flagged.

Intersections such as "The word is rarely used, but when it is used it happens in a political setting where someone is more likely to decide to hit the flag button" would train an ML algorithm that the word is unwelcome in general.

Couldn't you just have bot accounts that search YT comments for key phrases and flag those? Enough flags would simply result in auto-removal, regardless of YT's decisions on the acceptability of these words/phrases. This wouldn't be very hard to set up either.
No, no it would not. ;)

One challenge is that Google's actually got some pretty solid signal to find and kill bots. But it's not impossible to botnet their services; just harder than doing it to the average online service that doesn't have an army of engineers who've trained on the adversarial space of people trying to automate ad clicks for real-money revenue.

This sounds like a reasonable conclusion to me. Does anyone here work for YouTube who can confirm/deny this?