by maksimum 2624 days ago
> Facebook has no clue what kind of stuff gets posted on their platform, and their AI isn't powerful enough to detect it.

Almost certainly Facebook's ML can identify the themes of almost all content on their platform with high accuracy. That's a pretty easy problem at their scale.


What makes you say it's an easy problem? They were able to block 1.2 million copycat videos of the shooting, but 0.3 million still got through. YouTube also struggled to remove them and even went so far as to disable uploads entirely for a period because their AI was insufficient. The difference is that YouTube declined to release specific numbers, so they took less flak.

When you literally have thousands of humans all determined to bypass these detection systems by modifying the video and disguising it in various ways, some of them are inevitably going to slip through. Is that really an issue with the detection system, or is there something wrong with the people who repeatedly try to upload and share videos of mass shootings?

>is there something wrong with the people who repeatedly try and upload and share videos of mass shootings?

Yes. A detection system can be used to limit occurrences of this issue.

In this case, the detection system used on Facebook demonstrated a 20% failure rate: 0.3 million of the 1.5 million copycat uploads got through.

Facebook (and Google and others) can easily identify themes in natural language both written and spoken. They're really good at those problems because they're used as inputs for ad sales.

Identifying themes in video is harder than identifying them in natural language. But fundamentally it requires the same kind of ML tools as natural language, which these companies have already mastered. I think the bigger issue is that Facebook doesn't have a business purpose for understanding video as compelling as the one it has for understanding natural language. They also don't have a business purpose for building scalable content censoring workflows since there's no serious regulation.

I don't know how you came to these conclusions, but they're fundamentally incorrect. Video has been one of Facebook's fastest growing advertisement categories in recent years, particularly after their acquisition of Instagram. They absolutely have a business purpose for understanding video, and if they didn't, they wouldn't have invested so heavily in it with features like livestreaming. Furthermore, I don't know where the idea of government regulation as an incentive came from in this thread, but it's illogical and ridiculous. Facebook doesn't need an incentive to try to prevent objectionable content like mass murders from appearing on their platform because they're well aware of the damage it can cause their brand. That's incentive enough.
Yes, they can identify themes, but not with nearly enough accuracy to make policy enforcement decisions. IIRC the preemptive video blocks were based on fingerprinting previous video uploads. They're still a long way from being able to automate policy enforcement dynamically.
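
For anyone wondering what "fingerprinting previous uploads" can look like, here is a toy Python sketch. To be clear, this is not Facebook's actual system: the average-hash approach, the function names, the 4x4 "frame", and the distance threshold are all assumptions picked for illustration. It only shows why matching against known uploads catches light edits but misses heavier ones.

```python
def average_hash(frame):
    """Fingerprint a grayscale frame: 1 where a pixel is brighter than the frame's mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Count the positions where two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def matches_known(frame, known_hashes, max_distance=2):
    """True if the frame's fingerprint is close to any previously seen fingerprint.

    max_distance is an arbitrary illustrative threshold, not a real-world value.
    """
    h = average_hash(frame)
    return any(hamming(h, k) <= max_distance for k in known_hashes)


# A made-up 4x4 grayscale frame standing in for a frame of a known video.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 225],
    [11, 215, 20, 235],
]
known = [average_hash(original)]

# Light edit: uniform brightening leaves the bright/dark pattern unchanged.
brightened = [[min(255, p + 20) for p in row] for row in original]

# Heavier edit: mirroring the frame flips every bit of this particular pattern.
mirrored = [row[::-1] for row in original]

print(matches_known(brightened, known))
print(matches_known(mirrored, known))
```

In this toy case the brightened copy still matches while the mirrored one does not, which is the cat-and-mouse problem with re-uploads: a fingerprint only recognizes copies close to something already seen, and it says nothing about new footage.
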
Would sample size be an issue? An algorithm can recognize that a video is about someone discussing the latest Marvel movie because there are a lot of those types of videos.

But can an algorithm recognize that a video is of someone committing a real-life atrocity? Those are comparatively rare.

It’s clear you have very little understanding of what is feasible with our current state of knowledge.
Google pretty much undeniably has the best ML people in the world, and I see nonstop hate directed at YouTube for their automated filters' false positives, so I don't think it is an easy problem.
Why would AI be able to stop any of this? I doubt even humans could make the right decisions, so why would a computer be able to?

The usage of AI as a magic silver bullet always frustrates me.

It seems like the only way to profitably scale this type of service from millions to a billion users, so it ends up being the go-to technical excuse for the people who think our moral standards should budge for the technical feasibility of maintaining these insanely large social media setups, like it's some sort of inherent right to be able to do that and turn a profit.

In a perfect world, where everyone is liable for the material they host on their own servers, services with this many users wouldn't exist in the first place.