by ncallaway 1186 days ago
> Computers are sometimes likened to "magic" where we just cast a spell by writing code, and something technically beautiful happens, but where magic can interpret intent and act according to the caster's desire, computers can't, and there's no way to "Wrath of God" the spammers.

I somewhat agree with this, but I feel like there's some very low-hanging fruit that should be available.

When posting in comments on a video there should be:

- An avatar similarity check. If you have an avatar that is too similar to the avatar of the channel whose video you're commenting on, your post automatically goes into a moderation queue (or just remove avatars from comments, except from the channel).

- A name similarity check. If you have a name that is too similar to the channel name, your post automatically goes into a moderation queue.

- A huge indication that a comment comes from the channel author. They have some indications of this now, but it should be very prominent, so it's *obvious* when comments don't come from the channel author.

None of these things are going to be trivial to implement for a company like YouTube, but this has been a problem for years at this point. These things could have been done by now.

4 comments

Formatting author comments to have red background and white text is, like, an intern project.
Abuse is a very difficult problem that Google spends a lot of money and talent on. There’s no silver bullet like avatar/name similarity or other forms of detection because the scammer can trivially iterate until they avoid automated detection.

I think the indicator that a comment comes from the author is pretty noticeable, but it could be better. It's a tough trade-off to make between UX and fighting abuse.

The reality is that there are criminal groups who spend great resources to scam people online, and sometimes they figure out a clever way around enough mitigations that they can completely hose a platform. There's not a lot a platform can do against this kind of attack except detect and react.

I see it as an international crime issue where certain countries are indifferent to Americans or even their own citizens being scammed. This would be a very different problem if Google could simply pass their info on the scammer over to a competent law enforcement agency. It would be a lot riskier for scammers, and they'd have to put a lot of effort into evading detection. Definitely a pipe dream though.

> There’s no silver bullet like avatar/name similarity or other forms of detection because the scammer can trivially iterate until they avoid automated detection.

That's a cop-out, though.

It's true that they're not going to be able to stop every single scammer, but that doesn't mean they can't raise the barrier to entry enough that a significant percentage of scammers find that it's no longer worth it.

As for the specific measures the GP suggested—those are absolutely things Google has the resources to do. Image and text similarity are things they deal with all the time, and if they can make it effectively impossible (barring occasional random false negatives) for scammers who attempt to impersonate the author using these methods to get through without a human double-checking, that would be a huge blow to their ability to fool people. It's not like if you're clever enough you can, say, have an avatar that shows one thing to the bot-check systems and another thing to users.

> There’s no silver bullet like avatar/name similarity or other forms of detection because the scammer can trivially iterate until they avoid automated detection.

I agree that there's no silver bullet, but there need to be a hundred small changes that each increase the amount of work or decrease the scammer conversion rate until their ROI is materially harmed.

Name similarity checks have a few benefits:

- Would be quite easy to implement

- Increases the scammers' work which reduces their ROI

- Reduces the conversion rate (because instead of a spam message coming from "Tolarian Community Collage" it now comes from "Tulrin Community Collge"; the more the spammer has to iterate on the name, the less believable it becomes)
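To make "quite easy to implement" concrete, here is a rough sketch of what a name similarity check could look like. Everything in it is an assumption for illustration: the function names, the 0.8 threshold, the use of Python's standard `difflib`, and the channel name "Tolarian Community College" (the thread only gives the impersonator spellings). A real system would also need to handle lookalike characters from other scripts, which this does not.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Fold case, split accents off their base letters, and keep only letters/digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed.casefold() if ch.isalnum())


def needs_review(commenter: str, channel: str, threshold: float = 0.8) -> bool:
    """Return True if the commenter's name is close enough to the channel's
    name that the comment should be held for moderation.

    The threshold is a made-up starting point, not a tuned value.
    """
    ratio = SequenceMatcher(None, normalize(commenter), normalize(channel)).ratio()
    return ratio >= threshold


if __name__ == "__main__":
    channel = "Tolarian Community College"  # assumed real channel name
    for commenter in ("Tolarian Community Collage", "Jane Doe"):
        print(commenter, "->", needs_review(commenter, channel))
```

Lowering the threshold catches more mangled names at the cost of more false positives in the moderation queue, which is the same trade-off described above: the further the scammer has to drift from the real name, the less convincing the comment is.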

Bonus points if you add to the moderation queue a "Rejected because this was an impersonation attempt", and now the avatar/name of that commenter goes onto the similarity detection checks for that channel.
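A minimal sketch of that feedback loop, assuming an in-memory store keyed by a hypothetical `channel_id`. The class name, method names, and the 0.85 threshold are all invented for illustration; a real platform would persist this and would apply the same idea to avatar hashes as well as names.

```python
from collections import defaultdict
from difflib import SequenceMatcher


class ImpersonationList:
    """Per-channel record of names a moderator rejected as impersonation."""

    def __init__(self, threshold: float = 0.85):
        self.threshold = threshold  # assumed value, not tuned
        self._rejected = defaultdict(set)  # channel_id -> casefolded names

    def record_rejection(self, channel_id: str, commenter_name: str) -> None:
        """Called when a moderator picks 'rejected: impersonation attempt'."""
        self._rejected[channel_id].add(commenter_name.casefold())

    def matches_known_impersonator(self, channel_id: str, commenter_name: str) -> bool:
        """True if the name is close to one already rejected for this channel."""
        candidate = commenter_name.casefold()
        return any(
            SequenceMatcher(None, candidate, known).ratio() >= self.threshold
            for known in self._rejected.get(channel_id, ())
        )
```

Each rejection then makes the next small variation of the same fake name land in the queue too, so the scammer has to start over with a noticeably different name instead of tweaking one letter.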

I agree with you. However, I'd point out a usability issue:

> A huge indication that a comment comes from the channel author.

This pattern requires that users have seen the indicator before. While many users will have, the users most likely to fall for this scam probably have not, so they wouldn't know it exists.

e.g. a user who never really checks comments, but then checks them one day and sees the scammer, might not realize there would be an indicator if the comment were from the video's author.

I agree, which is why I would prioritize the other fixes as well. Still, these kinds of things (once broadly learned by the community) can have a significant impact.

Yes, it won't be 100%, but it would reduce the conversion rate of the scammers. That's ultimately the solution. No single measure will ever fix this, and if you wait to implement something until you can solve the entire problem space, you'll never get started.

The solution is a hundred different changes that each reduce the scammers' conversion rate by 1-2%.
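As a back-of-the-envelope check on how that adds up, assume (this is my assumption, not something measured) that each change independently cuts the conversion rate by the same fraction, so the effects multiply:

```python
def remaining_conversion(num_measures: int, reduction_per_measure: float) -> float:
    """Fraction of the original conversion rate left after stacking
    `num_measures` independent changes that each remove
    `reduction_per_measure` (e.g. 0.01 for 1%) of what remains."""
    return (1.0 - reduction_per_measure) ** num_measures


if __name__ == "__main__":
    for reduction in (0.01, 0.02):
        left = remaining_conversion(100, reduction)
        print(f"100 changes at {reduction:.0%} each leave {left:.1%} of conversions")
```

Under that assumption, a hundred 1% reductions leave roughly 37% of the original conversions, and a hundred 2% reductions leave roughly 13%. Real measures overlap and won't be independent, so treat this as an illustration of the compounding, not a forecast.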

There are definitely ways to combat this, but I think they're expensive. An avatar comparability check at the scale of YouTube would be immense. Running each comment through some machine learning algo would be immense.

> An avatar comparability check at the scale of YouTube would be immense.

I don't think it has to be. For each user you should already have the hash data computed (using something like imagehash this works out to a hash of 8 bytes per image, though you can obviously tune this up depending on storage/performance requirements).

Each time a comment is posted, you would do a distance measure between the commenter's avatar hash and the channel avatar hash. Mixed in with the network latency and DB I/O operations, I think this additional read/write that only needs to occur when a comment is posted could be done with a pretty minimal additional compute overhead.
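To show how small the per-comment work is, here is a dependency-free sketch. It does not call imagehash; instead it hand-rolls a simple 64-bit average hash over an 8x8 grid of grayscale values, assuming the avatar has already been downscaled to that size when it was uploaded. The function names and the `max_distance=8` cutoff are assumptions for illustration.

```python
def average_hash(pixels) -> int:
    """64-bit average hash of an 8x8 grid of grayscale values (0-255).

    Each bit is 1 if the pixel is brighter than the grid's mean.
    This would be computed once per avatar upload and stored (8 bytes).
    """
    flat = [value for row in pixels for value in row]
    if len(flat) != 64:
        raise ValueError("expected an 8x8 grid of grayscale values")
    mean = sum(flat) / 64
    bits = 0
    for value in flat:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_too_similar(commenter_hash: int, channel_hash: int, max_distance: int = 8) -> bool:
    """Per-comment check: one XOR and a bit count on two stored integers."""
    return hamming_distance(commenter_hash, channel_hash) <= max_distance
```

The expensive part (decoding and downscaling the image) happens once per avatar change, not once per comment. At comment time the check is a comparison of two 8-byte values that are likely being loaded anyway to render the comment.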

That said, if Google doesn't have the compute overhead to do it, I gave an alternative. Simply don't display avatars from commenters other than the channel author.