by ants_everywhere 953 days ago
This is true. From a purely technical perspective, I don't think toxicity is an entirely intractable problem. The combination of moderators, tooling, and AI should eventually make it possible to scale moderation to a level that is acceptable for most users most of the time.

But I do think there will be a cat-and-mouse game, as tools to evade moderation also get more advanced and are perhaps only revealed when the moderation tools are needed most. That's where it's nice to have the resources of a large corporation to invest in being proactive about threats.

Agreed. Though I think an important point here is that moderation has the potential to be a lot more personal on Fediverse instances, as the ratio of moderators to users is a lot higher than on traditional social media outlets (Facebook, Twitter).

Maybe AI has a place in moderation? I've always wondered why it isn't used more; if you give it an adequate training phase (a year?), it may be able to quickly identify and flag malicious content -- of course, you wouldn't want to ban whoever the AI tells you to, just use it as one of the signals for whether content could potentially be harmful.

Yes, I think that's exactly right. In the ideal case you would want to moderate with a personal touch at the start and then use automation to scale that personality as needed.

I was thinking of AI in moderation mainly as a source of signals: looking for synchronized activity or standard canned tactics and surfacing them as signals or alerts to be looked at by humans. Basically, make it easier to combat coordination and scale that are hard for any one moderator to see.
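
As a rough illustration of the "synchronized activity" signal, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `Post` record, the `synchronized_signals` function, the five-minute window, and the three-account threshold are hypothetical, and matching on normalized text is only one simple stand-in for whatever a real system would use. The output is a list of alerts for a human to look at, not a ban list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    # Hypothetical record shape, assumed for this sketch.
    account: str
    text: str
    timestamp: float  # seconds since epoch


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits don't hide copies."""
    return " ".join(text.lower().split())


def synchronized_signals(posts, window_seconds=300.0, min_accounts=3):
    """Return alerts for texts posted by many distinct accounts close in time.

    The alerts are signals for a human moderator to review, not decisions.
    """
    by_text = defaultdict(list)
    for post in posts:
        by_text[normalize(post.text)].append(post)

    alerts = []
    for text, group in by_text.items():
        group.sort(key=lambda p: p.timestamp)
        start = 0
        best = set()
        for end in range(len(group)):
            # Slide the window start forward until it spans at most window_seconds.
            while group[end].timestamp - group[start].timestamp > window_seconds:
                start += 1
            accounts = {p.account for p in group[start:end + 1]}
            if len(accounts) > len(best):
                best = accounts
        if len(best) >= min_accounts:
            alerts.append({"text": text, "accounts": sorted(best)})
    return alerts
```

With these assumed defaults, three different accounts posting the same message within five minutes would raise one alert, while one account repeating itself would not.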

There is also the client-side scanning stuff that Apple and others have experimented with. Basically, try to warn users before they do something, so that they're at least aware of the guidelines, and leave it up to them whether to proceed.

> I was thinking of AI in moderation mainly as a source of signals: looking for synchronized activity or standard canned tactics and surfacing them as signals or alerts to be looked at by humans. Basically, make it easier to combat coordination and scale that are hard for any one moderator to see.

100%. It would be great if that could become public, too -- perhaps moderators could contribute to the model, maybe even automatically, through the software.

Though I'm unsure how you would prevent bias from entering the model. I feel like AI isn't used much in solutions such as these because you can't inspect what it's been trained on (e.g. if a right-leaning instance uses it, it may be biased against left-leaning content) or how the final model reacts to content. (Maybe there's a way to achieve this; I'm not an AI expert by any means.)

It seems like it would be less transparent than the (arguably not great) solution we have now: shared blocklists.

Anyway, on the whole it would be great if we could take advantage of technology to reduce the administrative work required to host a public Mastodon / AP instance -- if we could achieve that, it would most likely lead to more instances.

> It would be great if that could become public, too

Absolutely, that would be awesome

> Though I'm unsure as to how you would prevent bias from entering the model.

My only thought here is that something like the Hacker News model works pretty well (at least in theory). You would focus on norms of communication rather than on the content being expressed.

You'd still get some bias: for example, one community may prefer very deferential communication, while another might value frankness. But presumably nobody likes screaming or brigading. I think you're less likely to get left/right-style biases if you focus on the quality of the communication rather than its content.

An approach like this would still miss important things. For example, you can say very toxic things in a civil voice. So you'd likely have to combine different orthogonal signals to have any sort of guarantee that your site isn't slowly drifting into a place where people know how to consistently violate the rules while evading detection.
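
To make "combine different orthogonal signals" a bit more concrete, here is one possible sketch in Python using a noisy-OR combination, which treats each signal as an independent score between 0 and 1 and asks how likely it is that at least one of them is right. This is just one way to do it, not the way; the function names, the signal names in the usage note, and the 0.7 threshold are all assumptions made up for illustration.

```python
def combine_signals(scores):
    """Noisy-OR: 1 minus the chance that every (assumed independent) signal is wrong."""
    remaining = 1.0
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"signal {name!r} must be in [0, 1], got {score}")
        remaining *= 1.0 - score
    return 1.0 - remaining


def needs_review(scores, threshold=0.7):
    """Flag content for human review; the threshold is an arbitrary assumption."""
    return combine_signals(scores) >= threshold
```

For instance, `needs_review({"tone": 0.1, "content": 0.9})` would flag a post that is civil in tone but scores high on a content signal, which is the "toxic things in a civil voice" case. The design choice is that a single strong signal is enough to surface something, while several weak ones can also add up.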