by tentacleuno
953 days ago
> I was thinking of AI in moderation mainly as signals. Like looking for synchronized activity or standard canned tactics and just surface them as signals or alerts to be looked at by humans. Basically make it easier to combat coordination and scale that is hard for any one moderator to see.
100%. It would be great if that could become public, too -- perhaps moderators could contribute to the model, maybe even automatically, through the software. Though I'm unsure as to how you would prevent bias from entering the model.
I feel like AI isn't used much in solutions such as these because you can't inspect what it's been trained on (e.g. if a right-leaning instance trains it, it may be biased against left-leaning content), or how the final model reacts to content. (Maybe there's a way to achieve this, not an AI expert by any means.) It seems like it would be less transparent than the (arguably not great) solution we have now: shared blocklists.
Anyway, on the whole it would be great if we could take advantage of technology to reduce the administrative work required to host a public Mastodon / AP instance -- if we could achieve it, that reduction would most likely lead to more instances.
Absolutely, that would be awesome.
> Though I'm unsure as to how you would prevent bias from entering the model.
My only thought here is that something like the Hacker News model works pretty well (at least in theory). You would focus on norms of communication rather than on the content being expressed.
You'd still get some bias: one community may prefer very deferential communication, while another might value frankness. But presumably nobody likes screaming or brigading. I think you're less likely to get left/right style biases if you focus on the quality of the communication rather than its content.
An approach like this would still miss important things. For example, you can say very toxic things in a civil voice. So you'd likely have to combine different orthogonal signals to have any sort of guarantee that your site isn't slowly drifting into a place where people know how to consistently violate the rules while evading detection.
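To make the "signals, not verdicts" idea a bit more concrete, here's a rough Python sketch of one such signal (several accounts posting near-identical text in a short window) plus a trivial way to merge it with other signals into a per-account list for a human to look at. Everything here is made up for illustration: the `Post` type, the function names, the 60-second window and the 3-account minimum are all assumptions, not anything Mastodon or any moderation tool actually ships.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    # Hypothetical minimal post record, for illustration only.
    account: str
    text: str
    timestamp: float  # seconds since some epoch


def synchronized_groups(posts, window_seconds=60.0, min_accounts=3):
    """Flag texts posted by >= min_accounts distinct accounts within one window.

    The thresholds are arbitrary assumptions; a real instance would tune them.
    """
    by_text = defaultdict(list)
    for post in posts:
        # Crude normalization: lowercase and collapse whitespace.
        key = " ".join(post.text.lower().split())
        by_text[key].append(post)

    flagged = []
    for key, group in by_text.items():
        group.sort(key=lambda p: p.timestamp)
        start = 0
        best = set()
        for end in range(len(group)):
            # Shrink the window until it spans at most window_seconds.
            while group[end].timestamp - group[start].timestamp > window_seconds:
                start += 1
            accounts = {p.account for p in group[start:end + 1]}
            if len(accounts) > len(best):
                best = accounts
        if len(best) >= min_accounts:
            flagged.append((key, sorted(best)))
    return flagged


def combine_signals(signals):
    """Map each account to the names of the signals that flagged it.

    `signals` is a dict of signal name -> set of accounts. Nothing is
    auto-actioned; the result is only meant for human review.
    """
    hits = defaultdict(list)
    for name, accounts in signals.items():
        for account in accounts:
            hits[account].append(name)
    return dict(hits)
```

An account that shows up under several independent signals (say, synchronized posting and a separate civility check) is a stronger candidate for review than one flagged by a single signal, which is roughly the "combine orthogonal signals" point above. Exact-text matching is obviously easy to evade; it's only meant to show the shape of the thing.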