Hacker News
by Tenobrus 2144 days ago
Does anyone know how I can filter out non-English posts from Mastodon? I set up an account and followed some people back in 2017, but it's unusable for me now since I can't read 70% of the content in any of my timelines.
2 comments

In your account Preferences, under "Other" there's a checklist where you can specify what languages you want to see on timelines. If you don't select any, there's no filtering.
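To make that rule concrete, here is a minimal sketch of it, not Mastodon's actual code. The `Post` type and `filter_timeline` function are hypothetical names, and keeping posts whose language is unknown is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Post:
    text: str
    language: Optional[str]  # ISO 639-1 code such as "en", or None if unknown


def filter_timeline(posts: Iterable[Post], chosen: Set[str]) -> List[Post]:
    """Keep only posts in the chosen languages; an empty choice disables filtering."""
    posts = list(posts)
    if not chosen:
        # Nothing ticked in the checklist: no filtering at all.
        return posts
    # Assumption of this sketch: posts with no known language are kept.
    return [p for p in posts if p.language is None or p.language in chosen]


timeline = [Post("hello", "en"), Post("hallo", "de"), Post("???", None)]
print([p.text for p in filter_timeline(timeline, {"en"})])
```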
It never really worked, though.
It does work, but not 100% of the time, which makes it feel like it doesn't. Language detection is far from perfectly accurate (we use the CLD3 library), so it sometimes reports the wrong language. People can also select the language they post in themselves; when that's done in good faith it's more reliable than language detection, but of course it can also lead to some posts not being in the language they're labeled as. Using language filters cuts down on a lot of posts though, even if some slip through.
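A rough sketch of how the two sources of a post's language could be combined, assuming the author's own choice takes priority over detection. `effective_language` is a hypothetical helper, and `detect` is a stand-in for whatever detector is used (such as CLD3); no real CLD3 call is shown.

```python
from typing import Callable, Optional


def effective_language(
    text: str,
    declared: Optional[str],
    detect: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the language a filter would see for a post."""
    if declared:
        # Assumption: an explicit choice by the author wins, even if it is wrong.
        return declared
    # Otherwise fall back to automatic detection, which can also be wrong.
    return detect(text)


def fake_detect(text: str) -> Optional[str]:
    # Hypothetical stand-in detector, for illustration only.
    return "de" if "ist" in text.lower().split() else None


# A German post mislabeled as English still passes an English-only filter.
print(effective_language("Das ist gut", "en", fake_detect))
```

Either path can put the wrong label on a post, which is how some posts slip through the filter.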
I'm on a few international instances, and German and French posts have always popped up even though I only have English checked. Could it be that the instances themselves are faulty here?

I guess it makes sense that the filter would be confused by tech-based talk, as it includes a lot of English loan-words, though it's a bit irritating when browsing public timelines.

Language detection is more accurate on longer texts; short messages produce a lot of false positives, and many English loan-words would indeed only make the situation worse.
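A toy illustration of that effect, assuming a detector that just counts known words per language. This is not how CLD3 works; the word lists and the `guess_language` function are made up for the example.

```python
import re
from typing import Dict, Set

# Tiny illustrative vocabularies; a real detector uses far richer statistics.
COMMON_WORDS: Dict[str, Set[str]] = {
    "en": {"the", "is", "and", "new", "update", "server", "release", "bug"},
    "de": {"der", "die", "das", "ist", "und", "nicht", "ich", "wir", "heute"},
}


def guess_language(text: str) -> str:
    """Pick the language whose vocabulary matches the most words in the text."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    hits = {
        lang: sum(1 for w in words if w in vocab)
        for lang, vocab in COMMON_WORDS.items()
    }
    best = max(hits, key=hits.get)
    # "und" is the ISO 639 code for an undetermined language.
    return best if hits[best] > 0 else "und"


short_post = "Neues Server Update, Bug Release"
long_post = (
    "Wir haben heute das neue Server Update installiert "
    "und der Bug ist nicht mehr da"
)
print(guess_language(short_post))  # loan-words dominate the short post
print(guess_language(long_post))   # enough German words to outweigh them
```

In the short post the English loan-words are the only evidence available, so it is labeled English; the longer post contains enough ordinary German words to outvote them.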
What is really missing is a "Translate" button under non-English posts.
Who do you give the message to for translation? Big G? Babel?
It's only a matter of time before free, self-hosted neural networks trained for translation appear.
Discussions leading to this comment might be useful: https://news.ycombinator.com/item?id=24007390

Basically, a real translation requires a conscious machine, or at least one closely resembling it.

Translating some common words would at least tell me which toots might be interesting enough to manually feed to a state-of-the-art translator.
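Something like this rough sketch is what I have in mind: a word-for-word gloss from a small dictionary, leaving everything else untouched. The `GLOSSARY` entries and the `gloss` function are hypothetical, and a real glossary would need to be far larger.

```python
import re
from typing import Dict

# Hypothetical German-to-English glossary of common words.
GLOSSARY: Dict[str, str] = {
    "und": "and",
    "nicht": "not",
    "heute": "today",
    "neue": "new",
    "fehler": "bug",
}


def gloss(text: str, glossary: Dict[str, str]) -> str:
    """Replace known words with a bracketed translation; keep the rest as-is."""

    def swap(match: "re.Match[str]") -> str:
        word = match.group(0)
        translated = glossary.get(word.lower())
        return f"[{translated}]" if translated else word

    return re.sub(r"\w+", swap, text)


print(gloss("Heute neue Fehler und Server", GLOSSARY))
```

It won't produce a readable sentence, but it would be enough to decide whether a toot is worth pasting into a proper translator.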
I agree that perfect translation might require GAI, but don't let perfect be the enemy of good. An AI just needs to be better than existing approaches to be somewhat useful.
I'd be down for Big G. I'd expect some kind of plugin either user-side (I choose) or server-side (the instance chooses), though.