by amadsen 1967 days ago
My vision is good, and I also don't know whether the word you quoted was supposed to be English, French or Irish. The point being, a screen reader has no more or less information than a human reader.

It might be better, then, to invest effort in improving language inference heuristics, which would help in all applications (PDF documents, text files), rather than trying to build support into each underlying protocol.

4 comments

But the screen reader needs to say the word, and the pronunciation of "coin" isn't the same in every language. I don't know about Irish, but it's radically different between French and English: the only sound in common is the "k".

This means that if the screen reader picks the wrong language, the user has to guess the spelling from the way it's pronounced in order to understand the text.

Though I'll concede that right now on the web it isn't much better; but that's a failing of publishing tools, not the format.
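
To make the "coin" problem concrete, here is a toy sketch of a word-overlap language guesser. The word lists and the `guess` function are made up for illustration and are not taken from any real detector; the only point is that a single shared word gives a heuristic nothing to work with, while a longer phrase does.

```python
# Toy vocabularies, invented for this example. A real detector would use
# much larger models, but a lone shared word is ambiguous either way.
WORDS = {
    "en": {"the", "a", "coin", "in", "of", "is"},
    "fr": {"le", "la", "un", "coin", "dans", "de", "est"},
}


def guess(text):
    """Return the set of languages tied for the highest word-overlap score."""
    tokens = text.lower().split()
    scores = {
        lang: sum(token in vocab for token in tokens)
        for lang, vocab in WORDS.items()
    }
    best = max(scores.values())
    return {lang for lang, score in scores.items() if score == best}


print(guess("coin"))               # tie between "en" and "fr"
print(guess("le coin de la rue"))  # enough context to pick "fr"
```

With only the quoted word to go on, the guesser returns both languages, which is the situation the screen reader is in.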

Seems like the simplest fix is to allow the user to ask the screen reader how a word is spelled.
It's not unthinkable that that's how a human would read a foreign language too.

> a screen reader has no more or less information than a human reader.

Indeed, so why not let the human writer clarify it?

Why do you prefer a resource-intensive guessing algorithm that will never reach 100% accuracy over an annotation that takes less than 10 bytes and is trivial to parse?

In my opinion there's no reason to even consider the first option, especially with Gemini's focus on simplicity in mind. This "what can another little JS library hurt" attitude is what led to Gemini in the first place.

Add language inference heuristics to cover all kinds of formats and now you've got a dependency. Then some people need different settings so you add config files and parsers/dependencies for those. At some point you update the model and now it can't properly differentiate Mandarin, Cantonese and short Japanese phrases anymore. You start adding cross-platform support and other features, and now some people say it's too slow on their Raspberry Pi Zero setup. Bloggers complain as they have to rearrange some quotes via trial-and-error so the heuristics pick up the correct language. Unfortunately that makes it worse for people running the older version 0.7.

After this rant it should be obvious, but: I prefer just adding a ["en"] or similar and not worrying about it anymore.
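
As a rough illustration of "trivial to parse", here is a minimal sketch that reads such a tag from the start of a line. The `["en"]` prefix syntax, the `split_lang` name and the default of `"en"` are all assumptions made for this example; none of them come from the Gemini spec.

```python
import re

# Hypothetical tag syntax: a line may begin with ["xx"] or ["xx-YY"].
# This is an assumption for illustration, not an existing Gemini feature.
TAG = re.compile(r'^\["([A-Za-z]{2,3}(?:-[A-Za-z0-9]{2,8})*)"\]\s*')


def split_lang(line, default="en"):
    """Return (language, text) for one line, falling back to a default."""
    match = TAG.match(line)
    if match is None:
        return default, line
    return match.group(1).lower(), line[match.end():]


print(split_lang('["fr"] coin'))  # ('fr', 'coin')
print(split_lang('coin'))         # ('en', 'coin')
```

The `["fr"]` tag is 6 bytes, and the whole parser is one regular expression with no model or config file to ship.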

PDF actually has support for language tagging, and using it is a recommended part of any guide to accessibility for the visually impaired.

It may be that the language in which a quoted foreign-language word is meant to be read was mentioned far back in the text. Sighted human readers will know how to read it, but computers won't yet (and perhaps won't for many years, if not decades). However, there are visually impaired people now who need to be guaranteed the same Gemini experience as sighted users.

But some Unicode characters are supposed to be rendered differently depending on the language of the text that contains them. So language tagging is required even to get the text rendered correctly.