Hacker News
by Y_Y 966 days ago
Sounds like a problem with the screen reader. If screen readers are supposed to replace the part where the user interprets a glyph on the screen, then they should act like a human would and interpret something that looks like a semicolon as a semicolon, even if it's a Greek question mark (except at the end of a Greek question).

(Do any of them have an OCR layer? The context sensitivity might be more challenging, but probably can be specialised to common cases or LLM-magicked away.)
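
For the semicolon case specifically, no OCR should be needed: U+037E GREEK QUESTION MARK has a canonical mapping to U+003B, so ordinary Unicode normalization already folds it to a plain semicolon. A minimal Python sketch of that, where `describe` is just a helper name made up for illustration (note that folding this way also throws away the "end of a Greek question" context):

```python
import unicodedata

GREEK_QUESTION_MARK = "\u037e"  # renders like ';'

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# NFC applies canonical mappings, including the singleton U+037E -> U+003B.
folded = unicodedata.normalize("NFC", GREEK_QUESTION_MARK)

print(describe(GREEK_QUESTION_MARK))  # U+037E GREEK QUESTION MARK
print(describe(folded))               # U+003B SEMICOLON
```

Whether any given screen reader normalizes text before speaking it is a separate question that this sketch doesn't answer.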

2 comments

The problem is that these uses of unusual Unicode characters are memes: new ones appear constantly, and it would be hard for any screen reader to keep up. And there are plenty of imaginative uses of letters from Indian scripts, from hieroglyphics, etc., where the screen reader can't be expected to recognize what is intended as easily as a human being would.
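
A rough illustration of why a fixed rule only gets you part of the way, as a Python sketch: NFKC normalization folds the "fancy font" math letters back to ASCII, but it does nothing for letters borrowed from another script. The Cyrillic look-alikes here are my own stand-in for the more exotic scripts mentioned above, and `fold` is a made-up helper name.

```python
import unicodedata

def fold(text: str) -> str:
    """Apply NFKC, which replaces compatibility characters with plain equivalents."""
    return unicodedata.normalize("NFKC", text)

# "Hello" in MATHEMATICAL BOLD letters: these carry a compatibility mapping to ASCII.
styled = "\U0001D407\U0001D41E\U0001D425\U0001D425\U0001D428"

# "Hello" with Cyrillic look-alikes for H, e and o: distinct letters, no mapping.
lookalike = "\u041d\u0435ll\u043e"

print(fold(styled) == "Hello")     # True
print(fold(lookalike) == "Hello")  # False
```

Anything past the first case needs some table of "looks like" relationships, which is exactly the part that has to keep up with whatever people invent next.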
It isn't (just) a problem of the screen reader.

This is hacking Unicode to do things that Unicode isn't supposed to do.

<List> <Item> element ....

Tells me that this is a list of things.

Reusing a Unicode character that just happens to look like a dot doesn't give the screen reader any context. It doesn't see a fancy bullet point; it sees U184638 'libyan double sigilled C' or whatever.

You're then relying on the hearer to know that U184638 looks like a fancy bullet point.
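
A toy model of the difference, in Python. This is not how any real screen reader is implemented; `announce_text`, `announce_list` and the dict standing in for list markup are all invented for the example, and U+25C6 BLACK DIAMOND is just a convenient decorative character.

```python
import unicodedata

# Hypothetical post that fakes a list with a decorative character (U+25C6).
fake_list = "\u25c6 apples\n\u25c6 pears"

# The same content with the structure stated explicitly, as list markup would.
real_list = {"role": "list", "items": ["apples", "pears"]}

def announce_text(text: str) -> list[str]:
    """Naive reading of plain text: a leading symbol is spoken by its Unicode name."""
    spoken = []
    for line in text.splitlines():
        marker, _, rest = line.partition(" ")
        spoken.append(f"{unicodedata.name(marker).lower()} {rest}")
    return spoken

def announce_list(node: dict) -> list[str]:
    """Reading of structured content: the role says it is a list, so say so."""
    items = node["items"]
    return [f"list, {len(items)} items"] + items

print(announce_text(fake_list))  # ['black diamond apples', 'black diamond pears']
print(announce_list(real_list))  # ['list, 2 items', 'apples', 'pears']
```

In the first case the listener has to work out that "black diamond" meant "next item"; in the second the structure is part of the data.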