Hacker News
by amelius 2869 days ago
With modern ML, that is not true. For example, a browser could easily run OCR on the frame-buffer and read it out loud.

In fact, the approach would be more general, since a lot of text is already embedded in images on a lot of websites (think also about ads).

3 comments

In other words: "Print the file, send the printed sheets by fax to the remote office, once received, OCR them and save to a file"

Not to mention that there's no such thing as "easily run OCR", considering how many weird fonts there are, and that they could be rendered at such a small size that even humans have difficulty telling them apart.

If you look at any platform’s accessibility APIs, there’s a lot more to it than just exposing text.

The simplest example: how do you indicate that a graphic or piece of text is a button or link? How do I set the alternate text if it’s a graphical button?
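
To make that concrete, here is a rough Python sketch of the kind of information an accessibility tree carries beyond raw text. The `AccessibleNode` class, its field names, and the `announce` helper are all made up for illustration and are not any platform's real API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessibleNode:
    """Hypothetical accessibility node; field names are illustrative only."""
    role: str                     # e.g. "button", "link", "image", "text"
    name: Optional[str] = None    # accessible name, such as alternate text
    visible_text: str = ""        # what OCR could recover from the pixels


def announce(node: AccessibleNode) -> str:
    """Build the phrase a screen reader might speak for this node."""
    label = node.name or node.visible_text or "unlabelled"
    if node.role == "text":
        return label
    return f"{label}, {node.role}"


# A graphical button with no visible text: OCR alone yields nothing useful,
# but the role and the alternate text still describe it.
icon_button = AccessibleNode(role="button", name="Search")
print(announce(icon_button))
```

The point is that `role` and `name` have to come from the page author or the toolkit; neither can be read off the frame-buffer.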

The render engine still has to do that. But that isn't much different from how it works now, where the website has to provide the necessary information.

Besides, nobody says we can't have a protocol for some additional accessibility information.
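
As a very rough sketch of what such a protocol could look like, assume the renderer ships a JSON message next to each frame that annotates rectangles with a role and a label. The message layout and the `describe_at` helper below are invented for illustration, not an existing standard.

```python
import json
from typing import Optional

# Hypothetical side-channel message: each entry annotates one rectangle
# (left, top, right, bottom) of the rendered frame.
MESSAGE = json.dumps({
    "version": 1,
    "regions": [
        {"rect": [10, 10, 42, 42], "role": "button", "label": "Search"},
        {"rect": [60, 10, 200, 30], "role": "link", "label": "Home page"},
    ],
})


def describe_at(message: str, x: int, y: int) -> Optional[str]:
    """Return "label, role" for the first region containing (x, y), else None."""
    for region in json.loads(message)["regions"]:
        left, top, right, bottom = region["rect"]
        if left <= x < right and top <= y < bottom:
            return f'{region["label"]}, {region["role"]}'
    return None


print(describe_at(MESSAGE, 20, 20))
```

A screen reader could then answer "what is under the cursor?" without guessing from pixels.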

And then somebody says "how about we abstract this into some kind of markup language?" and we've come full circle.
A common markup language would be great! As long as we use it in a way that keeps us far away from compatibility problems.
But you have to make it extendable because you can't predict every possible use-case. So some kind of Extensible Markup Language, perhaps.
Maybe we make a sort of 'hyper' extension that comes with all kinds of baked-in behaviours, so that we don't have to implement those behaviours ourselves when we deserialise it.
You only get away from compatibility problems if either:

1. No-one uses it

2. There is only 1 implementation, and it is never updated

> For example, a browser could easily run OCR on the frame-buffer and read it out loud.

And then we wonder where all the software bloat comes from.