Hacker News
by p0w3n3d 207 days ago
That's nice, however I'm concerned with people with sight impairment who use read aloud mechanisms. This might render sites inaccessible for them. Also I guess this can be removed somehow with de-obfuscation tools that would be included shortly into the bots' agents
2 comments

you are correct. This makes text almost completely unreadable using screen readers.
I just cracked open macOS VoiceOver for the first time in a while and hoo boy, you weren't kidding. I wonder if you could still "stun" an LLM with this technique while also using some aria-* attributes so the original text isn't so incredibly hostile to screen readers. Regardless, I think as neat as this tool is, it's an awful pattern and hopefully no one uses it except as part of bot capture stuff.
Do screen readers fall back to OCR by now? I could imagine that being critical based on the large amount of text in raster images (often used for bad reasons) on the Internet alone.
no, but they have handling for unknown symbols and either read aloud a substitute or read the text letter by letter. both suck.
Sounds like a potentially useful improvement then.

I've had more success exporting text from some PDFs (not scanned pages, but just text typeset using some extremely cursed process that breaks accessibility) that way than via "normal" PDF-to-text methods.

no, it is not. simple ocr is slow and much more expensive than an api call to the given process. on the positive side, it is also error prone and cannot follow the focus in real time. no, adding ai does not make it better. AI is useful when everything else fails and it is worth waiting 10 seconds for an incomplete and partially hallucinated screen description.
> simple ocr is slow

Huh? Running a powerful LLM over a screenshot can take longer, but for example macOS's/iOS's default "extract text" feature has been pretty much instant for me.

> Also I guess this can be removed somehow with de-obfuscation tools that would be included shortly into the bots' agents

It can. At the end of the day, it can be processed and corrected. The issue kinda sucks, because there is apparently a lot built on top of Unicode, but there are days I think we should raze it all to the ground and only allow minimal ASCII. No invisible chars beyond \r\n, no emojis, no zero-width stuff (and whatever else Unicode cooked up lately).
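
For what it's worth, here is a rough sketch of what that "processed and corrected" step could look like. It assumes the obfuscation leans on invisible format characters (zero-width spaces, joiners, BOMs) and compatibility look-alikes such as fullwidth or ligature forms; I don't know what the tool in question actually emits, so treat `deobfuscate` and `to_minimal_ascii` as hypothetical helpers, not a working bypass. Cross-script homoglyphs (say, Cyrillic "а" for Latin "a") would survive this and need a separate mapping.

```python
import unicodedata


def deobfuscate(text: str) -> str:
    """Fold look-alike forms and drop invisible characters (hypothetical helper)."""
    # NFKC maps compatibility characters (fullwidth letters, ligatures,
    # mathematical alphanumerics) to their plain equivalents.
    folded = unicodedata.normalize("NFKC", text)
    kept = []
    for ch in folded:
        if ch in "\r\n":
            # Keep line breaks, as in the "nothing invisible beyond \r\n" rule.
            kept.append(ch)
        elif unicodedata.category(ch) in ("Cf", "Cc"):
            # Cf = format characters (zero-width space/joiners, BOM, soft hyphen),
            # Cc = control characters. Both are dropped.
            continue
        else:
            kept.append(ch)
    return "".join(kept)


def to_minimal_ascii(text: str) -> str:
    """The scorched-earth variant: discard everything that is not ASCII."""
    return deobfuscate(text).encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    sample = "he\u200bl\u200dlo \uff57orld\ufeff"
    print(deobfuscate(sample))
```

The ASCII-only variant also throws away legitimate non-English text, which is roughly the tradeoff the "raze it all" idea would force on everyone.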