by vivekseth 2102 days ago
There’s a little more nuance than that. Even if the text is drawn from plaintext data, there’s no guarantee that the characters or words appear in the correct order or have the proper whitespace between them.
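
To make the ordering problem concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the `Fragment` type, the coordinates, and the `line_tolerance` / `space_gap` thresholds are assumptions, not anything a real PDF library hands you in this shape. It only shows why concatenating fragments in draw order can produce garbage, and how a position-based heuristic might recover reading order.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical positioned text run, as a PDF might draw it."""
    x: float      # left edge, in points
    y: float      # baseline, measured from the top of the page
    width: float  # horizontal extent of the drawn text
    text: str


def reading_order(fragments, line_tolerance=2.0, space_gap=1.5):
    """Group fragments into lines by baseline, sort each line left to
    right, and insert a space wherever the horizontal gap is wide enough.
    The thresholds are guesses; real documents need tuning."""
    lines = []
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(lines[-1][0].y - frag.y) <= line_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])

    out = []
    for line in lines:
        line.sort(key=lambda f: f.x)
        parts = [line[0].text]
        for prev, cur in zip(line, line[1:]):
            gap = cur.x - (prev.x + prev.width)
            if gap >= space_gap:
                parts.append(" ")
            parts.append(cur.text)
        out.append("".join(parts))
    return "\n".join(out)


# Fragments listed in the (invented) order they appear in the content stream.
fragments = [
    Fragment(x=40, y=10, width=25, text="world"),
    Fragment(x=10, y=10, width=15, text="Hel"),
    Fragment(x=25, y=10, width=10, text="lo"),
    Fragment(x=45, y=24, width=18, text="line"),
    Fragment(x=10, y=24, width=30, text="Second"),
]

naive = "".join(f.text for f in fragments)  # draw order, no spacing
ordered = reading_order(fragments)          # position-based heuristic
```

A heuristic like this falls apart on multi-column layouts, rotated text, and tightly kerned fonts, which is part of why extraction is harder than it looks.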

The best method is probably to render the PDF and use OCR.
Unfortunately that's obnoxiously inefficient if you're trying to run it through text-to-speech in real time.