by marcell 654 days ago
Good question. I haven't actually tried the image capture approach yet. I'll give it a shot and see how it performs. I'm planning to try many different AI extractors and see which one performs best.

So far, I've done some unscientific testing to compare text vs. HTML. Text is a lot more effective on a per-token basis, and therefore cheaper. However, some data is only available in the HTML.
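
To make the trade-off concrete, here's a rough sketch in Python (standard library only). It isn't the actual extraction pipeline. The sample markup and the helper names (`TextExtractor`, `html_to_text`) are made up for illustration, and character count is only a crude stand-in for token count. It strips tags to get the visible text, then shows that the text is much smaller but drops the link URL that lives in an attribute.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Hypothetical helper: reduce an HTML snippet to its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Made-up sample page fragment, for illustration only.
SAMPLE_HTML = (
    '<div class="product">'
    '<a href="https://example.com/item/42">Blue Widget</a>'
    '<span class="price">$19.99</span>'
    "<script>track()</script>"
    "</div>"
)

if __name__ == "__main__":
    text = html_to_text(SAMPLE_HTML)
    # Character counts as a crude proxy for token counts (an assumption).
    print(len(SAMPLE_HTML), "chars of HTML vs.", len(text), "chars of text")
    print("URL survives in text:", "example.com" in text)
```

The name and price survive in the text version, but the item URL only exists in the `href` attribute, so an extractor that sees only text can't recover it.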