by FrenchDevRemote 654 days ago
How do you deal with the fact that some basic pages can have tens of thousands of tokens?

Right now, not much. The extension is fairly basic in that it just looks at the raw text + HTML and sends it to the LLM.

The benefit of this approach is it's very simple and easy, but the downside is it sends a lot of unnecessary tokens to the LLM. That drives up the cost, slows things down, and hurts accuracy.

I'm working on a few improvements to address this.
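
To give a rough idea of the kind of trimming that helps: one cheap step is to drop script/style contents and all markup, and send only the visible text. This is just a sketch in Python using the standard library, not what the extension does today; `VisibleTextExtractor`, `visible_text`, and the `SKIP_TAGS` set are names made up for this example.

```python
from html.parser import HTMLParser

# Assumption for this sketch: contents of these tags are never useful page text.
SKIP_TAGS = {"script", "style", "noscript", "template"}


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def visible_text(html: str) -> str:
    """Return the page's text with tags, scripts, and styles removed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `visible_text(page_html)` before building the prompt would shrink the input, though how much depends heavily on the page, and it throws away structure (links, attributes) that some extraction tasks may still need.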

I remember there was something called Readability for Chrome, which is basically what browsers have since incorporated as reader view. Mozilla even has a stand-alone version of it [1]. Might be of interest to you.

[1] https://github.com/mozilla/readability