Hacker News
by akavel 4293 days ago
The point is that I want a heuristic that works "automagically" (like Readability, etc.), without requiring me to write a tailor-made XPath for each and every such website in the world.
2 comments

Try this:

http://fivefilters.org/content-only/

It has a default extractor, and site-specific recipes use the same format as Instapaper, so you can leverage the work Marco has done on different sites.
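For a rough idea of what a generic "default extractor" can look like, here is a minimal sketch of one common heuristic: credit each paragraph's text to its innermost container and keep the container with the most paragraph text. This is not the FiveFilters or Readability algorithm; the names `DensityParser` and `extract_main_text` and the `CONTAINERS` set are made up for illustration, and real extractors use many more signals (link density, class names, and so on).

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as candidate content containers.
CONTAINERS = {"div", "article", "section", "main", "td"}


class DensityParser(HTMLParser):
    """Credits the text of each <p> to its innermost open container."""

    def __init__(self):
        super().__init__()
        self.blocks = [[]]        # blocks[0] collects paragraphs outside any container
        self.stack = [("", 0)]    # (tag, index into self.blocks) of open containers
        self.in_p = False
        self.buf = []

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINERS:
            self._flush()
            self.blocks.append([])
            self.stack.append((tag, len(self.blocks) - 1))
        elif tag == "p":
            self._flush()         # an unclosed <p> ends when the next one starts
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
        elif tag in CONTAINERS:
            self._flush()
            # Pop back to the matching open container, tolerating sloppy nesting.
            for i in range(len(self.stack) - 1, 0, -1):
                if self.stack[i][0] == tag:
                    del self.stack[i:]
                    break

    def handle_data(self, data):
        if self.in_p:
            self.buf.append(data)

    def _flush(self):
        if self.in_p:
            text = " ".join("".join(self.buf).split())
            if text:
                self.blocks[self.stack[-1][1]].append(text)
        self.in_p = False
        self.buf = []


def extract_main_text(html):
    """Return the paragraphs of the container holding the most paragraph text."""
    parser = DensityParser()
    parser.feed(html)
    parser.close()
    parser._flush()
    best = max(parser.blocks, key=lambda paras: sum(len(p) for p in paras))
    return "\n\n".join(best)
```

On a page with a short navigation block and a longer article block, `extract_main_text` returns the article paragraphs. It will pick the wrong block on pages where comments or sidebars outweigh the article, which is where site-specific recipes come in.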

Oh, alright.

If there is such a thing, I'd be interested to learn about it myself. TBH, "tailor-make an XPath for every site" is the best solution I'm aware of.
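For reference, the per-site approach boils down to a lookup table from hostname to XPath. A minimal sketch, assuming well-formed XHTML input (the standard library's ElementTree only parses XML and supports a limited XPath subset, so real pages would need a lenient HTML parser); the hosts, expressions, and the name `extract_with_recipe` are hypothetical:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical hand-written recipes: one XPath per site.
SITE_XPATHS = {
    "example.com": ".//div[@id='post']",
    "blog.example.org": ".//article",
}


def extract_with_recipe(url, xhtml):
    """Return the text selected by the site's XPath, or None if no recipe matches."""
    host = urlparse(url).hostname
    xpath = SITE_XPATHS.get(host)
    if xpath is None:
        return None               # no recipe written for this site yet
    root = ET.fromstring(xhtml)   # assumes well-formed markup
    node = root.find(xpath)
    if node is None:
        return None               # the site's layout changed or the recipe is wrong
    # Join all text under the node and collapse runs of whitespace.
    return " ".join("".join(node.itertext()).split())
```

The downside is visible in the table itself: every new site needs a new entry, and every redesign breaks an old one.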