Hacker News
by JustARandomGuy 1306 days ago
A good, reliable API to archive web pages in multiple formats and extract the article text. The best ones are very expensive. Look up Diffbot: it’s the best I’ve found, but it costs $299 a month. I’d happily pay per web page extracted, but the upfront minimum is too much.
1 comment

This is interesting. So the API would take a URL and return the article text, sorting out what on the page is "noise" (like navigation) and what is the article? Is this the same sort of thing a web clipper like Evernote's does to simplify a web page, or is there more to it? What use case would be valuable for you?
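
To make the "noise vs. article" question concrete, here is a rough sketch in Python of the simplest version of that step. It is only an illustration, not how Diffbot or Evernote actually work: the `NOISE_TAGS` list, the `ArticleTextParser` class, and the `extract_article_text` function are all made-up names, and the heuristic (keep `<p>` text that is not inside navigation-like tags) is an assumption. It takes HTML that has already been downloaded, so fetching the URL is left out.

```python
from html.parser import HTMLParser

# Tags treated as "noise"; this list is an assumption for illustration.
NOISE_TAGS = {"nav", "header", "footer", "aside", "script", "style", "form"}


class ArticleTextParser(HTMLParser):
    """Collects text from <p> elements that are not inside a noise tag."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0       # > 0 while inside any noise tag
        self.in_paragraph = False  # True while inside a kept <p>
        self.paragraphs = []       # finished paragraphs, in page order
        self._buffer = []          # text pieces of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1
        elif tag == "p" and self.noise_depth == 0:
            self.in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1
        elif tag == "p" and self.in_paragraph:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.noise_depth == 0:
            self._buffer.append(data)


def extract_article_text(html):
    """Return the kept paragraphs joined by blank lines."""
    parser = ArticleTextParser()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)
```

Calling `extract_article_text(page_html)` on a page with a `<nav>` menu and a `<footer>` would drop both and return only the body paragraphs. Real pages are much messier than this (articles built from `<div>`s, ads in the middle of the text), which is presumably where the paid services earn their price.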