by sjdz 1381 days ago
It's a nice idea! One consideration is how to grab the HTML of the pages. In my experience, using fetch() or an Ajax request often runs into CORS problems: unless the destination site explicitly allows cross-origin requests, the browser blocks the response.

Maybe someone knows a better way for extensions to grab remote HTML without running into these problems?

The alternative would be for the extension to get the HTML via an API from a crawler running on a server (or SaaS), which should work pretty well.

2 comments

Yes, I'm pretty sure (but not 100%) that extensions can normally avoid CORS/same-origin/etc. issues.
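
For what it's worth, here is a rough sketch of what that could look like. As far as I know, in Chrome this depends on the extension declaring host permissions for the target site in its manifest, and it applies to the extension's background script rather than to content scripts. `fetchPageHtml` and `FetchLike` are names made up for this example, and the fetch function is passed in so the helper can be tried without a browser.

```typescript
// Hypothetical helper for an extension's background script.
// Assumption: the manifest grants host permissions for the target origin,
// e.g. "host_permissions": ["https://example.com/*"] in a Chrome MV3 manifest.

// Minimal shape of the fetch function this helper relies on.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  text(): Promise<string>;
}>;

// Fetch a page and return its HTML as a string, throwing on non-2xx responses.
async function fetchPageHtml(url: string, fetchImpl: FetchLike): Promise<string> {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  return response.text();
}
```

In the extension itself you would call it as `fetchPageHtml("https://example.com/", fetch)`.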

> The alternative would be for the extension to grab the html via API from a crawler running on a server

Oh like 'https://myplainwebsite.com/parse?url=https://example.com/doc...'

So I just launched this, inspired by our brief conversation! :)

https://content-parser.com/

You can parse any URL into Markdown with `https://content-parser.com/markdown?url={encodeURIComponent(...)}`.
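
A minimal sketch of building that request URL, in case it helps anyone. Only the endpoint and the `url` parameter come from the line above; `MARKDOWN_ENDPOINT` and `buildMarkdownUrl` are names made up for this example.

```typescript
// Endpoint as given in the comment above; the helper itself is illustrative.
const MARKDOWN_ENDPOINT = "https://content-parser.com/markdown";

// Percent-encode the target URL so its own "?", "&" and "#" characters
// are not mistaken for part of the outer query string.
function buildMarkdownUrl(targetUrl: string): string {
  return `${MARKDOWN_ENDPOINT}?url=${encodeURIComponent(targetUrl)}`;
}
```

For example, `buildMarkdownUrl("https://example.com/doc?a=1&b=2")` returns `https://content-parser.com/markdown?url=https%3A%2F%2Fexample.com%2Fdoc%3Fa%3D1%26b%3D2`. Without the encoding, the `&b=2` part would be read as a second parameter of the outer URL rather than as part of the page address.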

That's awesome, works really well! Good luck with it :)