by bomewish 815 days ago
What crawling framework are you using?

In-house, written in Elixir.

20% of a crawler is fetching and parsing pages; the remaining 80% is dealing with misconfigured, broken, and non-standard web servers and HTML, plus Cloudflare, Akamai, and random bot-busting tools that cause more false positives than a chaos monkey. It's better to write one yourself that you can control, monitor, and operate as you need, instead of relying on third-party logic. Makes sense for my business, at least.
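
To make that "80%" a bit more concrete, here is a rough Python sketch of the kind of response triage a crawler ends up doing. It is not the in-house Elixir code, which isn't shown in this thread. The `Outcome` names, the `classify_response` function, and the specific rules (treating 403/429/503 from a `cloudflare` or `AkamaiGHost` server header as a bot block, treating an empty or non-HTML 200 as broken) are all assumptions picked for illustration; real rules would need tuning per site.

```python
from enum import Enum
from typing import Mapping


class Outcome(Enum):
    OK = "ok"            # usable HTML page
    RETRY = "retry"      # transient failure, try again later
    BLOCKED = "blocked"  # probably a bot-mitigation response
    BROKEN = "broken"    # 200 status but nothing parseable
    SKIP = "skip"        # anything else (404, redirects handled elsewhere, ...)


# Assumed heuristic: Server header values commonly sent by these CDN edges.
CDN_SERVERS = {"cloudflare", "akamaighost"}


def classify_response(status: int, headers: Mapping[str, str], body: bytes) -> Outcome:
    # Header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v.lower() for k, v in headers.items()}

    # Assumption: a 403/429/503 served by a CDN edge is a bot block, not an outage.
    if status in (403, 429, 503) and h.get("server", "") in CDN_SERVERS:
        return Outcome.BLOCKED

    # Rate limiting and server errors from the origin are worth retrying.
    if status == 429 or 500 <= status < 600:
        return Outcome.RETRY

    if status == 200:
        content_type = h.get("content-type", "")
        # Misconfigured servers: empty body, or a declared type that is not HTML.
        if not body.strip():
            return Outcome.BROKEN
        if content_type and "html" not in content_type:
            return Outcome.BROKEN
        return Outcome.OK

    return Outcome.SKIP
```

Keeping this as a pure function with no network access makes it easy to replay logged responses against new rules, which is where the monitoring part pays off.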

Ah. Have so been there. But I don’t really have the resources to spin something up from scratch. Good luck!!