by dchuk 7 days ago
Interesting. In terms of "crawling", by default the engine I built just polls a site's RSS feed on an adjusting cadence, like any other RSS feed reader. On some sites, the engine can do a follow-up scrape of the article link from the RSS feed if the feed doesn't provide the full content of the article. So it's not real crawling, more fetching/scraping if necessary.

But I hear you.
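
For anyone curious what that looks like, here is a rough sketch of the idea, not the engine's actual code. The function names, the halve/double cadence rule, the interval bounds, and the "no `content:encoded` means the feed is summary-only" heuristic are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaced tag for <content:encoded>, a common place for full article bodies.
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"

MIN_INTERVAL = 300      # assumed floor: 5 minutes, in seconds
MAX_INTERVAL = 86400    # assumed ceiling: 1 day, in seconds


def next_interval(current, found_new):
    """Adjust the polling cadence: poll sooner after new items, back off otherwise."""
    if found_new:
        return max(MIN_INTERVAL, current // 2)
    return min(MAX_INTERVAL, current * 2)


def parse_items(rss_xml):
    """Return (link, full_content_or_None) for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        link = item.findtext("link")
        full = item.findtext(CONTENT_ENCODED)  # None when the element is absent
        items.append((link, full))
    return items


def links_to_scrape(rss_xml):
    """Links whose feed entry carries no full content and would need a follow-up fetch."""
    return [
        link
        for link, full in parse_items(rss_xml)
        if not (full and full.strip())
    ]
```

A scheduler would call `links_to_scrape` on each polled feed, fetch only those article URLs, and feed whether anything new turned up into `next_interval`. Real feeds vary a lot (Atom, full text in `description`, truncated `content:encoded`), so the check for "full content" would likely need per-site tuning.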