by trustfundbaby 2006 days ago
What do folks think about implementing a web crawler that you can send to a website and have it index every internal URL on the site? I remember sitting down to write one 100 years ago now, and finding it to be much trickier than I thought it would be.

that's interesting. grab only the URLs, not the content?
right, but of course to do that, you'd have to grab the content to parse it :)
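
For what it's worth, here is a minimal sketch of that point: to collect only the URLs, each page still has to be fetched and parsed. It assumes Python's standard library only, and `LinkParser`, `crawl_internal_urls`, `fetch_page`, and `max_pages` are made-up names for illustration. "Internal" is taken to mean "same host as the start URL", which is an assumption. The sketch ignores robots.txt, rate limiting, and links generated by JavaScript, so treat it as a starting point rather than a finished crawler.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch(url):
    """Default fetcher (illustrative): download a page body as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl_internal_urls(start_url, fetch_page=fetch, max_pages=500):
    """Breadth-first crawl; returns the set of same-host URLs discovered."""
    host = urlparse(start_url).netloc
    start = urldefrag(start_url).url  # drop any #fragment
    seen = {start}
    queue = deque([start])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1
        try:
            html = fetch_page(url)  # the content is needed to find the links
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links against the current page, drop fragments.
            absolute = urldefrag(urljoin(url, href)).url
            parts = urlparse(absolute)
            is_web = parts.scheme in ("http", "https")
            if is_web and parts.netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen
```

Passing the fetcher in as `fetch_page` keeps the traversal separate from the network code, so the link handling can be tried against a dict of fake pages before pointing it at a real site.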