by gregw134 873 days ago
Wanted to say congrats on launching! I'm building a search engine myself, so I can tell a lot of work went into this.

I think the biggest thing you overlooked is page titles. When you issue a query, it's hard to quickly scan the results and judge what each site is about because the page titles are missing.

1 comments

How do you crawl the web? Do you follow links around? How do you reach a page that isn't linked from anywhere you've crawled?
I'm just using Common Crawl for now.
I mean, following links is what web crawling is, right? By extension, you just can't reach a page unless you stumble upon a link to it _somewhere_. Google lets you submit a link and schedule a crawl that way, so that's another route if the page isn't linked to from anywhere.
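
To make the point above concrete, here is a minimal sketch of a link-following crawler in Python. It is not how any particular search engine works. The `crawl` function, the `LinkExtractor` class, and the `FAKE_WEB` dictionary are all made up for illustration, and the "web" is an in-memory dict standing in for real HTTP fetches. It shows that a page nobody links to is only found if someone submits its URL as a seed.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch):
    """Breadth-first crawl starting from `seeds`.

    `fetch(url)` is a hypothetical helper that returns the page's HTML,
    or None if the page cannot be retrieved.
    Returns the set of every URL discovered (seeds included).
    """
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Toy in-memory "web" (assumption: stands in for real HTTP requests).
FAKE_WEB = {
    "http://a.example/": '<a href="/about">About</a> <a href="http://b.example/">B</a>',
    "http://a.example/about": "<p>No links here.</p>",
    "http://b.example/": '<a href="http://a.example/">Back</a>',
    "http://orphan.example/": "<p>Nobody links to me.</p>",
}

# Following links from one seed never reaches the orphan page.
found = crawl(["http://a.example/"], FAKE_WEB.get)

# Submitting the orphan URL directly (like a "submit a link" form) does.
found_with_submission = crawl(
    ["http://a.example/", "http://orphan.example/"], FAKE_WEB.get
)
```

A real crawler would also need politeness delays, robots.txt handling, and URL normalization, none of which are shown here.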