Ask HN: How would you build a search engine in 2019?
35 points
by throwaway13000
2448 days ago
So, I was wondering how to build a competitor to Google. We have Common Crawl and the Internet Archive. Distributed systems are pretty well understood. How would one go about building a search engine in 2019? Would you do what DuckDuckGo did, which is to use somebody else's index and ranking, or would you just build your own index from Common Crawl? How are Ecosia and Startpage.com able to stay profitable without doing either? Does that mean we can have many niche search engines? Can we process Common Crawl and build an index for less than $10K (what one individual can pay out of pocket)?

I think my takeaway from the last 10 years is that a lot of the info on websites that was written by real people has disappeared, and what is left is a lot of spammy blogs and heavily commercial content. Most of the real info has gone into Facebook groups, Quora, Reddit comments, Slack, Twitter, and so on.
The problem is that those are all closed-door ecosystems in a lot of ways, and the knowledge is hard to separate from the ephemeral messaging around it.
I think if I were going to approach this, I would build software that users run, or browser add-ons, that let users tag and save information in some type of structured format, which then contributes to a searchable knowledge base.
For example, I am a member of several FB groups for expats where I live. There are great pieces of wisdom and hard-to-find info in there. I'd love to be able to tell a Chrome extension "save this, and here is a little context" (or have it recognize what is worth saving from the format).
Then try to figure out how to make that public and searchable.
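To make the idea concrete, here is a rough sketch of what the "save, tag, then search" part could look like. Everything in it is an assumption for illustration: the `Snippet` fields, the `SnippetIndex` class, and the plain in-memory inverted index are made-up names and choices, not how any existing extension or search engine works. It leaves out ranking, storage, and the hard question of making the data public.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Snippet:
    """One saved piece of information plus the context the user attached.

    All field names are hypothetical, chosen only for this sketch.
    """
    text: str
    source: str                      # e.g. "facebook-group", "reddit"
    context: str = ""                # the user's "here is a little context"
    tags: set = field(default_factory=set)


class SnippetIndex:
    """Tiny in-memory inverted index: term -> ids of snippets containing it."""

    def __init__(self):
        self.snippets = []
        self.postings = defaultdict(set)

    @staticmethod
    def tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, snippet):
        doc_id = len(self.snippets)
        self.snippets.append(snippet)
        # Index the text, the user-supplied context, and the tags together.
        terms = self.tokenize(snippet.text) + self.tokenize(snippet.context)
        for tag in snippet.tags:
            terms += self.tokenize(tag)
        for term in terms:
            self.postings[term].add(doc_id)
        return doc_id

    def search(self, query):
        # Boolean AND: a snippet matches only if it contains every query term.
        terms = self.tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.snippets[i] for i in sorted(ids)]
```

Usage would be something like `index.add(Snippet(text=..., source="facebook-group", context="residency paperwork", tags={"visa"}))` followed by `index.search("visa")`. Indexing the context and tags alongside the text is the point: it lets a snippet be found by what the user said about it, not only by the words in the original post.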