by bluecat22 3004 days ago
Can you elaborate on why setting up a simple search engine based on Common Crawl was not useful?

Basic keyword search is great at recall, but precision in the top 10 results gets worse as the number of documents increases. Given the size of the web, basic keyword search tends to perform poorly in terms of relevance. Common Crawl is large enough to exhibit this problem.
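
To make the precision point concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: a fixed pool of 50 relevant documents, a 1% chance that any other document contains the keyword incidentally, and matches returned in arbitrary order because nothing ranks them. The names `precision_at_k` and `simulate` are hypothetical.

```python
import random


def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the first k results that are relevant."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant_ids) / len(top)


def simulate(num_docs, num_relevant=50, match_rate=0.01, trials=500, seed=0):
    """Average precision@10 of unranked keyword matching on a synthetic corpus.

    Assumed model: every relevant document contains the keyword, and a fixed
    fraction (match_rate) of all other documents contains it incidentally.
    """
    rng = random.Random(seed)
    incidental = int((num_docs - num_relevant) * match_rate)
    total_matches = num_relevant + incidental
    # Match ids 0..num_relevant-1 stand for the relevant documents.
    relevant = set(range(num_relevant))
    total = 0.0
    for _ in range(trials):
        # With no ranking, the "top 10" is just a random draw from the matches.
        top = rng.sample(range(total_matches), min(10, total_matches))
        total += precision_at_k(top, relevant)
    return total / trials


if __name__ == "__main__":
    for n in (1_000, 100_000, 10_000_000):
        print(n, round(simulate(n), 4))
```

Recall stays perfect in this model since every relevant document still matches, but the relevant ones become a shrinking share of the matches as the corpus grows, which is what drags down the top 10.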

I think what OP and several people in this thread actually want is Google search without synonym expansion, plus the ability to use advanced syntax such as AND and NEAR queries. I believe that would go a long way toward satisfying someone who says they just want "keyword search".
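
For anyone unsure what AND and NEAR would mean in practice, here is a minimal sketch of a positional inverted index that matches terms exactly as typed (no synonyms, no stemming). It is only an illustration, not how Google or any real engine implements these operators. The class name `PositionalIndex`, its methods, and the default `distance` are hypothetical.

```python
import re
from collections import defaultdict


class PositionalIndex:
    """Toy in-memory index: term -> doc_id -> list of token positions."""

    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(list))

    def add(self, doc_id, text):
        # Lowercase and split on word characters; no stemming or synonyms.
        for pos, token in enumerate(re.findall(r"\w+", text.lower())):
            self.postings[token][doc_id].append(pos)

    def search_and(self, *terms):
        """Documents that contain every term literally."""
        doc_sets = [set(self.postings.get(t.lower(), {})) for t in terms]
        return set.intersection(*doc_sets) if doc_sets else set()

    def search_near(self, a, b, distance=3):
        """Documents where a and b occur within `distance` tokens of each other."""
        hits = set()
        for doc_id in self.search_and(a, b):
            pos_a = self.postings[a.lower()][doc_id]
            pos_b = self.postings[b.lower()][doc_id]
            if any(abs(i - j) <= distance for i in pos_a for j in pos_b):
                hits.add(doc_id)
        return hits


if __name__ == "__main__":
    index = PositionalIndex()
    index.add(1, "common crawl keyword search")
    index.add(2, "keyword stuffing hurts web search quality")
    index.add(3, "searching by keywords")
    print(index.search_and("keyword", "search"))   # docs 1 and 2 only
    print(index.search_near("keyword", "search"))  # doc 1 only
```

Document 3 never matches because "searching" and "keywords" are different tokens from "search" and "keyword", which is exactly the literal behavior being asked for.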

You are exactly right. Precise keyword search also lets people weaponize a search engine, so Google doesn't let you do it. (They have been burned in the past for letting people specify queries that could pull out Social Security numbers, for example.)