by krazydad 3774 days ago
> "Imagine what your perfect source would tweet, or what you yourself would tweet in that situation, and search for the words that would probably be in it."

This is a technique I used a lot more in the days before Google became the dominant search engine, when I was using Lycos, AltaVista, and Yahoo. Those more primitive search engines did something closer to a full-text search, so it was important to use the words and verb tenses likely to appear in the answer (or target page/tweet), not in the question.

So for example, instead of querying "Which museum is the Mona Lisa in?" it was best to query "collection includes the mona lisa" and similar phrases.

Needless to say, third-generation search engines like Google made all of this unnecessary, and hopefully Twitter will get there too, eventually.


This is still how web searching works. I have no idea why you would think otherwise--at its core Google is still fundamentally an indexing service, and you use indices by figuring out which words may be in the documents you want to read and then finding the documents containing those. Without doing this, it becomes impossible to distinguish documents which are what you want from documents which describe what you want (and which often refer to something behind a paywall or in a private or non-digitized collection).
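To make the indexing point concrete, here is a minimal sketch of an inverted index in Python. The three documents, their ids, and the `search` helper are all made up for illustration, and a real engine layers stemming, ranking, and much more on top. It only shows why a query phrased like the answer and a query phrased like the question can land on different documents.

```python
from collections import defaultdict
import re

# Hypothetical toy corpus (not real pages), keyed by a made-up document id.
DOCS = {
    "louvre": "The Louvre's collection includes the Mona Lisa.",
    "forum": "Which museum is the Mona Lisa in? Asking for a friend.",
    "recipe": "This collection includes forty soup recipes.",
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

INDEX = build_index(DOCS)
# Query phrased like the answer: only the page that states the fact matches.
print(search(INDEX, "collection includes the mona lisa"))
# Query phrased like the question: only the page that asks it matches.
print(search(INDEX, "which museum is the mona lisa in"))
```

In this toy setup the answer-style query returns only the hypothetical "louvre" document, while the question-style query returns only the "forum" document that repeats the question.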

    > This is still how web searching works. I have no idea
    > why you would think otherwise
I imagine it comes from days' worth of experiencing success with such 'question searches'.
I dunno...

https://www.google.com/search?q=how+can+i+pirate+movies

Seems to have no results about pirate movies.

The only result, of the 10 on that page (right now, for me), that answers the question is https://torrentfreak.com/mpaa-bittorrent-is-the-best-way-to-...

But if you search "using * to pirate movies" (Google will match the star, even in quotes, with a word or phrase in the result context), every result will include the name of a method for pirating movies, which you can investigate further. I see "popcorn time", "bit torrent", "pirate bay" and "camcorder" on the first page. These are all direct answers that a human would consider reasonable (though of varying efficacy) to the question.
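The star-in-a-phrase idea can be roughly imitated with a regular expression. This is only a sketch and says nothing about how Google implements it: the snippets are invented, `wildcard_phrase_to_regex` and `fill_wildcard` are hypothetical helpers, and the star is assumed to stand for one to three words in the middle of the phrase.

```python
import re

def wildcard_phrase_to_regex(phrase):
    """Turn 'a * b' into a regex where * captures one to three words.

    Assumption: the star sits in the middle of the phrase, not at either end.
    """
    parts = [re.escape(p.strip()) for p in phrase.split("*")]
    gap = r"\s+(\w+(?:\s+\w+){0,2})\s+"
    return re.compile(r"\b" + gap.join(parts), re.IGNORECASE)

def fill_wildcard(phrase, snippets):
    """Return what the star matched in each snippet that contains the phrase."""
    rx = wildcard_phrase_to_regex(phrase)
    found = []
    for snippet in snippets:
        match = rx.search(snippet)
        if match:
            found.append(match.group(1).lower())
    return found

# Invented result snippets, for illustration only.
SNIPPETS = [
    "Studios say people are using Popcorn Time to pirate movies.",
    "He was using a camcorder to pirate movies in theaters.",
    "How can I pirate movies?",
]

print(fill_wildcard("using * to pirate movies", SNIPPETS))
```

The question-style snippet contributes nothing, because it never contains the phrasing an answer would use.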

I mean, if you only need one link that answers the question, 1/10 is no less useful than 10/10.
Now I can't help but wonder how different the world would be if TPB was only about pirate-movie torrents.
> I have no idea why you would think otherwise

It might be due to better NLP, but it may also be due to the rise of blogging-style personal content where authors say things using that phrase. For example, "I was wondering how to start up my own Ubuntu server on AWS and thought I'd write a tutorial while I was at it." contains "how to start up . . . Ubuntu server on AWS".