by aemreunal 1556 days ago
> The same way that Google improperly handling quoted searches and returning pages _without that exact string_ is somehow a feature, not a bug.

Couldn't find it now but that was refuted by Danny Sullivan (I think) in an HN thread about the Google search results quality. They gave a pretty convincing explanation of why that may happen (spoiler: it has something to do with the tokenization of words on the website) and I, for one, believed them.

*Edit:* Here it is: https://news.ycombinator.com/item?id=30356382

3 comments

Yeah, I remember that – he basically said that the word might appear on the page but not in the results description. And that's BS, because it used to be that the description of every result contained at least one bolded token included in the search query.

If this is no longer possible, it's only because they're cheaping out on hardware and not actually indexing as much of the page as they could (no way to retrieve the matching token because it's not saved, but they can still match on it). This probably also leads to diluted results.

That's just more proof that Google doesn't care about its users.

If they quote a search term, they expect to find the exact term visible on the website. Not in alt text, not in invisible text, not in any metadata. Quotes mean this exact phrase, visible on the website.

I see that term when I look at the site.

Of course, that's because my user agent renders alt text. I'm glad Google search results match things I can see on the page.

Doesn't the comment right below the one you linked successfully refute Danny Sullivan's claim?
The one by drawfloat?

Looking at the source of the cached page,

https://webcache.googleusercontent.com/search?q=cache:sRJX_e...

It contains

       <a href="/quotes/tag/don-t-give-up-quotes">don-t-give-up-quotes</a>,
       <a href="/quotes/tag/don-t-give-up-the-fight">don-t-give-up-the-fight</a>,

After you strip the HTML, that becomes "don-t-give-up-quotes, don-t-give-up-the-fight", and since punctuation is stripped too, that matches.
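
To make that concrete, here is a rough sketch of the two steps (drop the tags, then treat punctuation as a separator) run against the snippet above. This is only my guess at the general idea, not how Google's indexer actually works; `TextExtractor`, `tokenize`, `visible_tokens`, `phrase_matches`, and the example queries are all made up for illustration.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes only; tags and attributes (like href) are dropped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def tokenize(text):
    # Assumption: hyphens, commas, and apostrophes all act as separators.
    return re.findall(r"[a-z0-9]+", text.lower())


def visible_tokens(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return tokenize(" ".join(parser.chunks))


def phrase_matches(query, html):
    # An "exact phrase" match here means the query tokens appear consecutively.
    needle = tokenize(query)
    haystack = visible_tokens(html)
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )


snippet = """
<a href="/quotes/tag/don-t-give-up-quotes">don-t-give-up-quotes</a>,
<a href="/quotes/tag/don-t-give-up-the-fight">don-t-give-up-the-fight</a>,
"""

print(visible_tokens(snippet))
# Hypothetical query that spans the boundary between the two tags.
print(phrase_matches("quotes, don't give up", snippet))
print(phrase_matches("never give up", snippet))
```

Under these assumptions the phrase "quotes, don't give up" matches even though no human would read those words as one sentence on the page, because the tail of the first tag and the head of the second become adjacent tokens.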

Full disclosure: I work at Google, but not on Search.