Hacker News
by makecheck 1491 days ago
Well whether you add “site:.edu” or not, Google still does the thing where they just spit out a pile of extra “matches” that don’t contain your search phrase AT ALL. Just thrown into the list, as if they have any business being there (oh wait, they do have business: Google’s business).

Frankly, if it’s not reliable as a tool, it has no real value and I can’t believe it has come to this. Imagine if every time you ran `ls` in a shell with a glob pattern, it just decided to sort of “add in” a few other files loosely based on your query (or heck, files that aren’t based on it at all)? Oh, now imagine if `rm` did that.

Sadly this happens with lots of search tools now. Why the heck is the default state on a new Mac to funnel what you type to everything, e.g. I searched for “Chrome” and hit Return and the FIRST thing it did was throw me into the App Store and call up some not-even-a-web-browser scam app with Chrome in its name, instead of selecting the Chrome already installed on my computer and opening it? More and more it seems that you have to turn off all kinds of poor defaults to put tools into a useful state, or there is simply no way to get them there at all.

Unfortunately the fuzzy search thing is in a kind of feedback loop in terms of user expectation. Because of Google, users are used to typing the wrong thing and getting the right answer. If I search for "how old is Roger Redford" Google tells me that Robert Redford was born in 1936 (it doesn't even display the "Did you mean...?" correction). And to be fair, in 99% of cases the person probably did mean Robert Redford.

But anyway, this is the behaviour I think people are being trained to expect from searches. I sometimes have to show new users our business systems (which manage residential property data) and it's seen as a drag that you have to have some level of precision when searching for anything.

It's like a spellchecker. As they've got better over time you can be less and less accurate with words you're not quite sure on and they still find the word you meant.

> the word you want

I was watching a new movie yesterday (The Lost City) with closed captioning on. The character said "synonym". The cc text said "cinnamon".

I see a lot of homonyms in the cc, but that one was the funniest.

In a presentation yesterday, I noticed “Jews” appear in the live closed caption stream of my talk.

It threw me off for a few seconds, and I went back to the recording to figure out what happened: it took the word “choose” completely out of grammatical context to form a new clause of just “Jews”, as if I’d been speaking in complete sentences and then suddenly decided to interject a random word between two half-sentences.

"There's a bathroom on the right"

"Wrapped up like a douche"

"When the rain washes you'll clean your nose"

The entertainment never ends!

Google is 24 years old. They track everything from email to searches to website behaviour across innumerable sites.

If they still need a feedback loop on their search results page, maybe they should hire some other guys to tweak their search engine!

They do, partly because SEO spam never sleeps and ever evolves.
Seems like the search market is ready for Google Search Pro, where you can use advanced operators and syntax to get exactly what you asked. Just $9.99 monthly, or buy Google One and get it free!
And just like a spellchecker, a suggestion of what it thinks you meant is welcome, but an automatic change is extremely not.
The loss of "Did you mean..." is painful. I think that little button effectively meaning "I MEANT WHAT I SAID" is great for the general use case of typos and foggy memory, and not having it is such a loss of usability.
It's not just search. It's so much user-facing interactivity online. Stop black-box-inserting irrelevant search results. Stop black-box-modifying what I type in the document. Stop black-box-suggesting-as-I-type. Stop black-box-second-guessing what I meant.

I'm a perfectly capable human who can communicate my intent, and make corrections on my own if needed. What kind of deranged hubris must these engineers have to build systems that try to proactively act on what they think I actually meant?

Given that chat texts are used in court as evidence, I wonder what happens to this when the text is not what you typed, but what the AI system "corrected" for you.
Also, I wish Google could simply deliver the information I want without tracking me or gathering information about me, my interests, communication, travel, searches, or contacts.

Google should respect their users' privacy.

Please remember: you, who do a Google search, are a user, not a customer.

Customers' interests come first, and the customers pay well for tracking and targeting.

> Well whether you add “site:.edu” or not, Google still does the thing where they just spit out a pile of extra “matches” that don’t contain your search phrase AT ALL

There used to be a 'verbatim' mode that would do exactly that. Quotation marks were also a way to enforce verbatim mode.

Sadly this behaviour of "let me assume what you want" is not exclusive to google, as I now also experience it on ddg.

As an anecdote, we have implemented a really strict exact match at Kagi (meaning quotes do exactly what they are supposed to do). We did receive some feedback that we should relax it a bit, mostly because of non-alphanumeric character matching (some users wanted those characters ignored) and the occasional empty results page (some users were not used to getting an empty results page, as Google almost always returns something, even if it is not what they searched for).
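To make the trade-off concrete, here is a minimal sketch of the two behaviours described above. It is not Kagi's implementation; the function names, the sample pages, and the rule of collapsing every run of non-alphanumeric characters into one space are all assumptions made for illustration.

```python
import re

def strict_match(phrase, text):
    """True only if the phrase appears character for character (ignoring case)."""
    return phrase.lower() in text.lower()

def relaxed_match(phrase, text):
    """Like strict_match, but every run of non-alphanumeric characters
    counts as a single space (an assumed relaxation rule)."""
    def normalize(s):
        return re.sub(r"[^0-9a-z]+", " ", s.lower()).strip()
    return normalize(phrase) in normalize(text)

def search(phrase, pages, matcher):
    """Return the titles of the pages whose body satisfies the matcher."""
    return [title for title, body in pages.items() if matcher(phrase, body)]

# Hypothetical index: page title -> page body.
pages = {
    "install": "Install Node.js on Linux",
    "docker": "Using node js with Docker",
}

strict_hits = search("node.js", pages, strict_match)    # only the literal "Node.js"
relaxed_hits = search("node.js", pages, relaxed_match)  # punctuation is ignored
no_hits = search("bun.js", pages, strict_match)         # nothing matches: empty list

print(strict_hits, relaxed_hits, no_hits)
```

The last query shows the "empty results page" case: a strict matcher returns nothing rather than padding the list with loosely related pages.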
How capable is your Boolean searching? Could I do `(catalog* OR classif*) AND (archiv* OR librar* OR museum*)`? Does it support AND/OR booleans?
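For anyone unsure what that query is asking for, here is a rough sketch of how such an expression could be evaluated against a single document. It says nothing about how Kagi or Google actually parse queries; the grammar (AND binds tighter than OR, a trailing `*` means prefix match on whole words) and the names `tokenize`, `term_matches` and `evaluate` are assumptions for illustration.

```python
import re

def tokenize(query):
    # Tokens are parentheses or words, optionally ending in '*'.
    return re.findall(r"\(|\)|\w+\*?", query)

def term_matches(term, words):
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        return any(w.startswith(prefix) for w in words)
    return term in words

def evaluate(query, text):
    """Evaluate a Boolean query (AND, OR, parentheses, prefix*) against text."""
    words = set(re.findall(r"\w+", text.lower()))
    tokens = tokenize(query)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            rhs = parse_and()  # always parse the right side so pos advances
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in (")", "AND", "OR"):
            raise ValueError("unexpected token " + repr(token))
        return term_matches(token, words)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token " + repr(tokens[pos]))
    return result

query = "(catalog* OR classif*) AND (archiv* OR librar* OR museum*)"
print(evaluate(query, "Cataloguing standards for museum collections"))
print(evaluate(query, "A product catalog for garden tools"))
```

A document has to satisfy both parenthesised groups, so a page that mentions a catalog but no archive, library or museum is rejected.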
I would also be curious whether you're able to filter results by site or country domain (similar to the now-deprecated google keywords "site:foo.com" or "domain:.bar").
We could build this feature, you can suggest at https://kagifeedback.org
Yes you can. Star will work too.
What I don't understand is that the verbatim option still exists, it just doesn't seem to do anything. And quotes don't seem to work anymore either. It's all very insulting.
Regarding the quotes, I actually notice they do work. However the difference is that it doesn't highlight where the string was matched in the search result index page.

So for example if you google "Food", all the results _will_ contain the string "Food" somewhere in the page (if you don't believe me, try it). It can also match somewhere in the comments or metadata, which is useless.

So the feature is still there but it's nowhere as useful as it once was.

I recall a Google engineer who confirmed this once but maybe one of them lurking in the thread can clarify it?

In my case quotes still work but they aren't as good as they once were
Quotes haven’t worked for about 10 years now. For my job in 2012 I very often had to google obscure part numbers to find documentation.

Even if you added quotes to the part number (such as “foo123-x”), google would return results for “foo234-x” or “foo123-y” and bold them as if they had matched. The real part numbers could be 10-20 characters long, so it was more difficult to spot discrepancies.

I learned very quickly not to trust the results even when adding quotes. If I had assumed the quotes had worked, I would have grabbed bad documentation without even realizing it.
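A small sketch of the kind of check that would have caught those mismatches: filter whatever the search engine returns down to the entries that really contain the exact part number. The result strings are made up, and `contains_exact` is a hypothetical helper, not part of any search API.

```python
import re

def contains_exact(part_number, text):
    """True if text contains part_number as a whole token (ignoring case),
    i.e. not embedded in a longer alphanumeric or hyphenated string."""
    pattern = r"(?<![\w-])" + re.escape(part_number) + r"(?![\w-])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Hypothetical result titles, including the near misses described above.
results = [
    "foo123-x datasheet (PDF)",
    "foo234-x datasheet (PDF)",
    "foo123-y application note",
    "FOO123-X errata",
    "foo123-xl evaluation kit",
]

verified = [r for r in results if contains_exact("foo123-x", r)]
print(verified)
```

With 10-20 character part numbers, a mechanical check like this is more reliable than eyeballing the bolded text in each result.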

You have to both quote the term(s) and then check the box for "verbatim". Then it should generally work. Have you all tried doing both at once?

edit: It is a pain in the ass though

What I mean is that quotes still improve looking up specific strings, but yeah, not as good as they once were and getting progressively worse.
I would be willing to bet substantial sums of money that Google tracks metrics for these alternative results and has concluded that they add value to the user. Maybe not to you, but to most people.

Personally, I'm ok with a bit of "oh, this is what I think you meant" rather than a literal interpretation of my query. It's not perfect but neither are my queries.

> I would be willing to bet substantial sums of money that Google tracks metrics for these alternative results and has concluded that they add value to the user.

And I would be willing to bet substantial sums of money that any metrics Google has with respect to "adding value to the user" actually track "adding value to the business" more directly.

I would imagine that "adding value to the user" translates to "adding value to the business" in a lot of ways for Google.

If, on average, users perceive more value from Google, they use Google more, and Google earns more ad rev. Even if this lowers value substantially for the "relative few".

It's basic utilitarianism, and it sucks to be in the "relative few"...

I too, wish that quotes were respected, and Google would stop giving results for what it thinks I meant to search for like it knows what I meant better than I do - even if that is the case...sometimes.

That assumes people consider alternatives to Google viable.

Right now, maximizing the number of searches people perform, even if it degrades overall search quality, is probably the goal.

> has concluded that they add value to the user.

Not the user. To Google.

Marginal search results add value for Google because they keep people searching, which is how Google makes its money. If you find what you want, you stop feeding money into Google.

Google can do this because it has a virtual monopoly on search. If a strong competitor were to emerge, Google couldn't play these revenue optimization games, and would have to go back to being a search engine.

I'm ok with fuzzy searches, but there should be a switch for verbatim.
I certainly have moments where I don't quite know how to ask what I'm looking for and badly describe it in a query: e.g. "flat thing you use for cooking" - google seems to understand what I'm looking for, whereas bing/ddg guide me more towards cooking tools with the word "flat" in them.

I usually use ddg but I do find google useful for the more weird queries I have.

Open source alternatives are the cure (in the case of Mac hacking their search results to drive some arbitrary metric so some random PM gets promoted somewhere)

A lot harder to track me using linux instead and constantly pushing my company to allow people to use linux machines for dev

> Frankly, if it’s not reliable as a tool, it has no real value

This is exactly the ridiculous kind of hyperbole I come to HN for.

If you honestly can’t find any value in using Google despite its unreliability, you were probably expecting too much of it to begin with.

Your expectations are too low. We all remember google in its heyday when it returned honest results, so we know what's possible.
That was when silicon valley was still run by computer nerds. Those days aren't coming back buddy, time to move on
I remember the days when the internet as a whole was simpler in terms of content and users, and search engines reflected that simplicity. I am not sure if it's possible to go back to those days.
If only there was a large, old search engine, that still had a web index from the days before SEO spam, that it could prioritize over newer results.
Or a search engine with zettabytes of data and the world's largest collection of AI PhDs. It could be possible that they have the advantage over some scam artist smoking cigs in front of a screen at 3am, considering it's their system.
A more complex web landscape doesn't explain why the entire first page of results is very often just ads and e-commerce sites.
> we know what's possible.

We know it's possible to do in the Internet of 2004.

The Internet of 2004 no longer exists, the Internet of 2022 is a very different problem for search to solve.

I wonder if google could even return to its heyday if it wanted. Even if they stopped playing dumb games with search, there are a lot more malicious actors now whose entire career is to fuck up search results.
I feel like low-quality advertising results, low-quality-match results, and scammy results are at least 3 different bad search result types to get, and folks here are talking about the first

of course, google gets paid to show them, and I don't think anyone here realistically expects google to voluntarily stop intentionally fucking up search results unless it somehow means a larger paycheck to them

Reducing yearly revenue from 250 billion to 200 billion seems like a reasonable tradeoff for not hobbling and annoying more than half the population of the earth and also maintaining brand longevity. Right now they are further and further ripening for disruption by an upstart.
Ah yes, the good ol' days of content farms copying from stack overflow, or did you mean the good ol' days of buying back links? Or the good ol' days of web rings?

Let's face it, the heyday of honest results is just as mythical as the political "good ol' days".

> the good ol' days of web rings

I honestly don't get your snark. Those were the good old days of the Internet, when SEO spam and PageRank weren't a thing. Then Google came and made the Internet even better, then it turned to shit, and here we are. And web rings were great, you take that back.

So yes, there was a period of time when the Internet was in some ways better than nowadays. Now I'm not saying that everything that happened since is bad, but Internet search and signal-to-noise ratio have definitely fallen off a cliff.

Let's enjoy the good days of siloes and SEO optimization. I doubt it'll get better.

Eh, there were a few periods of "glory days". Usually in the months after Google's latest crackdown on one of the SEO techniques you mentioned.
Notably, around '08 or '09, these cycles stopped. From the outside, it looks like Google stopped trying to fight spam and just gave a bunch of huge sites a permanent boost in rankings instead, so at least some of your results might not be spammy garbage.
Also honest content moved into video and podcasting formats (that are harder to index) or inside walled gardens like Twitter or Facebook. It's like if the internet was a sea and Google was the ship used to navigate it, the water has dried up and now it's just craggy shoals we have to navigate around to get anywhere.
google.stanford.edu :)
https://altavista.com it runs on the latest top end Alpha processors.
http://altavista.digital.com

I honestly switched to google from alltheweb and altavista initially because of the easier name and muscle memory taking hold. To this day I believe Altavista was better when google first took off.