by thrwawy283 1553 days ago
You mentioned things I hadn't thought of. Google's Search accomplishes the goals of 10 years ago, but steps no further than that. It treats its power users like kids, and offers no complex filtering to do things like removing search results that require logins. Librarians love when you come to them to specifically refine your search. Google still has the most useful search, but they've taken away methods to get better results. I remember I was pretty upset when I couldn't search for images by exact dimensions anymore. Bing allows this.

Google's product direction has been inching backwards for a decade.

5 comments

>It treats its power users like kids

It's worse than that: Google's power user features used to work reliably and repeatably. Now Google tries even harder to figure out what you "want" and filters your results invisibly for you. You can't turn this feature off, and are unable to easily or obviously avoid it.

I've recently noticed that Youtube has a similar feature. If you search for a video, you'll only get a small number of results before Youtube will start showing you "recommendations" which are only somewhat related to your original search. Somewhat ironically, the only way to avoid this is to query via Google (site:youtube.com [term]) where you will get a much larger set of results.
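The site: workaround mentioned above can be written down as a small URL builder. This is only a sketch: the function name is made up for illustration, and it assumes the usual https://www.google.com/search?q=... query format.

```python
from urllib.parse import urlencode


def site_search_url(site, term):
    """Build a Google search URL restricted to one site via the site: operator.

    Hypothetical helper for illustration; the endpoint format is an assumption.
    """
    query = f"site:{site} {term}"
    # urlencode percent-encodes the colon and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("youtube.com", "baked potato"))
```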

It just seems that raw search is disappearing, and "recommendation engines" are appearing everywhere.

> Somewhat ironically, the only way to avoid this is to query via Google (site:youtube.com [term]) where you will get a much larger set of results.

Even more ironically, and many here will have experienced this themselves: the best way to find something on YouTube is to use Bing.

> the best way to find something on YouTube is to use Bing

Weirdly the same is the case for reddit.

What I really want, I guess, is a search engine where I can provide the sites to index, and when I am searching, I only search through those sites. That's it.
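A toy version of that idea, to make it concrete. This is a minimal sketch, not a real crawler: the class name, the whitespace tokenization, and the example URLs are all assumptions made for illustration. Pages are only indexed if their host is on the user's list, and queries only ever see that index.

```python
from collections import defaultdict
from urllib.parse import urlparse


class AllowlistSearch:
    """Toy index that only accepts pages from sites the user has chosen."""

    def __init__(self, allowed_sites):
        self.allowed_sites = {s.lower() for s in allowed_sites}
        self.index = defaultdict(set)  # word -> set of URLs containing it

    def _allowed(self, url):
        host = (urlparse(url).hostname or "").lower()
        # Accept the site itself and any subdomain of it.
        return any(host == s or host.endswith("." + s) for s in self.allowed_sites)

    def add_page(self, url, text):
        """Index a page; return False (and skip it) if its site is not allowed."""
        if not self._allowed(url):
            return False
        for word in text.lower().split():
            self.index[word].add(url)
        return True

    def search(self, query):
        """Return the URLs that contain every word of the query."""
        words = query.lower().split()
        if not words:
            return set()
        results = set(self.index.get(words[0], set()))
        for word in words[1:]:
            results &= self.index.get(word, set())
        return results


engine = AllowlistSearch(["reddit.com", "news.ycombinator.com"])
engine.add_page("https://old.reddit.com/r/rust/1", "rust async runtime")  # indexed
engine.add_page("https://spam.example/x", "rust async runtime")  # rejected
print(engine.search("rust async"))
```

Discovery of new sites is deliberately out of scope here; the list of sites is the whole point.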

What about discovery of sites you don’t know about? The whole purpose of the internet is that we’re all connected, but you seem to want only an extended private network.
A surprisingly large amount of the time, I use Google mainly to search on reddit, stackoverflow, hackernews, etc. Even searching for other sites is helped by those sites, i.e. searching for developer blogs gets far better results if I add reddit to the query.

I think Google has its place. However, I also think that an additional search engine like the one I described would be a very nice and useful tool. At least for me.

Google fails miserably over there too.

First page is full of results from a handful of sites.

> Now Google tries even harder to figure out what you "want" and filters your results invisibly for you.

Google trying to interpret what you want is, in my opinion, the largest reason that the search has become so bad. It guesses very poorly, and I end up having to try to guess what the magic incantation is to get it to give me what I'm searching for.

"Now Google tries even harder to figure out what you "want" and filters your results invisibly for you. You can't turn this feature off, and are unable to easily or obviously avoid it."

This. So much this.

And it is so...bloody...annoying.

> Google's Search accomplishes the goals of 10 years ago, but steps no further than that.

Google has removed features that it had 10 years ago.

I still don’t get why. It cannot be so difficult for them to keep things like literal search, can it? What is the incentive to remove it and replace it with a needlessly more complex almost literal but still fuzzy search?

I do suspect the main thing people complain about currently with Google is the abundance of ads and the algorithm that has encouraged stupid amounts of articles of a certain length. Recipe for baked potatoes is now 2000 words long.

> It cannot be so difficult for them to keep things like literal search, can it?

Greater scale = greater cost of keeping data hot in their search data-warehouses (esp. in light of contention over memory/caches.) Keeping around both a source-text string and its tsvector representation (or whatever Google's version of that is) is a "thing that doesn't scale" that they could provide at 1B queries/day, but probably not at 10B queries/day.

> the algorithm that has encouraged stupid amounts of articles of a certain length. Recipe for baked potatoes is now 2000 words long.

That's not the algorithm's fault per se; that's instead the fact that recipes can't be copyrighted, and so these sites can freely steal + repost one-another's recipes, and so you'll find the same recipe word-for-word on many sites, thus making an exact match in the recipe part not contribute highly to ranking any particular site. The 2000-word blog post, on the other hand, is actual Intellectual Property unique to the site posting it. So it only appears in the one place; and so when your query matches it, it ranks quite highly indeed.
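To illustrate the argument (not Google's actual ranking, which I don't know), here is a sketch using plain textbook IDF weighting as a stand-in: words that appear on every site contribute nothing, while words unique to one site dominate its score. The site names and texts are invented.

```python
import math
from collections import Counter


def uniqueness_scores(pages, query):
    """Score each page by summing the IDF weights of the query words it contains.

    IDF is a stand-in technique here, chosen only to illustrate the argument.
    """
    n = len(pages)
    tokenized = {site: set(text.lower().split()) for site, text in pages.items()}
    # Document frequency: how many pages contain each word.
    df = Counter(word for words in tokenized.values() for word in words)
    query_words = set(query.lower().split())
    return {
        site: sum(math.log(n / df[w]) for w in query_words if w in words)
        for site, words in tokenized.items()
    }


recipe = "bake potato at 200c for one hour"
pages = {
    "site-a": recipe + " grandma farmhouse story",  # copied recipe plus unique blog text
    "site-b": recipe,  # copied recipe only
    "site-c": recipe,  # copied recipe only
}
scores = uniqueness_scores(pages, "bake potato grandma")
print(scores)
```

With these inputs the shared recipe words score zero everywhere, so only the site with the unique text gets a nonzero score.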

> That's not the algorithm's fault per se;

Yes, it is. There are good recipe sites out there with authoritative, reliable content and fast loading times. Google says it prioritizes those things, I can identify sites that have them, and yet the algorithm doesn't favour them. That's the algorithm's fault no matter what memes about copyright law cause a proliferation of shitty websites.

What I'm saying is that the "recipe" part of a recipe website is a commodity – there is no "authoritative" source for a given recipe, unless that recipe is too niche in appeal to end up widely disseminated. This video (https://www.youtube.com/watch?v=SsNLzyqqINw) has a pretty good coverage of the topic.

Compare and contrast: phone-number directory listings. Who should Google cite as the authoritative source for lists of name-to-phone number associations? Nobody. All the lists are copying from each-other, curating and correcting the data taken from one-another, gathering their own original data for additions, and everything in between. Every portal overlaps every other portal, but mostly has the same stuff.

Compare and contrast, in the physical world: printings of public-domain literature. If Google indexed bookstores, which printing by which publisher would you want them to rank first on a search for e.g. Pride and Prejudice?

Try Kagi.com, you can rank domains however you want
What I really want is biased search results of my choosing.

$10 a month for a personal search is a bit much. $10 a month for work related search is cheap. Give me results specific to my industry without having a super long query.

(Neeva team member here) re: recipes. You might like the Neeva recipe search experience. You can see an entire recipe and reviews (without the ads or intro text) without navigating away from the search results page. Quick example here: https://neeva.com/search?q=baked+potato&src=nvobar
The last time this came up, Google demonstrated that it still worked. Most of the examples people tried to provide of it not working are actually just unexpected exact matches in the HTML that the standard user doesn't see, so they seem like false positives or "surprisingly good" results not based on the page content.
Approximate match allows you to sell approximate/related-match ads
Exactly. Even within the ad platform they keep pushing advertisers to target ‘broad’ keywords instead of ‘exact match’ ones.
Brilliant observation, never thought of it. Now the quality degradation kinda starts to make sense.
> What is the incentive to remove it and replace it with a needlessly more complex almost literal but still fuzzy search?

Control. They've moved from helping you find what you asked for, to trying to influence you into changing what you ask for to the thing that paid them the most.

Similarly, they're forcing creators to alter content to match their metrics or fall into obscurity.

Because somebody could have crafted a superior search using their refined search as an API, destroying the ad revenue.
They didn’t remove literal search. Put your literal in quotes.
This stopped working reliably some time before last year.
Eh, it's not so much that it stopped working as it is that it never worked the way you thought it did.

Quotes have ~always been an exact match on the tokenized query text, not a substring match on the corpus text. No synonyms, reordering, gaps, etc, but the matches -- and failures -- are sometimes not obvious at first blush.

If you search for "don't stop me now", for instance, that "don't" tokenizes to "don t", so it will match the tokenized strings "don't", "don t", "don-t", "don, t", etc ... but not "dont", because that's outside tokenization.

On the other hand, snippets mostly are substring matches of the query text, so if you see a result to a literal query that doesn't have a snippet, you know it's probably one of the weird matches.
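A minimal sketch of the distinction described above, assuming a very simple tokenizer (lowercase, split on anything that isn't a letter or digit). Google's real tokenizer is certainly more involved; the function names here are made up.

```python
import re


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(query, document):
    """True if the query's tokens appear contiguously, in order, in the document's tokens."""
    q, d = tokenize(query), tokenize(document)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


query = "don't stop me now"
for doc in ["Don't stop me now!", "don-t stop me now", "don, t stop me now", "dont stop me now"]:
    token_hit = phrase_match(query, doc)
    substring_hit = query in doc.lower()  # what a snippet-style literal match would need
    print(f"{doc!r}: token match={token_hit}, substring match={substring_hit}")
```

Under this rule "don-t" and "don, t" are token matches without being substring matches, which is the "weird match with no snippet" case, while "dont" matches neither way.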

This is just patently false in addition to being condescending.

If you use quotes around a phrase, it will reorder terms and make substitutions with synonyms in addition to straight up ignoring the quoted phrase no matter how many times you add +. If you then fiddle with settings (randomly not available depending on star alignment and device) to change it to 'verbatim' it will still reorder and split up tokens in the phrase.

Why is it that exact searches used to actually work reliably, then? What's changed?
Honestly, Yandex has really good image search. I now do all three (Google, Bing, and Yandex) when I'm researching for a design.
Going backwards and standing still often look the same. I think it's both in this case, to varying degrees. Google is competing against itself and against obvious competitors, but also against refinements that seem obvious whether or not anyone has actually implemented them. The no-login filter is a refinement that would be useful to many.

There are so many obvious improvements that could be made to gmail but there's no real way to do them as a consumer.

Google's priorities have long since shifted from products to money. They are now deliberately doing evil things for more money.

It is like taking a small dose of drugs for fun: for sure it won't kill you immediately, but it eventually will, as people can hardly resist the temptation. Google has become such a monster that, even though we know it's dying, its momentum will keep it going for a very long time. And if it can correct its trajectory just a little bit, the collapse will take even longer.

> It is like taking a small dose of drugs for fun: for sure it won't kill you immediately, but it eventually will, as people can hardly resist the temptation.

Very debatable.

No fun allowed. Or the implication is taking drugs for anything but recreation won't kill you. I need more guidance!
Taking drugs for fun won’t necessarily make you an addict or kill you.
Hmmm, only if you know when to stop and can actually stop; then sure, it won't kill you. But what if you cannot stop? Most importantly, I did not even mention which drug, as that was never the point.
I agree with you, but I think the problem is that the majority of internet users are casual users who don't care about complex queries. Maybe a good business idea would be to make an internet search engine that enables complex queries like the ones you described. Google was made for the masses, not for power users.