by reality_czech 2902 days ago
Yahoo hasn't had its own search engine for years. In 2010, they became essentially a frontend for Bing. In a later 2015 deal they switched the backend to using Google.

DuckDuckGo is a metasearch engine, technically, but mostly it delegates to Bing.

As far as I can tell, there are only two and a half real search engines that still exist: Bing, Google, and Wolfram Alpha. (I count Alpha as a half because it's not really what most people are looking for.) I'm curious if anyone else knows of other real search engines still in existence.


Yahoo still uses Bing for the majority of their searches. https://arstechnica.com/information-technology/2015/04/micro...
Check the date on that again; it doesn't contradict the parent comment.
https://searchengineland.com/yahoo-google-search-deal-233963

Unless something has changed, it seems Bing still gets at least 51% of traffic.

I just tried HotBot and it seems to be better for programming-related questions than DuckDuckGo.

I wonder if they use mostly Google for the backend.

Baidu, Yandex, Naver, probably quite a few others.
What percentage of traffic comes from Bing though?
Whoa, whoa, whoa, DuckDuckGo delegates to Bing? If many/most searches are unique, that means they can't live up to their privacy statements.
If they hide the user's IP, HTTP headers, etc. and proxy searches to Bing through their own servers, it would be anonymous.
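To make the proxying idea concrete, here is a minimal sketch of what "hide the IP and headers" could look like. It is not DDG's actual implementation (which isn't public in this thread); `anonymize_request`, the header list, and the generic User-Agent string are all assumptions made up for illustration.

```python
# Headers that can identify or fingerprint the end user.
# Illustrative list only, not an exhaustive or official one.
IDENTIFYING_HEADERS = {
    "cookie", "authorization", "user-agent", "referer",
    "x-forwarded-for", "forwarded", "x-real-ip", "accept-language", "via",
}


def anonymize_request(query, client_ip, headers):
    """Build the request a proxy would send upstream for a user's search.

    `client_ip` is accepted only to show that it is deliberately dropped:
    the upstream engine would see the proxy's own address instead.
    """
    safe_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    # One fixed, generic User-Agent so every proxied user looks the same.
    safe_headers["User-Agent"] = "generic-proxy/1.0"
    return {"params": {"q": query}, "headers": safe_headers}
```

Note that the query text itself is passed through unchanged; only the metadata around it is stripped.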
Unless your query leaks private information.

Or a series of unique queries could leak private information.

Bing would be unable to associate series of queries with users.

As long as DDG are doing it properly (and I believe they are), Bing would only learn that the contents of each individual query are associated together; they would learn nothing about which other queries were performed by the same user.

I think the concern isn't necessarily that Bing would associate query X with person Y. The concern is that Bing would even know that query X exists. For example, if Bing saw a spike in searches for "Aramco IPO July 4, 2018" and were to reveal it to a human or store it, that might be a serious leak of non-public information. Many searches reveal private information, even when they aren't associated with a user.
> if Bing saw a spike in searches for "Aramco IPO July 4, 2018" and were to reveal it to a human or store it, that might be a serious leak of non-public information

Maybe I'm missing something obvious here, but how is that any different from Google or DuckDuckGo seeing the same spike?

There's no practical technology that would get around that problem.

Homomorphic encryption might do the trick (?), but it's too slow at the moment.

That has little to do with privacy, though, and I believe that's the only thing they promise?
I think that’s pretty far-fetched. Such a spike would most likely be the product of an already very well known rumour.
IIRC DDG delegates the crawling to Bing, but does the actual searching itself.
I really doubt it. They never tell what happens in the background. Probably because it would spoil their "magic".
How would you do that with the Bing API? Really don't think that's true.
Another half (read: limited-scope) search engine that springs to mind is Shodan.
Did Yahoo ever have its own search engine, technically? In the early web it was a directory maintained by humans, which made sense at a time when the total number of pages in existence on any given subject was no more than a few hundred; I thought that when that era passed they went straight into licensing other search engines' results.
I worked at Google on search indexing at the time Yahoo switched from their own search engine to using Bing. At the time, by most of Google's own search metrics, Yahoo had a product superior to Bing. If Bing had been spun off as a separate company, or otherwise hadn't had access to Microsoft's deep pockets and default IE search status, it's likely Yahoo would have fared better.
I was at Yahoo during that time, although not in web search. From what I could tell, company leadership was frustrated with lack of growth in search market share, and didn't want to invest in it anymore.

Yahoo was running user studies where they would put Google results and Yahoo results side by side but switch the branding; while Yahoo results were ranked better than Google for most of the tested queries, results with Google branding ranked better than with Yahoo branding, regardless of whose results they were.

The plan was to just use Google, but the DOJ (or FTC?) put out guidance that that would be anti-competitive, so Bing was it. This might have worked out anyway, but as far as I saw, the expected cost savings from outsourcing search never actually happened; then again, I left in late 2011 and stopped following closely after that. Web search was also linked with search ads, which Bing did poorly at too.

Google also ran similar user studies, sometimes between Google and other search engines, and sometimes between production Google and a proposed change.

One tough thing is there isn't one search quality metric. It's one thing to have the search results page look good with its snippets, and another to have people actually look at the linked pages and compare their usefulness.

Common vs. uncommon searches are also important. It's not difficult to write a search engine that badly over-fits on the most common searches. However, for market share, it's important to do well enough on the common searches that users don't leave, and do well enough on tough long-tail searches that you pick up users that leave other search engines on tough queries. The idea is to be pretty good at the common searches, but the best at the kinds of searches that cause people to try other search engines. Naive frequency-weighted metrics will get this totally wrong.
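A rough sketch of why a naive frequency-weighted metric can mislead, assuming each query has a frequency and a quality score in [0, 1]. The function names, the `head_threshold` parameter, and all numbers in the usage note are made up for illustration; this is not any search engine's real evaluation code.

```python
def frequency_weighted_score(queries):
    """Naive metric: average quality weighted by how often each query is issued."""
    total = sum(q["freq"] for q in queries)
    return sum(q["freq"] * q["quality"] for q in queries) / total


def _mean(values):
    """Plain average; None when the bucket is empty."""
    return sum(values) / len(values) if values else None


def bucketed_scores(queries, head_threshold):
    """Report common ("head") and long-tail quality separately.

    Each query counts once within its bucket, so a handful of very
    frequent queries cannot drown out the long tail.
    """
    head = [q["quality"] for q in queries if q["freq"] >= head_threshold]
    tail = [q["quality"] for q in queries if q["freq"] < head_threshold]
    return {"head": _mean(head), "tail": _mean(tail)}
```

With invented data, say two head queries at frequency 1000 and four tail queries at frequency 5, an engine scoring 0.95 on head and 0.30 on tail beats one scoring 0.90 and 0.80 on the frequency-weighted number, even though the second engine is the one that keeps users from leaving on tough queries. The bucketed view makes that gap visible.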

It's also more important to get useful information in the first 2 or 3 links. If Google links to the second-best link at result #1 and puts the best link off the first results page, but Yahoo puts the best link down at #7 and second-best at #8, the user may lose interest before following a really good link.

I don't think Google took the union of front-page search results between two competitors and asked humans to hand-order the (up to 40) pages for how well they fit the query. But that seems like a good way to test the actual usefulness of search results. You'd probably especially want to keep track of the percentage of the top 3 search results that were filled by top-5 (guessing at 5) useful links.
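A minimal sketch of that hypothetical test, assuming results are lists of page identifiers and a human has already ordered the pooled pages from most to least useful. `pooled_candidates`, `top_slot_hit_rate`, and the default cutoffs (3 slots, top 5 useful) are illustrative names and guesses, not anything Google or Yahoo is known to have used.

```python
def pooled_candidates(*result_lists):
    """Union of several engines' front-page results, de-duplicated,
    in first-seen order. This is the pool a human rater would hand-order."""
    return list(dict.fromkeys(url for results in result_lists for url in results))


def top_slot_hit_rate(engine_results, human_ranking, slots=3, useful_cutoff=5):
    """Fraction of an engine's first `slots` results that are among the
    `useful_cutoff` most useful pages in the human ordering."""
    best = set(human_ranking[:useful_cutoff])
    top = engine_results[:slots]
    if not top:
        return 0.0
    return sum(1 for url in top if url in best) / len(top)
```

For one query, you would pool both engines' front pages, have raters order the pool, then compute `top_slot_hit_rate` for each engine against that ordering and average over queries.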

Anyway, inside Google it was well-known that Yahoo was the competitor to worry about in terms of search quality.

I would note it is possible (and even likely?) that each search engine performed better for its own traffic-weighted query stream.
Nope, they had a spider-crawler and a full engine along with the human-curated Directory.
The sequence was: Yahoo Directory -> Yahoo showing Google results -> Yahoo builds their own search engine -> Yahoo uses Bing
Yahoo search was provided by Altavista for some time as well.
Yes, before they used Google. It's a pretty interesting story, actually, how Yahoo felt that they should use the best underlying search engine with a "white label" approach, and how Google succeeded in eventually building a very strong brand despite being invisible.
Didn’t they use Inktomi for a while?
Yes, and then they bought Inktomi.
Baidu