by LeoPanthera 3356 days ago
This article doesn't seem to say, so, how many of these search engines use their own index and don't "borrow" one from another larger engine? DDG uses Bing and Yandex, for example.

I'm willing to bet no one* does their own search crawling aside from companies the size of Yandex and larger. Google carefully manipulated web standards to make sure you can't do that effectively without tons of upfront investment. You pretty much have to run a customized headless browser to get real content. And then you have to figure out how to interact with whatever you get, since an increasing number of websites are SPAs. Google itself has it easy, since developers actively modify their sites to fit Google's capabilities.
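To make the SPA point concrete, here is a minimal sketch of what a crawler that does not execute JavaScript would extract from two pages. Both HTML strings and the `visible_text` helper are made up for illustration; a real crawler would fetch pages over the network and handle far more edge cases.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects the text a crawler sees when it does not run JavaScript."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Hypothetical helper: return the indexable text of a static HTML string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical pages, invented for this example.
SERVER_RENDERED = "<html><body><h1>Pricing</h1><p>Plans start at $5.</p></body></html>"
SPA_SHELL = '<html><body><div id="root"></div><script>render("#root")</script></body></html>'

print(repr(visible_text(SERVER_RENDERED)))  # 'Pricing Plans start at $5.'
print(repr(visible_text(SPA_SHELL)))        # ''
```

The SPA shell yields nothing to index until its script runs, which is why a crawler would need something like a headless browser to see the actual content.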

But hey, everything is "fine" as long as the Web keeps a bunch of developers employed with six-digit salaries. They will put up with any amount of accidental complexity and ignore any effects on future innovation as long as their jobs are secure. (And those jobs are more secure than ever because you need an ever-increasing number of specialized professionals to keep the increasingly complex technology stacks operational.)

--

* One exception I know of: Web Archive. But their coverage is pretty spotty and they aren't, strictly speaking, a search engine. Still, it's an awesome effort. At least someone tries to swim against the tide.

My search engine crawls its own results. The downside is that the index is very tiny, under 100,000 pages.