| It is my position that search can never be decoupled from the browser, and when I say "search" in the statement you are referring to, I mean Google, as it is peerless for English-language search. Search is in fact massively expanding as tooling and machine learning capabilities improve due to research and hardware. Similarly, Google, Apache and Elastic have many open source libraries for search, indexing, storage, caching and serving which allow for scalable architecture. Also, beyond things like crawlers, Hadoop, Solr, etc., Microsoft and Google have open-sourced their JS parsing engines, and Node, Electron, the Brave browser and node-webkit are built on technology that leverages this. So, as someone who is not an information architect or data scientist, it seems like we have an ecosystem where a scaled down version of google can be built and trained on the per user basis and completely private. I have hashed out the solution in more detail elsewhere, but on to the actual question: is search deteriorating?

* My results seem to be worse and I have much less control than before. Anecdotally, it seems as if quotes and boolean operators are respected less.
* Discovery is a huge issue that Google solved well; now we have the opposite conditions but the same problem. There used to be very few sites and it was hard to know what content was on them. Now there is too much content.
* Without fine-grained control over my search I can't make a distinction between information and links, that is, between needing a date or a well-accepted piece of content/documentation and finding some new apps or non-facts. DuckDuckGo is quite good for some things and Google is good for others. Sometimes you may want to eliminate all WordPress sites (many content mills are built on it) or remove Alexa links from your queries if you need to discover something.
* Time is handled badly. E.g. I have a problem with a JavaScript function and get back results from 3 years ago. This is amazing and difficult to do, so commendable, but I need newer info because things change quickly. The same goes for news.
* I need to eliminate sites and content I don't want. NOT something like a content filter for porn or whatever, but something like:
never return results from %news-websites older than 30 days.
never return content posted %before nov-2014
remove links from [%Alexa-1000, %Wordpress, #TLD(.co,.co.uk)] for reputation ranking
decrease links from [%Alexa-1000, %Wordpress, #TLD(.co,.co.uk)] by [80%] for reputation rankings
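
To make the rules above a bit more concrete, here is a minimal sketch of how they could be applied on the user's side to a result list that some engine has already returned. Everything in it is an assumption for illustration: the `Result` type, the `apply_rules` function, the example host names standing in for `%news-websites` and `%Wordpress`/`%Alexa-1000`, and the reading of "decrease by 80%" as multiplying the score by 0.2.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    published: date
    score: float  # relevance score from whatever engine produced the result


# Hypothetical stand-ins for the %news-websites, %Alexa-1000 / %Wordpress
# and #TLD(...) groups used in the rules.
NEWS_SITES = {"news.example.com"}
DOWNRANKED_SITES = {"blog.example.org"}
DOWNRANKED_TLDS = (".co", ".co.uk")


def host_of(url):
    """Return the lower-cased host name of a URL ('' if it has none)."""
    return (urlparse(url).hostname or "").lower()


def apply_rules(results, today, cutoff=date(2014, 11, 1),
                news_max_age_days=30, penalty=0.8):
    """Drop results excluded by the hard rules, down-rank the rest, re-sort."""
    kept = []
    for r in results:
        host = host_of(r.url)
        # "never return content posted %before nov-2014"
        if r.published < cutoff:
            continue
        # "never return results from %news-websites older than 30 days"
        if host in NEWS_SITES and today - r.published > timedelta(days=news_max_age_days):
            continue
        score = r.score
        # "decrease links from [...] by [80%] for reputation rankings"
        if host in DOWNRANKED_SITES or host.endswith(DOWNRANKED_TLDS):
            score *= 1 - penalty
        kept.append(Result(r.url, r.published, score))
    return sorted(kept, key=lambda r: r.score, reverse=True)
```

The "remove links" rule would be the same check with a `continue` instead of a score penalty. The hard part is not this filtering step but getting a crawl and reliable publish dates to run it on.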
There are other things, but so far my point has been:
* Google does not provide versatile results.
* Many pieces of well-tested software would make it easy (for the right group of software engineers) to silo crawl data and parse it with a user's own parameters.

There is a way to set up this ecosystem that I have been thinking about, but to conclude: Google is fucking awesome and really, really good at what they do. Search experience is getting worse in terms of control but tooling is leagues better. Google sees this and is working on loftier goals internally (I imagine), thus it has split up into a meta-company that will work as an accelerator for growth while capitalizing on some verticals, like the real estate thing they are doing or the delivery service they just announced, to stay profitable in the short term before they can achieve their end goal. Also, advertising is an unsustainable paradigm for internet growth for many reasons.

Notes:
* The DOM is super fucking horrible.
* The parsing engine is a great fix for a fucking horrid DOM.
* DNS security is fucking horrible.
* The next Google will be a browser & an optimization marketplace.
* I don't think compiling to WebAssembly makes sense, but I could be totally wrong.
* I think something like Docker would provide a sandbox that would let people get performance and versatility, sidestep the entire DOM and only run JS. We need the apps vs. content distinction. No idea how this works on mobile though. |
> So, as someone who is not an information architect or data scientist, it seems like we have an ecosystem where a scaled down version of google can be built and trained on the per user basis and completely private.
I'm skeptical. Search is a huge problem just because of the bizarre amount of resources you need to throw at it. I can't afford to build my own datacenter(s) to host my custom search system. There might be huge advances ahead in terms of storage capacity on commodity systems, I don't know, but in any case, I'm only one person crawling webpages versus millions of people (and bots!) creating them.
You implicitly address that a bit later by talking about "silo crawling", but again, I'm skeptical. The only silo structure that I can easily see is large sites with useful content like Wikipedia or StackOverflow/StackExchange, but I'm likely to come across these anyway in any given domain, and I can easily filter for these on Google today, e.g. "site:en.wikipedia.org". The more interesting and harder part is the long tail of small, sparsely interconnected websites which might contain unusual insights but which I'm unlikely to come across with a silo crawler (or with Google's current UI, for that matter).
> Search experience is getting worse in terms of control
I guess that's the classic problem of scaling a product to a large audience of mostly technically illiterate users. Maybe Google is learning from Apple, whose UIs have long favored ease of use over giving control to the user.