| Is the two-billion-page index open source? I've been thinking a lot about data recently. Seems to me like Pandora's box is open. Google knows where you live, where you eat, what your fetishes are, all of your sexual partners. Facebook knows most of those things too, via different methods. And if you run Windows, Microsoft probably has access to most of that as well. Apple will too, because if they don't, they won't be able to compete. Tesla, Uber, and Waze also have a huge amount of data on your life. Everyone is pushing the envelope on how much data they collect, and the companies that collect more data will compete better. As tech gets better, we will increasingly be unable to resist sharing our whole lives with the companies that power modern living. Even worse, there's a huge monopolization effect to having data. Nobody else has anywhere near as much data as Google. That means nobody else can compete. Never mind the engineering: your algorithms can be 2x as good, but you won't have 0.1% of the data that a company with billions of daily users has. So Google and Facebook are left untouchable. Microsoft, Apple, and maybe Amazon can get in range. Is there anyone else? We can fight back by giving up the privacy war and blowing the doors open instead. Take your data (as much as you dare) and make it public. Let every startup have access to it. Let every doctor have access to it. Give the small players a fighting chance. That does mean a massive cultural shift. It means your neighbors will be able to look up your salary, your fetishes, your personal affairs. It's a big deal. I don't see any other way out of this, though. Surveillance technology is getting better faster than privacy technology, because surveillance tech has the entire tech industry behind it. Smarter phones, smarter TVs, smarter grocery stores, smarter credit cards, smarter shoes... smarter everything. Privacy is melting away and we aren't getting it back. |
A free search API will probably be fully available next week. It's already in testing; it's just a matter of putting the finishing touches on the documentation.
And the crawl and index data will be available for download in a few weeks. That, too, is just a matter of documenting the data format.
BTW: I disagree with your points about privacy. I see DeuSu as a way of fighting back.