Likely to avoid their mistake with MapReduce, where by around 2011 candidates were coming in to interviews and saying "MapReduce? That's sorta like Hadoop, right?"
There's value in controlling mindshare; keep everything proprietary too long, and people just use open-source clones that may be inferior but can actually be used by the majority of the talent pool.
EMR beat Google Cloud MapReduce to market, but you're forgetting that before there was such a thing as cloud services, we relied on open-source frameworks and set up our own clusters. EMR is based on an open-source framework called Hadoop, which itself was built on a closed-source Google framework called MapReduce that Google released a paper about. MapReduce came out in 2003, Hadoop in 2006, Amazon EMR in 2009, and Cloud MapReduce in 2015.
...which is sorta my point. People remember the version of the technology that makes it accessible to them, not the first one that comes out. When Google keeps things proprietary forever and only releases academic papers, people quickly forget just how far ahead they were.
That's all true, but what may have mattered more to Google was the missed business opportunity of being first to market with a relatively easy distributed computing paradigm.
That's exactly backwards - the MapReduce paper was intentionally released as vaporware to make the rest of the industry spin its gears trying to replicate an imaginary result. And that's why we have Hadoop.
You realize you're arguing with an ex-Googler who has worked on production MapReduces that were first written around 2005 and has read the initial MapReduce commit?
I thought the MR paper described an actual working implementation. It had performance test results, descriptions of issues they encountered and solved, and some sample source code of how MR is used. It seems like a lot of effort was put in for it to be a hoax.
I imagine part of it is that businesses built on TensorFlow play nice with Google Cloud and their TPUs, but mostly I suspect it's just a mindshare thing. If Google becomes the place that all the top data scientists want to work – such that they don't even have to be poached – that's a Very Good Thing for them. It probably doesn't hurt if those data scientists come in already familiar with a tool Google uses internally.
Kind of reminds me of the genius move by Tesla to crowdsource collection of self-driving car information. Experts want to get where they have the data to train their models, and if Tesla propels itself ahead of the pack for number of miles of real-world training data, then that makes them very attractive to talent.
If all machine learning experts use TensorFlow, all the machine learning chips coming out will be highly optimized for TensorFlow. Higher competition among TensorFlow chips = better acquisition prices for Google. They also don't have to go around convincing chip makers to support TensorFlow (like they did, for instance, with the VP8/VP9 codec).