Jeff was very early on the "just scale up the big brain" idea, perhaps as early as 2012 (Andrew Ng training networks on 1000s of CPUs). This vision is sort of summarized in https://blog.google/technology/ai/introducing-pathways-next-... and fleshed out more in https://arxiv.org/abs/2203.12533, but he had been promoting this idea internally since before 2016. When I joined Brain in 2016, I thought the idea of training billion/trillion-parameter sparsely gated mixtures of experts was a huge waste of resources, and that the idea was incredibly naive. But it turns out he was right, and it would take ~6 more years before that was abundantly obvious to the rest of the research community.

Here's his scholar page (H index of 94): https://scholar.google.com/citations?hl=en&user=NMS69lQAAAAJ...

As a leader, he also managed the development of TensorFlow and TPU. Consider the context / time frame - the year is 2014/2015 and a lot of academics still don't believe deep learning works. Jeff pivots a >100-person org to go all-in on deep learning, invest in an upgraded version of Theano (TF) and then give it away to the community for free, and develop Google's own training chip to compete with Nvidia. These are highly non-obvious ideas that show much more spine & vision than most tech leaders. Not to mention he designed & coded large parts of TF himself! And before that, he was doing systems engineering on non-ML stuff. It's rare for a very senior-level engineer to pivot to a completely new field and then do what he did.

Jeff has certainly made mistakes as a leader (failing to translate Google Brain's numerous fundamental breakthroughs into more ambitious AI products, and failing to consolidate the redundant big-model efforts in Google Research), but I would consider his high-level directional bets to be incredibly prescient.
I wonder if you know any of the history of exactly how TF's predecessor DistBelief came into being, given that this was during Andrew Ng's time at Google - whose idea was it?
The Pathways architecture is very interesting... what is the current status of this project? Is it still going to be a focus after the reorg, or is it too early to tell?