I'd assume it's for machine learning development, specifically as training data and for backtesting. You can build a new model with 3x the data you had before, or retrospectively see how a model would have performed over 3 years rather than 1.
Don't forget that they're also entirely complicit in NSA demands for live access to data, per the somewhat-recent leaks. That's another level of evil above regular profit motivations.
Even if they weren't (mainly) an advertising company, even if they charged for all the free things, they'd still need all the data they suck in to provide the services they provide.
That's simply not true. They collect a ton of data that they wouldn't need to collect if they didn't run advertisements, or that they could at least delete much earlier.
There's even crap like Android not allowing you to selectively turn off the Internet Permission for apps, for which there is no good reason other than Google needing an internet connection to display their ads.
It's not true that machine learning and AI need a lot of data to do what they do in Google's services? I guess you know something their engineers don't, so I'm looking forward to Sylos' Dataless AI & Co. I bet a lot of us would even pay good money for such a thing.
Sigh, I'm not arguing that they don't collect data for advertising; I'm not even defending them at all. It's just that the things they do are impossible to do without a huge amount of data, regardless of advertising. Again, if you're so sure that it is “simply not true”, here we are, YC is your oyster. Or any other tech VC fund, for that matter.