I was referring, in a convoluted way, to the fact that all the data collected have been/will be used to train the models that will allow offline voice recognition (like the one Google showed at I/O last week).
I might be mistaken, but the reason we don't see offline recognition (amongst other things) is hardware limitations, not the lack of training data. The small onboard chip doesn't have that much compute power, so they offload to more powerful Amazon/Google servers that can run the inference.
I think that this is an important point. Obviously there's more computing power available in Apple/Google/whoever's data centres than on my device, and I'm sure that is, or at least was, a concern; but I also don't believe that they are indifferent to the utility of sitting on such a huge volume of user-submitted, real-world data.