
Just because you shuffle the examples on a single phone/user doesn't make it stochastic.

The entire point of stochasticity (i.e., random shuffling) is to keep a run of similar or same-ordered examples from pushing the hill climbing in a globally non-optimal direction all at once.

A single user's examples will be very similar, so you can shuffle the examples from one user all you want; that doesn't make it truly stochastic in the context of gradient descent optimization.
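To make that concrete, here's a toy sketch, not anything from the actual system. Everything in it is made up for illustration: a 1-D model `w`, a squared loss `(w - x)**2`, and two hypothetical users whose examples cluster around -1 and +1. Shuffling user A's examples changes their order but not the direction they pull in.

```python
import random

def grad(w, x):
    # Gradient of the per-example loss (w - x)**2 with respect to w.
    return 2.0 * (w - x)

def mean_gradient(w, examples):
    # Average gradient over a set of examples (order does not matter).
    return sum(grad(w, x) for x in examples) / len(examples)

def sgd_pass(w, examples, lr=0.25):
    # One sequential pass of plain SGD over the examples, in the given order.
    for x in examples:
        w -= lr * grad(w, x)
    return w

# Hypothetical data: each user's examples are similar to one another.
user_a = [-1.2, -1.0, -0.8, -1.1, -0.9]
user_b = [0.8, 1.0, 1.2, 0.9, 1.1]

rng = random.Random(0)
shuffled_a = user_a[:]
rng.shuffle(shuffled_a)

g_a = mean_gradient(0.0, user_a)             # user A only, original order
g_a_shuffled = mean_gradient(0.0, shuffled_a)  # user A only, shuffled
g_all = mean_gradient(0.0, user_a + user_b)  # examples drawn from both users
w_after_a = sgd_pass(0.0, shuffled_a)        # where a pass over user A ends up

print(g_a, g_a_shuffled, g_all, w_after_a)
```

At `w = 0`, which is the optimum for the pooled data, the pooled gradient is about zero, but user A's gradient is the same large value whether or not you shuffle, and a pass over user A alone drags `w` well toward -1.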

The quantization / compression part is pretty cool though. I suppose that could slightly obfuscate what the original example was, for privacy purposes? Seems like you'd lose some accuracy though.
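For a rough feel of the accuracy cost, here's a generic uniform quantizer. I'm assuming this scheme for illustration; I don't know what the actual system does, and `quantize`, `dequantize`, and the sample `update` values are all hypothetical.

```python
def quantize(values, bits):
    # Map each value to one of 2**bits evenly spaced levels between min and max.
    levels = 2 ** bits
    lo, hi = min(values), max(values)
    # Guard against a zero step when all values are identical.
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step

def dequantize(codes, lo, step):
    # Reconstruct approximate values from the integer codes.
    return [lo + c * step for c in codes]

# Hypothetical model update, squeezed down to 2 bits per value.
update = [-0.73, 0.12, 0.58, -0.05, 0.91]
codes, lo, step = quantize(update, bits=2)
restored = dequantize(codes, lo, step)
errors = [abs(u - r) for u, r in zip(update, restored)]

print(codes, restored, max(errors))
```

Each restored value can be off by up to half a step, so fewer bits means a coarser update. That is the accuracy trade-off, and whatever obfuscation you get is only that rounding.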