by zwaps
1553 days ago
If my model is used to profile a given user so as to maximize revenue from them (my objective generally increases with a more accurate classification of the user, to the degree that such categories are revenue relevant), does this model still work? If so, how is it privacy compliant, i.e. how does it satisfy the intent of the law in, say, EU countries, or avoid being identified as "privacy theater" in the US? If not, what do you do in these cases? Would be cool to get your take on this.
The way differential privacy works with machine learning is that it guarantees that any single record cannot have a significant impact on the weights of the model, and therefore on its performance. In the particular case of SGD-based models, the guarantee holds for every step of the descent. A good place to start on the topic is Abadi et al. 2016 (https://arxiv.org/pdf/1607.00133.pdf).
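To make the per-step guarantee a bit more concrete, here is a minimal sketch of one noisy SGD step in the style of the Abadi et al. paper: clip each record's gradient, sum, add Gaussian noise, then average. This is not Sarus's actual implementation. The function names, the squared-error loss, and the parameter values are all assumptions chosen for illustration, and it does no privacy accounting.

```python
import math
import random

def clip(grad, max_norm):
    """Scale grad down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def per_example_grad(w, x, y):
    """Gradient of the squared error 0.5 * (w.x - y)^2 for one record (illustrative loss)."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def dp_sgd_step(w, batch, lr, max_norm, noise_multiplier, rng):
    """One step: clip each record's gradient, sum, add Gaussian noise, average, update."""
    total = [0.0] * len(w)
    for x, y in batch:
        g = clip(per_example_grad(w, x, y), max_norm)
        total = [t + gi for t, gi in zip(total, g)]
    # Noise is scaled to the clipping bound, i.e. to the most one record can contribute.
    sigma = noise_multiplier * max_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [wi - lr * n / len(batch) for wi, n in zip(w, noisy)]

# Hypothetical usage: the third record is an extreme outlier, but after
# clipping it contributes no more to the update than any other record.
rng = random.Random(0)
batch = [([1.0, 2.0], 3.0), ([0.5, -1.0], 0.0), ([100.0, 100.0], 1e6)]
w = dp_sgd_step([0.0, 0.0], batch, lr=0.1, max_norm=1.0,
                noise_multiplier=1.1, rng=rng)
```

Note that nothing here inspects what the loss is trying to do: the clipping and the noise are applied the same way whatever the gradient encodes.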
What is important in the approach is that we don't need to detect that there is something funny in the loss function of the model. Sarus uses the exact same approach whether the model or the loss function is malevolent or not. The guarantees still hold. This is important because a lot of models can extract personal information even with no intention of doing so and no real way to detect it.
A good way to think about model performance is that we are looking for models that perform well irrespective of any one record. If there are many users that share the same pattern as the user you are trying to spy on, the model may still be good, but you won't know whether it's because of that user or not.
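For reference, "irrespective of one record" is what the standard definition of differential privacy formalizes. A randomized training procedure $M$ is $(\varepsilon, \delta)$-differentially private if, for any two datasets $D$ and $D'$ that differ in a single record, and any set $S$ of possible outputs (here, trained models):

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```

The symbols $M$, $D$, $D'$ and $S$ are just the usual textbook notation, not something from the thread above. Smaller $\varepsilon$ and $\delta$ mean the trained model looks nearly the same whether or not the targeted user's record was in the data.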