by koutetsu
810 days ago
Let me quote from the article:

> Lavender learns to identify characteristics of known Hamas and PIJ operatives, whose information was fed to the machine as training data, and then to locate these same characteristics — also called “features” — among the general population, the sources explained. An individual found to have several different incriminating features will reach a high rating, and thus automatically becomes a potential target for assassination.

It says outright that they use data from known Hamas members (we don't know what this data contains) as training data, which is a recipe for biased predictions. Hamas members are a small minority in Gaza (the total population is over 2 million people), so the real data is heavily imbalanced[0], and unless that is addressed it leads to bad models.

On top of that, if you know anything about machine learning, you should be aware that models can find spurious correlations[1] in the data that make their predictions accurate on the available training and validation data, but much less so once deployed and used with real data.

[0] https://developers.google.com/machine-learning/data-prep/con...

[1] https://thegradient.pub/shortcuts-neural-networks-love-to-ch...
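To put rough numbers on the imbalance point, here is a minimal sketch of the base-rate arithmetic. Every figure in it is hypothetical and chosen only for illustration; none of them comes from the article or describes any real system. The function name `precision` and its parameters are my own.

```python
def precision(prevalence, sensitivity, specificity):
    """Fraction of flagged individuals who truly belong to the positive class.

    prevalence:  share of the population in the positive class (assumed)
    sensitivity: true positive rate of the classifier (assumed)
    specificity: true negative rate of the classifier (assumed)
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical: 1% positive class, classifier right 90% of the time on both classes.
print(f"rare class:     {precision(0.01, 0.90, 0.90):.3f}")  # 0.083
# Same classifier on a balanced population, for comparison.
print(f"balanced class: {precision(0.50, 0.90, 0.90):.3f}")  # 0.900
```

With these made-up numbers, a classifier that looks 90% accurate on a balanced validation set is wrong about more than nine out of ten of the people it flags once the positive class is only 1% of the population.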
If the features are things like “wears a scarf” or “has a beard”, then I agree unintended bias is likely a problem. But given that we don’t know, how can we comment?