by DarkStar851 2564 days ago
I don't personally see the ethics issue if they were using publicly available pictures. That would be like someone scraping public FB profiles.

That person posted the image/information willingly knowing they lose all control over how it is used and who it is seen by.

It could definitely introduce bias, though, if your input set isn't filtered for some kind of diversity.

2 comments

> That person posted the image/information willingly knowing they lose all control over how it is used and who it is seen by.

This seems like a dangerous precedent. There have been cases where images of recognisable people that were made available with some liberal licence were then used as part of marketing for deeply offensive campaigns or illegal activities, for example.

I don't think it's reasonable to say that anyone who volunteered to let others use their image for general purposes should automatically accept the kind of portrayal that would result in a defamation lawsuit in other contexts. You can call them naive for not anticipating nasty people doing that with their image, but naivety isn't a crime. Meanwhile, being portrayed deliberately and without warning as a child abuser or a supporter of a highly unpopular politician or a drug addict or a terrorism suspect could have profound and immediate consequences for the subject, who obviously didn't intend to consent to that and may have no idea it has been done until the reality catches up to them.

Microsoft has no more power over misrepresentation of the images than Google Image Search does. At the very least, their dataset here didn't include names, locations, etc. I don't really see how this is any different from Google using their own data in projects like DeepMind. At least Microsoft admitted the project didn't go as planned, and they're shuttering it and cleaning up their data.

To narrow this further to the science-fiction ethical issue that deepfakes and facial recognition are both forcing to the foreground, try this on for size:

“All humanity has the inalienable right to control how their likeness is transformed by others. Consent must be given freely by either the human or their delegated representative, and no discrimination against refusal to permit transformation, whether by default or by declaration, shall be permissible under law.”

I’m not asking if they gave up copyright on their photos. They did. I’m asking, for example, if it’s ethically appropriate for Microsoft to publish annotated public domain photos without requiring a human ethical review for each use of their dataset. If I wanted to perform a sociological study on that dataset, I’d have to get a review board’s approval. Why is performing a statistical study (literally, machine learning) somehow exempt from that ethical concern?