Hacker News
by anigbrowl 3360 days ago
Why would we want to reproduce existing structures of oppression in mechanical form? Have you noticed how automation often vastly amplifies things? It's a short step from saying 'this model accurately reflects the bias in society' to 'that's how things are, the computer says women aren't cut out to be doctors.' Surely you are aware that in the real world people rationalize decisions they don't actually understand all the time, because they are not capable of or interested in improving upon the system within which they pursue their own economic interest on behalf of others whose interests do not seem coincident with their own.
3 comments

> Why would we want to reproduce existing structures of oppression in mechanical form?

If (for example) 66% of doctors are male and 34% female, then it's not reproducing "existing structures of oppression"; it's inferring something about reality.

In an environment in which Blue people are banned from becoming doctors, it's also inferring something about reality to conclude that 0% of doctors are Blue. It would be entirely wrong, however, to use these inputs to infer anything whatsoever about the respective propensity of Blue and Green people to become doctors in an environment in which such a rule or idea of a rule had never existed. Obviously the "structures of oppression" (real and imagined) that lead to fewer female doctors are less extreme, even in western liberal democracies where women wishing to become doctors are generally met with encouragement. But that isn't to say they don't exist, or that a computer output (or a human interpretation of said computer output) is likely to draw correct inferences from them.

And if you think that people won't use the idea that the outputs are unbiased because the computer isn't programmed with the same prejudices that produce the inputs, I have some algorithmically-generated investment advice involving a bridge to sell you.

> It would be entirely wrong, however, to use these inputs to infer anything whatsoever about the respective propensity of Blue and Green people to become doctors in an environment in which such a rule or idea of a rule had never existed.

That's fine but it isn't the goal of these algorithms. It isn't the reality that is useful for them to learn. It's a different problem to try to build some kind of "unbiased" ontology rather than just to learn about words. Feel free to research or create solutions to this other problem, it sounds interesting.

If it’s saying that and only that then that’s obviously fine. However, if that knowledge is then applied in any other way, then that’s problematic.

It's inferring something about reality, but what?

Suppose, for example, that I gave this same statistic to someone and then asked them to select from a pool of 100 applicants for 50 available places in medical school. Let's assume that there's an equal number of male and female applicants and that their exam results are all similar. Do you think that knowing about this 66-34 split might influence the gender balance of the final selection?

Knowing about the gender balance wouldn't influence the final selection if you programmed the selection criteria not to be influenced by the gender balance.

The whole point of training and using machines is to make more accurate, more useful decisions in a complex world.

That can't happen if we give them data that isn't borne out by reality, or tell them to ignore data that is.

If you have to change the terms of the question to give an answer, then I think I've made my point.

>structures of oppression

What oppression? How are word vectors oppressing anyone? What a ridiculous claim.

>Have you noticed how automation often vastly amplifies things?

No, not at all. I've heard this claim in similar discussions, but I've yet to see a convincing example, particularly with word2vec. I find it very implausible that word vectors will somehow discriminate against female doctors or whatever.

>It's a short step from saying 'this model accurately reflects the bias in society' to 'that's how things are, the computer says women aren't cut out to be doctors.'

No, it's not a short step at all. No one is ever going to use word vectors to figure out what genders are capable of what jobs. At worst, your auto-correct might be slightly less likely to suggest "doctor" for a misspelled word occurring in a female context. And on net it will still make more accurate corrections than the alternative.

> No one is ever going to use word vectors to figure out what genders are capable of what jobs.

Directly, no. Nobody is going to go 'ah, word2vec - a new tool with which to perpetuate patriarchal capitalism, mwuhahaha'...probably. People are weird that way.

But indirectly they certainly will. How about NPC character generators in MMORPGs? Or chatbots on social networks? Stock characters in auto-generated romance novels? The possibilities are endless.

No doubt you will think these examples are ridiculous, because you seem like a rigorous, scientifically minded person who would be careful not to use data in inappropriate contexts, and who would try to discount cultural or emotional factors in making strategic decisions. But you are only as good at this as your own self-awareness and willingness to acknowledge the existence of implicit bias.

And many people are quite different from you and more easily or willingly allow their judgment to be shaped by representational stereotypes. Marketing people aim to confirm their audience's worldview very closely so that consumers will be willing to identify with the commercial prompt when it arrives. Politicians and yellow journalists routinely abuse statistics to grab people's attention. And so on.

I urge you to think more about this, and in more imaginative fashion. People are often surprised by the unexpected applications of technology employed by others.

>No one is ever going to use word vectors to figure out what genders are capable of what jobs.

How can you possibly make this claim?

Biased word embeddings have the potential to bias inference in downstream systems (whether it's another layer in a deep neural network or some other ML model).

It is not clear how to disentangle (probably) undesirable biases like these from distributed representations like word vectors.
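To make the downstream point concrete, here is a minimal sketch with made-up 3-d vectors. Everything in it is an assumption for illustration: the numbers in `EMB` are invented rather than taken from word2vec or any trained model, and `downstream_score` is a hypothetical stand-in for whatever model consumes the embeddings. It only shows the mechanism: if "doctor" has a component along the he/she direction, anything that scores by similarity inherits that tilt.

```python
import math

# Hypothetical toy embeddings: invented numbers, not from any trained model.
EMB = {
    "he":     [1.0, 0.0, 0.2],
    "she":    [-1.0, 0.0, 0.2],
    "doctor": [0.4, 1.0, 0.1],
    "nurse":  [-0.5, 0.9, 0.1],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def gender_direction(emb):
    """Unit vector pointing from 'she' to 'he' (one assumed way to define the axis)."""
    d = [h - s for h, s in zip(emb["he"], emb["she"])]
    n = norm(d)
    return [x / n for x in d]

def gender_projection(word, emb):
    """Signed component of a word vector along the he/she axis."""
    return dot(emb[word], gender_direction(emb))

def downstream_score(context_word, target_word, emb):
    """Stand-in for a downstream model: ranks targets by cosine similarity."""
    return cosine(emb[context_word], emb[target_word])

if __name__ == "__main__":
    for word in ("doctor", "nurse"):
        print(word, round(gender_projection(word, EMB), 3))
    print("he/doctor ", round(downstream_score("he", "doctor", EMB), 3))
    print("she/doctor", round(downstream_score("she", "doctor", EMB), 3))
```

With these invented vectors the downstream score for "doctor" comes out higher in a "he" context than in a "she" context, even though nothing in `downstream_score` mentions gender. Whether real embeddings behave this way in a given application is an empirical question this toy cannot answer.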

Because you cannot change reality if you do not first acknowledge what it is. First off, this is an analysis tool. If we warp our analysis tools to pretend that e.g. no gender biases exist in places where they do, then we are not making the world a better place, we are just removing our ability to quantify the ways in which it is not.
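A rough sketch of what "removing our ability to quantify" could look like, using invented 3-d vectors (`HE`, `SHE`, `DOCTOR` are hypothetical, not real word2vec output). The `neutralize` step here is one simplistic way of projecting a direction out of a vector, assumed only for illustration; it is not claimed to be what any particular library does.

```python
import math

# Hypothetical toy vectors; the numbers are invented for illustration only.
HE = [1.0, 0.0, 0.2]
SHE = [-1.0, 0.0, 0.2]
DOCTOR = [0.4, 1.0, 0.1]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def bias(v, direction):
    """Signed projection of v onto a unit direction: the quantity being measured."""
    return dot(v, direction)

def neutralize(v, direction):
    """Subtract the component of v along the unit direction (an assumed, simplistic step)."""
    b = bias(v, direction)
    return [x - b * d for x, d in zip(v, direction)]

GENDER = unit([h - s for h, s in zip(HE, SHE)])
before = bias(DOCTOR, GENDER)
after = bias(neutralize(DOCTOR, GENDER), GENDER)

if __name__ == "__main__":
    print(f"before: {before:.2f}, after: {after:.2f}")
```

After the projection is removed, the measured value along that axis drops to zero, so the same tool can no longer report a skew in the text it was trained on. That says nothing about whether the skew in the underlying corpus has gone away.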