by omar_a1 2370 days ago
It's the methods being used to select, populate, label, and validate the training set that are the problem.

Basically, your team is composed of white dudes who don't see the problem with an ML training set consisting largely of pictures of white dudes.

To prevent this they'd have needed to A) Employ a black person, and B) Listen to said employee's feedback, in order to recognize the problem.

Edit: Also worth pointing out: just using a representative population sample would still show racial bias, because overall accuracy ends up weighted by each group's share of the population. You'd probably need equal-sized samples of pictures of people across all races, genders, and disabilities if you wanted equal accuracy across the board. That also applies to picture quality and the range of picture quality. Doubling up images, or using corporate headshots for white dudes and grainy selfies for People of Color, could still cause issues.
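
A minimal sketch of the sampling point above, in Python. The group names, shares, and accuracy figures are made up for illustration, and `balanced_sample` is a hypothetical helper, not anything from a real pipeline. It shows how a population-weighted headline accuracy can hide a weak group, and one simple way to draw equal-sized samples per group.

```python
import random
from collections import Counter, defaultdict


def overall_accuracy(share, accuracy):
    """Accuracy on a representative sample: per-group accuracy weighted by group share."""
    return sum(share[g] * accuracy[g] for g in share)


def balanced_sample(items, group_of, per_group, seed=0):
    """Draw the same number of items from every group (hypothetical helper)."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for item in items:
        by_group[group_of(item)].append(item)
    sample = []
    for group, members in by_group.items():
        if len(members) < per_group:
            raise ValueError(f"group {group!r} has only {len(members)} items")
        sample.extend(rng.sample(members, per_group))
    return sample


# Made-up numbers, for illustration only.
share = {"group_a": 0.9, "group_b": 0.1}
accuracy = {"group_a": 0.99, "group_b": 0.70}
overall = overall_accuracy(share, accuracy)  # headline number hides group_b's 0.70

# 90 photos of group_a and 10 of group_b, then 10 of each after balancing.
photos = [("group_a", i) for i in range(90)] + [("group_b", i) for i in range(10)]
balanced = balanced_sample(photos, group_of=lambda p: p[0], per_group=10)
counts = Counter(group for group, _ in balanced)
```

With these made-up figures the overall accuracy comes out around 0.96 even though one group sits at 0.70. Balancing the counts does nothing about the image quality differences mentioned above; that would need its own check.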

Same logic applies to labelling. That minimum wage contracting firm used to decide who's who in the photos may exhibit racial bias, by virtue of the fact that most people do. If their accuracy in labelling is racially biased, then so too will be the algorithms trained on those labels.

In short: Racist garbage in, racist garbage out.


Why do you need A to do B? Obviously any competent ML practitioner can notice a biased training set. Discriminating against people based on an assumption that race determines ability is just racism.
Are you asking why you can't listen to an employee that doesn't exist?

But more to your point, if these oh-so-competent ML practitioners were doing their jobs right, we wouldn't be having this discussion.

The whole reason diversity is championed in hiring is precisely because a single individual's perspective can only see so far. And if you have a monoculture team whose members have experienced very similar life circumstances, you end up with the kind of narrow perspective that leads to more racist soap dispensers.

I think I agree with your specific point concerning facial recognition, but not your general point about un-diverse teams being incapable of delivering a good product for a diverse audience. This is because I have recently worked with a team that spent considerable effort on accessibility issues despite none of the team having any disability.
I'm glad that you were able to deliver a product that helped with accessibility, but part of the considerable effort it took was just in trying to understand what someone else's perspective is. It's less efficient than simply having a member of the team with real lived experience that can answer common-sense questions that your team agonized over answering.

And not to say this happened in your case, but even with that considerable effort, it's still very easy to end up with blind spots in your product that a more diverse team would have caught.

It's the same as hiring for any other level of experience for more routine technical skills. If your team has no experience in this area, they'd need to expend a much greater degree of effort to answer questions that someone who is experienced would already have known the answer to.

Not just employ one: the whole organization has to be more diverse to access its markets, which are diverse.
I agree. I just have low expectations as far as diversity in tech goes (so, one black person and/or one woman on the team would still be a milestone many of these companies have yet to reach), but you make a good point: hiring only one person of color or one woman would still be very problematic.
What is the ideal number in your opinion, and how do you arrive at it?
Not a number per se, but ideally the racial and gender ratios of the team would reflect the total population of the area (country? metropolitan area?) in which the company resides, with the standard caveat that random sampling would give a range of different combinations. So that'd be a multinomial distribution, with the probability of each race being chosen set to its share of the total population.
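
A rough sketch of that multinomial idea in Python, standard library only. The group names and shares are invented, and the helper names are hypothetical. It draws one random team and computes the chance that a team of a given size contains nobody from a particular group, assuming each member is drawn independently in proportion to population share.

```python
import random
from collections import Counter


def sample_team(shares, team_size, rng):
    """One multinomial draw: each member is picked independently by group share."""
    groups = list(shares)
    weights = [shares[g] for g in groups]
    return Counter(rng.choices(groups, weights=weights, k=team_size))


def prob_none(share, team_size):
    """Chance that a random team of this size has no member from the group."""
    return (1 - share) ** team_size


# Invented shares, for illustration only.
shares = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

rng = random.Random(0)
team = sample_team(shares, team_size=10, rng=rng)

p_zero_small = prob_none(shares["group_c"], team_size=10)
p_zero_large = prob_none(shares["group_c"], team_size=50)
```

Under these assumptions a 10-person team with nobody from a 15% group is not that unusual by chance alone, while a 50-person team with nobody from that group is very unlikely, which is the "range of different combinations" caveat in numbers.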

In other words, ideally the racial and gender distribution of a team would be as inconsequential and unbiased as blood type or handedness, in that the aggregate demographic ratios on your teams would at least match that of the residential population in your area, and ideally that of your broader geographic location.

I'm not doing a good job explaining this clearly, but the simple answer is: more than one. No one wants to be the token hire.

> In other words, ideally the racial and gender distribution of a team would be as inconsequential and unbiased as blood type or handedness, in that the aggregate demographic ratios on your teams would at least match that of the residential population in your area, and ideally that of your broader geographic location.

Ok, I don't know about race, but for gender look up the "gender equality paradox". In countries with greater equality rights for women, women show less of an interest in STEM subjects.

https://en.wikipedia.org/wiki/Gender-equality_paradox

Like I say I don't know of any similar studies done for race, but it would indicate that you shouldn't necessarily expect outcomes that "would match that of the residential population in your area, and ideally that of your broader geographic location".

In my opinion we should be pushing for equality of opportunity, not equality of outcome (you appear to want the latter).

That's why I sprinkled the word "ideally" so liberally throughout that comment, as that was the premise of your question. To actually reflect the population, you'd have to somehow address centuries of institutionalized disenfranchisement, de facto segregation, educational barriers, racially motivated policing and disproportionate conviction rates (highlighted by the OP). That's a lot to put on a hiring manager who isn't even sure that racism is an actual, tangible thing.

Realistically, the most that hiring managers (save for huge FAANG institutions) can do is thoroughly ensure that their team isn't inadvertently (or blatantly) racist/sexist in their hiring process and on the job, and to post the job in enough places that a diverse applicant pool will see the posting.

With that said, hand-waving away that there are few to no women or African Americans/Latinos/Native Americans/etc. on the team with an overzealous application of the Equality Paradox is a pretty dangerous mindset to get into. It's essentially passing the buck, and is eerily reminiscent of the claims made by 1950s Southern US politicians that Blacks were the ones who were self-segregating because they wanted to, not the other way around.

What I'm saying is, the ideal 50/50 gender ratio/representative race may be unrealistic for a myriad of reasons, but if you're a 50-person start-up with 2 women, one of whom is HR, and no black people, I'd take a good, hard look at the company culture that's being fostered, and particularly whether turnover for women and People of Color at your company is worse than average.

That's an obvious falsehood. Companies operate successfully in foreign countries all the time.
Which doesn't prove or disprove anything, as we are talking about representation in a diverse domestic market.