Hacker News
by smartician 1692 days ago
It's not much different today. Nowadays you'll also need privacy review, accessibility review, security review, and diversity & inclusion review.

That's only for public launches (and I'd add QA review to the list), and I'd argue that each of them is critical.

For serving analytics data internally, you only need privacy review, for obvious reasons.

Security and auditing are built into the tools used for querying and serving any such data internally.

> diversity & inclusion review

Is this tongue-in-cheek, or are you serious? Poe's law and all that.

If you're publishing a dataset in the terabytes, it does actually make sense to at least do a pass over it and make sure the data you're using isn't skewed in any undesirable way that would cause problems down the road. For example, if you're releasing 5 TB of face photos for training facial recognition nets, it would certainly be a problem if all the faces are white women or Asian men: the result would probably be overfit and not perform as well for people in other categories. It would be correct to call that a diversity/inclusion issue.
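A rough sketch of what such a pass could look like, assuming each photo already carries some demographic metadata label. The group names, the `min_share` threshold, and both helper functions are hypothetical, made up purely for illustration:

```python
from collections import Counter


def group_shares(labels):
    """Return each group's fraction of the dataset."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(labels, expected_groups, min_share=0.05):
    """List expected groups whose share is below min_share (absent groups count as 0)."""
    shares = group_shares(labels)
    return sorted(g for g in expected_groups if shares.get(g, 0.0) < min_share)


# Hypothetical per-photo labels: group "D" is missing entirely.
labels = ["A"] * 900 + ["B"] * 90 + ["C"] * 10
flagged = underrepresented(labels, expected_groups={"A", "B", "C", "D"})
print(flagged)  # ['C', 'D']
```

This only tells you who is missing from the data, not whether the trained model works for them; it's the "glance over it" step, nothing more.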

Privacy and accessibility reviews serve similar purposes there: you're reducing risk by checking for these various problems, and ideally they also spot ways to improve the quality of your outcomes.

It's common in fintech for data/ML models to go through a similar review. If you happen to disenfranchise a set of people because your model said not to lend to them, you risk legal jeopardy.

To clarify, I think it's good that this is a practice.

The whole point of the model is to find who not to lend to. You are always going to exclude people by definition.
There are so many ways you can accidentally systematize racism in software like automated lending.

In the past there were explicitly racist policies like redlining. This leads to a historical data set of loan denials to people in a specific racial group. If that group has other traits that correlate with their race, e.g. the neighborhood they live in, then you could presumably have a model that doesn't explicitly have race as a feature but uses that historical data and some subset of racially correlated features, and as a result disproportionately excludes people of that race.
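One way to catch this after the fact is to audit the model's decisions by group, even though group membership is never a model input. A minimal sketch, with made-up numbers and hypothetical group names; the 0.8 cutoff is borrowed from the "four-fifths" rule of thumb in US employment-selection guidance and is only an illustrative assumption here, not a lending-law standard:

```python
def approval_rate(decisions):
    """Fraction of applications approved; decisions is a non-empty list of booleans."""
    return sum(decisions) / len(decisions)


def impact_ratio(decisions_by_group, reference_group):
    """Each group's approval rate divided by the reference group's approval rate."""
    ref = approval_rate(decisions_by_group[reference_group])
    return {g: approval_rate(d) / ref for g, d in decisions_by_group.items()}


# Hypothetical audit data: the model never saw the group label,
# it is joined back in afterwards just for this check.
decisions_by_group = {
    "group_x": [True] * 60 + [False] * 40,  # 60% approved
    "group_y": [True] * 30 + [False] * 70,  # 30% approved
}

ratios = impact_ratio(decisions_by_group, reference_group="group_x")
flagged = [g for g, r in ratios.items() if r < 0.8]  # assumed threshold
print(ratios, flagged)
```

A low ratio doesn't prove the model is using a proxy for race, but it's the kind of signal that should trigger a closer look at which features are doing the work.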

I am not sure how one would remove all ageism, sexism, racism, classism, title-ism, and so on from lending. The whole concept is about making a prediction about the future with suboptimal information, guessing who will default on a loan and who won't. Same goes with insurance.

I have been pretty tempted to lie about where I live in order to reduce my insurance costs. It would reduce the insurance cost by half. It seems pretty disproportionately harsh that I should get lumped together with the people who simply happen to live around me.

Is it possible to make predictions illegal if they are based on historical data from other than the individual customer?

I should clarify, the point is to not discriminate against a protected class.
Tell that to the legislators and prosecutors who create laws and enforce laws against you.
Yes, but we should exclude people for valid reasons, not for their race.
A review doesn't necessarily mean you need to resolve all diversity/inclusion issues. It can merely require that you identify the issues and understand the risks of not resolving them.
The 5 TB was performance data collected from servers.
Sounds like the reviewer would glance at it for 5 seconds and say 'ok'
What if some servers were excluded?
Partly tongue-in-cheek. These review processes exist, but whether they're required or not depends on the product area and type of project.
Perhaps Google don't want to be in the news for identifying dark-skinned people as monkeys again?
I can't remember which company it was that launched a camera with face identification features, but that didn't recognize any face that wasn't lily-white like every single engineer that worked at that company. They could have probably benefited from a diversity and inclusion review. Heck, employing a single brown engineer or even QA engineer probably would have been enough to notice that before launch.
> I can't remember which company it was that launched a camera with face identification features, but that didn't recognize any face that wasn't lily-white like every single engineer that worked at that company

It was HP https://www.youtube.com/watch?v=t4DT3tQqgRM

People jump to the racism conclusion here, but it's really just a contrast issue.

Detecting eyes, for example is simply easier with lighter skin.

Light skinned black people read just fine, and super tanned white people are harder to read. It's literally contrast (light) detection, not racism.

But because the media keeps everybody primed for racism to stave off the necessary class power rebalancing, everyone jumps to racism.

It may be unintentional, but it shows that they didn't test with anyone with a darker skin tone, which shows the biases at work.
It doesn't show that. It's literally numerical in that dark skin reflects less light than light skin, so the sensors report lower values for the entire face, reducing contrast for the entire face, which is what the recognition systems count on.

Brown eyebrows on brown skin = low contrast.

Brown eyebrows on pale skin = high contrast.

If our races were dark purple hair on bright green skin and bright green hair on dark purple skin, facial recognition systems would have no trouble with either. But that's not how humans render, so our contrast based systems struggle with low contrast.
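To put rough numbers on that, here's a sketch using Michelson contrast, (Lmax - Lmin) / (Lmax + Lmin), as a stand-in for whatever the detector actually measures. The luminance values are invented 0-255 sensor readings, not measurements, and real detectors are far more involved than a single ratio:

```python
def michelson_contrast(lum_a, lum_b):
    """Michelson contrast between two luminance values: (max - min) / (max + min)."""
    hi, lo = max(lum_a, lum_b), min(lum_a, lum_b)
    if hi + lo == 0:
        return 0.0  # two fully black regions have no contrast
    return (hi - lo) / (hi + lo)


# Hypothetical sensor readings on a 0-255 scale.
eyebrow = 40
pale_skin = 200
brown_skin = 70

high = michelson_contrast(eyebrow, pale_skin)   # (200 - 40) / (200 + 40)
low = michelson_contrast(eyebrow, brown_skin)   # (70 - 40) / (70 + 40)
print(round(high, 3), round(low, 3))
```

With these assumed readings the eyebrow-to-skin contrast is well over twice as high on the pale face, which is the effect being described.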

It's like you're confusing a software/data problem with a photon/physics problem because you're thinking in your box.

Contrast may be one root of the technical problem, but claiming a product "ready to launch" while it fails to work for people based on their race (especially when the company clearly didn't put effort into preventing the issue ahead of time) is problematic.

By having a diverse team (or making some effort to include diverse opinions) you'd have a chance to discover new ways to detect faces, or new mitigations to the contrast problem.

But claiming a product is ready for release when it excludes people based on race (no matter the technical reason) is a problem.

It’s only a “contrast issue” if the people building the system failed to have roughly half of humanity represented in any meaningful way on their dev team.
There is really no reason why the dev team should include any particular demographic: how are you supposed to have 90-year-old people on the team to make sure they are recognized correctly? This is a requirements issue which directly impacts validation/test data collection. If their user base is 50% black people, any reasonable protocol will include enough black faces in the test data to detect the problem early on. ML-based systems will always make errors; which errors matter will be defined by market/legal/mission requirements. It may very well be that faces of black people are harder to detect (especially in backlit situations). Should you hold the product because it may not work for everybody? It’s a complex decision. Maybe you can just have a good “face detection failed” flow to handle all the errors (think not only black people but also tattooed people, etc.).
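As a sketch of what that requirement could look like in a validation protocol: break the detection rate down per group in the test set and fail the release check if any group falls below a floor. The group names, counts, and the `min_rate` value are all hypothetical; the actual floor would come from the market/legal/mission requirements mentioned above:

```python
from collections import defaultdict


def detection_rate_by_group(results):
    """results: iterable of (group, detected) pairs from a labelled test set."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, detected in results:
        totals[group] += 1
        hits[group] += int(detected)
    return {g: hits[g] / totals[g] for g in totals}


def failing_groups(rates, min_rate):
    """Groups whose detection rate is below the required minimum."""
    return sorted(g for g, r in rates.items() if r < min_rate)


# Hypothetical test results: 100 faces per group.
results = (
    [("lighter", True)] * 95 + [("lighter", False)] * 5
    + [("darker", True)] * 60 + [("darker", False)] * 40
)

rates = detection_rate_by_group(results)
print(failing_groups(rates, min_rate=0.9))  # ['darker']
```

An overall accuracy number would hide this: the aggregate here is 77.5%, which says nothing about one group sitting at 60%.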

Arguing that having quotas of that or the other in the dev team will make them more sensitive to diversity issues in general is also unnecessary because everybody is part of some minority in some situation, hence a minimum of education will make anybody understand first hand the value of inclusiveness and diversity.

Btw, if the team is using only their own faces to test the system, they won’t go far (think about lighting conditions / different environments).

They need to test on a realistic sample of users. Testing on the dev team is just lazy; they probably have unusually new and expensive hardware in well-lit offices.
And yet a short while later, they released a patch that fixed the bug. So your physics claim is irrelevant.

The fact remains that they would not have released that software knowing it wouldn't work for Black people. And yet, they didn't notice the bug because they were making no effort to be inclusive.

Microsoft Kinect also had this issue. I know the one black engineer (that had little to do with the project) that repeatedly got pulled in to see if the test system worked with non-white people.

He left soon after.

I wonder if the company failed to give him credit, responsibility, or compensation commensurate with his value to the project?
We detached this subthread from https://news.ycombinator.com/item?id=29086292.
Having launched some products at Google in my day, I know quite well how to skate through that process (although D&I was not part of it when I filled out my forms). Sadly for my friends in privacy and security, it's not hard for product teams to exploit Google's propensity to launch and override privacy and security concerns.
Is the super secret process to just have a VP invested in the launch?
Having executive cover is important, but equally important is knowing exactly how to write "this control is out of scope for my project" in the launch forms, or making an approval "FYI" instead of "Required". It gets harder and harder to do this as your launch requires more and more personal data to operate.
> diversity & inclusion review
Assuming this is sarcasm, you realize Google has a massive userbase all over the globe from all walks of life, right? Does it make business sense to accidentally exclude certain people? Or ethical sense?
Businesses exclude people all the time. E.g. many videos are geoblocked, and there's no way to view or purchase them in some countries.

Here are some other examples: I can use the free version of Google Colab from Ukraine, but I can't pay for the Pro version. (I can pay for Google Cloud, though.)

OpenAI blocks API dashboard access for IP addresses from Ukraine. (But it is OK if I use a VPN LOL.)

So it seems blocking people is the norm. I guess "diversity and inclusion" is mostly about social topics within the US, not about not excluding people.

In general it's about not accidentally excluding people. All the cases you propose are deliberate blocks for various (mostly legal) reasons. The deliberate blocks are considered in the review, and as long as there is a sound business case for launching with the exclusion, it goes ahead.
You're running into US sanctions issues (Crimea), not woke Google policy.
Doesn't matter. Also sanctions are not against Ukraine, that would be stupid.
> Also sanctions are not against Ukraine, that would be stupid.

The sanctions explicitly include Ukraine, due to financial entanglements between Ukrainian and Russian corporations[1].

[1]: https://www.state.gov/ukraine-and-russia-sanctions/

Cry in Haiti
I don't understand this line of reasoning since it assumes inclusion training actually promotes inclusion. My experience has been that it usually means racial/gender intersectionalism training that everyone gets to swallow regardless of culture or belief because it's what white people in the US tech industry are passionate about right now.
Yes.

The expectation isn't that you actually adopt or accept the values. The expectation is that you know that if you fail to do so (and you lack sufficient privilege in your organization), then you will be held accountable.

Practically speaking "woke" people would prefer to work with people who share values, but most of us will settle for people who can at least emulate a decent human being while interacting with other people at work.

Being "woke" goes beyond just being a decent person though, because most people's metric for decency is interpersonal decency. My understanding is that the sociological concepts that go into "wokeness" include intersectional analysis, microaggression theory, critical theory, 3rd wave feminism, gender theory etc. I think these ideas are mostly good (with the exception of microaggression theory), but they go way beyond "just be decent to other people" and into the territory of deep academic and systemic mindsets that are far from the default in the individualist West (and especially the US). I mean damn, half these ideas are French, and France is pretty culturally different from the US, French academia even more so.

For example: not being racist on an individual level is pretty intuitive and obvious to most people, and mostly comes down to being a decent human being. Being institutionally anti-racist is a totally different thing, and way more involving, because you're not just not being a dick to people of a different race; you're trying to counteract systemic disadvantages.

It also presupposes such systemic disadvantages exist. Not sure why so many people from other countries immigrate to places that are so obviously systemically biased against them.

Or why when institutions such as Harvard actually do systemically discriminate against Asians it’s routinely ignored by the woke crowd.

Can anyone explain to me why Asians despite having some of the highest scores and GPAs have the lowest rate of admissions to some of the wokest institutions in America?

Why is the difference in incarceration rate between men and women or the police shooting rate not presented as systemic discrimination?

I agree that at one point maybe being oblivious to systemic problems could go along with being decent. But these days I don't see how being a decent human is compatible with either, "I don't want to learn whether you're getting the short end of the stick" or "I know you're getting the short end of the stick but I'll never do anything about it": neither seems decent to me.

I'll also note that although those particular theoretical frameworks were originally popular ways to understand certain problems, there are plenty of other ways to reach that understanding.

As an example, let's take the microaggression where white people want to touch black hair. This is a common problem [1][2][3], and one certainly can situate it within a whole host of racist microaggressions and a broader theoretical framework. But one can also just say, "Dude, black people are not pets. Keep your hands to yourself." Or in the middle, the handsy person can listen to black voices on this and get a personal understanding of why it's a demeaning thing to ask/do. That doesn't require any theory, just the sort of empathy and respect that is at the core of human decency.

[1] https://www.forbes.com/sites/janicegassam/2020/01/08/stop-as...

[2] https://www.ft.com/content/b5c3fa4e-e6c0-11e9-9743-db5a37048...

[3] http://www.cnn.com/2011/LIVING/07/25/touching.natural.black....

How do you “counteract systemic disadvantages” without simply disadvantaging all white people (that would be racist against white people)?

Or by simply giving an extra benefit (affirmative action) to one group?

As we have seen, affirmative action puts people from China, India, and Japan in the same bucket and gives them less preferential treatment compared to African Americans. So it just seems that the minority which speaks loudest about injustice is the one that gets the most benefit.

I agree that systemic racism is a thing but I have never seen a single proposed solution which is not simply “reverse racism” or positive racism.

We should be able to give equal opportunity to all groups without explicitly helping one group or disadvantaging one group!

> For example: not being racist on an individual level is pretty intuitive and obvious to most people, and mostly comes down to being a decent human being. Being institutionally anti-racist is a totally different thing, and way more involving, because you're not just not being a dick to people of a different race; you're trying to counteract systemic disadvantages.

Sincerely acknowledging this may be confusing: it’s the equivalent of recognizing that you’re being graded or compensated fairly while you see someone else not being treated that fairly… and then not shrugging it off.

It’s not a deep philosophical concept. It’s living in a society with responsibility to everyone else in your society.

I mean yeah if your culture or belief involves not treating people of different races or genders equitably, then the goal of the training isn't to change your mind. Swallow, follow, or get out of the way.
>accidentally exclude certain people?

E.g., how? Could you provide some examples, say two?

There's a lot of talk about this stuff when it comes to MAGMA, yet docs still use some auto-generated translations which suck.

It seems like these kinds of problems occur mostly within some specific areas, meanwhile OP seems to suggest that this kind of review should be applied to everything.
From a practical business perspective, performing a diversity and inclusiveness review is a risk management activity.

It doesn't really matter if the business strongly supports or opposes a particular set of diversity and inclusiveness goals from a fiduciary perspective, but it sure does matter if the business keeps losing money or missing targets because it is embroiled in scandals, paying out settlements to staff that have suffered discrimination, or being hauled in front of regulators to air their dirty laundry.

One would hope that being a decent place to work, and treating people fairly would be enough of an incentive, but for everyone else, there are risk management processes designed to have repeatable processes to identify business risks, escalate them to leadership, and presumably either accept the risk, or steer the project towards a solution that has a more acceptable risk profile.

Not every kind of review is applicable to every single launch, but diversity and inclusion is applicable to more than just AI (in general; I don't know what the review process or requirements are for D&I).
Ah I know. Let's have a review to see if we need a D&I review!!
> could you provide some examples e.g two?

There was that time when Google Photos started labeling black people as gorillas[1] in uploaded pictures. I suspect the training data for their classifiers "accidentally exclude[d] certain people": diversity & inclusion review would have avoided that kerfuffle.

1. https://www.usatoday.com/story/tech/2015/07/01/google-apolog...

By the same logic we can justify any [social issue] division. The sad thing is that the rules are arbitrary and do not help in solving the issue. Actually it is in the interest of the division to create or exaggerate problems to justify its existence.
Slightly OT, but a lot of products that are launched in multiple regions - Google included - exclude people who live in a country but don't speak the native language.

I work for a company from an English speaking country, and every time I need to reauthenticate with my Google account, it defaults to the native language of the country I'm in. They do have an option to change the language (in the native language), but it's weird it defaults to that given I was last logged in with an account that is set to "English (US)" and my computer is set to the same.

Recently a large clothing retailer launched here that is also available in many other European countries, but it's only possible to use the native language here. It's even the same app; they just see your account is set to this country and only let you view it in that language.

I agree with you but it sometimes seems like Google doesn't care at all about it when they have the kind of customer support processes that they have.
Customer support is after the fact, reviews are before the fact. It's very cheap to do these reviews before launch and then you can point at those to say "we're trying!" while not providing any customer support.
You think? Who is doing diversity and inclusion reviews? Do you think they're getting paid call center wages?
The customer base is larger than the # of projects to review by many orders of magnitude. So yes, I think internal review will cost less when a single reviewed project/product might have millions of users.
Does it make sense to serve a dataset without approval that it's inclusive enough? Yes, because that's typically how things in the world work.
I don't understand this argument. It's okay for things to be a certain way, because things are typically that way?

Apart from the circular reasoning, the practical impact is that you should also drop privacy review because corporations steal your data, security review because everyone gets hacked, readability review because there's a lot of legacy code, etc.

I miss the internet where people just created what they wanted and organically found users.
That’s not what D(E)I refers to.
Nothing is all inclusive. Nothing.
Is your argument here supposed to be "Nothing is all inclusive, therefore we shouldn't even bother trying"? If so, I'd argue that's a lot more ridiculous than a review process designed to help catch major inclusivity issues before they become problems.
Sure, but that's not a reason to not even ask the question. Maybe not every DI initiative turns out to be helpful or productive, but as someone who's privileged on pretty much every axis there is, I'd be grateful for the kind of internal support system that could give me an early warning sign for "hey, this design decision that made sense to you and your team has the potential to alienate user base X and there's a real possibility that if we launch in this state it's going to explode into a minor Twitter scandal."
Isn't this just called user testing? Also this is in the context of a fucking dataset. If data needs to go through DI in case something blows up on Twitter, I guess it's a sad state we're in.
Does it? Seems to me data is a prime place for exclusion to occur. Example: a dataset of tagged photos for training a neural net to analyze facial expressions. All the photos are of white faces.
If, for example, the dataset only contains white faces and is intended to train facial recognition then yes, it needs to go through some kind of DI review.
No code is 100% perfect, yet people still do code review, and the point of CR is not to make the code 100% perfect.
Perfect is the enemy of good.
Death and taxes.
Science is always wrong. Always.
Google can talk when they stop using a license by a domain-squatting org who revised their history and has a pretty offensive line on their front page. COMMUNITY-LED DEVELOPMENT "THE APACHE WAY" indeed. Worse, most of the links on Google search point to the org and not the actual tribes.
Everyone knows the org is called Apache because they Jump on it! Jump on it! and not because they're appropriating Native Americans.