Hacker News
by cleverwebble 1248 days ago
I'm wondering what % of forms that ask for gender actually use that field in any way
3 comments

Knowing the gender is necessary to produce grammatical sentences in many languages.
That would be pronouns, not gender, right?
Kinda, but also no. It depends on the language. For example, in Polish you'll say "has done" as "zrobił" (he), "zrobiła" (she), "zrobiło" (baby, or it). You don't get to play fancy with pronouns there, because the verb itself is modified. (Also, to make things fun, objects are gendered too, so when speaking you need to know that the fridge is feminine and a table is masculine.)
No, because in some languages the forms of words used to describe people vary depending on whether the person is male or female. It's not just a case of picking the right pronouns, it's the gender agreement of the other parts of speech too.
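To make the verb point concrete, here is a minimal sketch of picking the Polish form from the three quoted above. The names `PAST_FORMS` and `has_done` are made up for illustration, and it only covers this one verb in the third person singular.

```python
# Hypothetical lookup table: third person singular past forms of "zrobić" (to do),
# keyed by grammatical gender. Only the three forms mentioned above are covered.
PAST_FORMS = {
    "masculine": "zrobił",
    "feminine": "zrobiła",
    "neuter": "zrobiło",
}


def has_done(name: str, grammatical_gender: str) -> str:
    """Return '<name> <verb> to', i.e. '<name> has done it'."""
    # The verb, not a pronoun, carries the gender marking here.
    return f"{name} {PAST_FORMS[grammatical_gender]} to"


print(has_done("Anna", "feminine"))
print(has_done("Jan", "masculine"))
```

Note that no pronoun appears anywhere in the output, yet the sentence still cannot be built without knowing which form to use.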
Do you have an example? Most languages I've seen that do this seem to be deriving gendered word variants from the pronoun, at least as far as I can tell with my very limited experience.

Spanish has a lot of word variants based on feminine/masculine/plural subjects, but it's not really based on gender (otherwise inanimate objects wouldn't have those variants); as far as I can tell with my limited Spanish experience they're based on word agreement in the sentence with the pronouns or at most with masculine/feminine presentation.

Is there ever a scenario where "el gata" would be correct in Spanish? "Gato" agrees with the "el" pronoun; it's not based on the gender identity of the noun independently of the pronoun, is it? Are there other languages that work differently?

This also seems like a problem that doesn't really require knowledge of gender identity as much as "do you want us to use masculine/feminine variants of words when referring to you?" -- something that seems easy enough to guess based on the pronoun or (when loading up a language translation that needs more advanced logic) to just outright ask the user.

I kind of hate the software trend towards "we need to derive everything we're doing from first principles"; I feel like a lot of these problems could be solved by saying, "when we encounter an edge case we'll ask what to do, rather than doing data collection up-front that will be irrelevant for the majority of users."

> Do you have an example? Most languages I've seen that do this seem to be deriving gendered word variants from the pronoun, at least as far as I can tell with my very limited experience.

That's a weird way to put it. Words have grammatical gender. In some languages, like Spanish, there are articles like "el" or "la" that go with it, but other languages, like Russian, have no articles.

For things in Spanish the grammatical gender is fixed. E.g., a window is always a feminine word. For cats it of course depends on the cat.

You're not matching the word to the article, but the other way around. "ventana" is a feminine word, so there's always a "la" before it.

> Is there ever a scenario where, "el gata" would be correct in Spanish?

Not in that specific case, but there are rare nouns where both are valid, eg "el mar" and "la mar", and sometimes with a subtly different meaning used for poetic effect.

So genuine question, because I am in the middle of building a dialog system that I'd like to work with multiple languages -- it sounds like in the worst-case scenario we can encompass all of that behavior by just having a toggle in settings next to the pronouns for specifically those languages: "use masculine word variants / use feminine word variants".

If even that; if I know someone uses "el", then "el mar" isn't a problem, and I know to use masculine word variants in other locations. Is there any scenario where knowing that a user/player uses "el" to refer to themselves wouldn't allow you to derive what gender-variant of another word to use when referring to them?

I guess if someone is using completely agender pronouns (I don't know what that would be in Spanish) I'd need to ask about feminine/masculine word variants, but I'm still struggling to see why I need to know their actual gender.
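
Roughly what I have in mind, as a sketch only: the stored setting is the word-variant preference, not a gender identity, and when nothing is set the system falls back to phrasing that avoids agreement altogether. `Variant`, `UserSettings`, and `welcome` are names I'm inventing here, and the Spanish strings are just one example message.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Variant(Enum):
    MASCULINE = "masculine"
    FEMININE = "feminine"


@dataclass
class UserSettings:
    display_name: str
    # None means the user was never asked or declined to choose.
    word_variant: Optional[Variant] = None


# Spanish greeting that agrees with the person being addressed.
WELCOME_ES = {
    Variant.MASCULINE: "Bienvenido, {name}",
    Variant.FEMININE: "Bienvenida, {name}",
}
# Rephrased so that no word has to agree with the addressee.
NEUTRAL_WELCOME_ES = "Te damos la bienvenida, {name}"


def welcome(user: UserSettings) -> str:
    template = WELCOME_ES.get(user.word_variant, NEUTRAL_WELCOME_ES)
    return template.format(name=user.display_name)


print(welcome(UserSettings("Ana", Variant.FEMININE)))
print(welcome(UserSettings("Alex")))
```

The neutral fallback only works where a message can be reworded; for strings that can't be, that is the edge case where I'd ask the user.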

I think you're confused. In Spanish and other languages, both pronouns and other words depend on grammatical gender. Saying that you can “guess based on the pronoun” doesn't make sense, because the pronoun depends on gender too.

It's true that inanimate objects have kind of arbitrary genders, but for specific people, they're based on their actual gender.

> because the pronoun depends on gender too.

But you can ask for the pronoun. If you know the pronoun, you know what variants of words to use, don't you? You don't need to know if the person is transgender or what their gender identity is; if they use `el`, you use masculine word variants to refer to them.

I'm not sure what I'm missing here; the only reason why knowing the gender identity would matter is if gendered word variants are allowed to mismatch pronouns in the language.

And even in that case, does the specific gender identity matter, or do we really only need to know whether someone wants to use masculine/feminine word variants when referring to them?

The easy example in English (though obviously borrowed) is the difference between fiancé and fiancée. Granted, this is a very domain-specific case, but it points out that yes, gender can modify words in English other than pronouns and titles.
Not just pronouns, but nouns and adjectives too, and in some languages verbs.
Technically yes, but for the vast majority of English speakers, pronouns supervene on gender.
Which means gender should just be 3 choices: he, she, they. That's it. If a singular pronoun is used, it's just changed to "The person".
Or in any meaningful way.
Unless you actually need to use gender for something useful, I'm guessing it is a GDPR violation to ask for it.
I'm not sure why this was voted down?

This is Personal Information and so the data processor must have a specific reason why they need it, and must take appropriate steps to secure it. They can't just capture it "In case we need it later" nor can they just store it haphazardly because it's not valuable to them, or refuse to update it because that's difficult.

It violates the principle of data minimization, but I'm not sure if it's, on its own, a violation that can result in an enforcement action.
In German, grammatical genders matter a lot. And, well, that extends to people as well.
I might be wrong but I don't think GDPR limits what information you can request, it's more about how the information is handled and the need for consent to collect it.
From typical GDPR guidance:

"If you can reasonably achieve the same purpose without the processing, you won’t have a lawful basis".

Your reasoning implies that pronouns shouldn't be used, and in a lot of cases names shouldn't be used either, because there are other ways to address the user.
GDPR requires a purpose for processing. The majority of other requirements GDPR imposes attach to the purpose, rather than the processing activity.

GDPR gives a lot of leeway in determining a purpose. But if you don't have a purpose, then the processing is unlawful regardless of literally anything else you've done or not done. Not even with valid consent of the data subject (because, guess what, consent attaches to the purpose).

So if you say "we need this data for addressing communications to the data subject," that's a purpose. On the other hand, if sex gets stored in a DB column and never used, that's a violation.

Separately, GDPR has a Data Minimization requirement: you collect data for a purpose, could you achieve that purpose with less data? This one has some flex to it. If the answer is "we could but not as well," then the data has a purpose. Maybe not a great purpose, but it's something.
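
One way to keep yourself honest about this in code, as a rough sketch and not a compliance test: make every collected field declare its purpose and flag the ones that have none. `Field`, `fields_without_purpose`, and the example form are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Field:
    name: str
    # Why the form collects this field; None means nobody wrote a purpose down.
    purpose: Optional[str]


def fields_without_purpose(fields: List[Field]) -> List[str]:
    """Return the names of fields whose purpose is missing or blank."""
    return [f.name for f in fields if not (f.purpose and f.purpose.strip())]


# Hypothetical signup form.
signup_form = [
    Field("email", "account login and password reset"),
    Field("salutation", "addressing communications to the data subject"),
    Field("sex", None),  # stored in a DB column and never used
]

print(fields_without_purpose(signup_form))
```

A check like this says nothing about whether a stated purpose is any good; it only catches the "column that is never used" case.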

I am not sure why you're explaining GDPR to me.