Hacker News
by xigoi 1258 days ago
Knowing the gender is necessary to produce grammatical sentences in many languages.

That would be pronouns, not gender, right?
Kinda, but also no. It depends on the language. For example, in Polish you'll say "has done" as "zrobił" (he), "zrobiła" (she), "zrobiło" (baby, or it). You don't get to play fancy with pronouns there, because the verb itself is modified. (Also, to make things fun, objects are gendered too, so when speaking you need to know that the fridge is female and a table is male.)
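To make the Polish point concrete, here is a minimal Python sketch of picking the verb form from the subject's grammatical gender. The three forms are the ones quoted above; the `GrammaticalGender` enum and the `has_done` helper are hypothetical names made up for illustration, not part of any real library.

```python
from enum import Enum


class GrammaticalGender(Enum):
    """Hypothetical enum for the three singular genders mentioned above."""
    MASCULINE = "m"
    FEMININE = "f"
    NEUTER = "n"


# Past-tense forms of "zrobić" as quoted in the comment above.
ZROBIC_PAST = {
    GrammaticalGender.MASCULINE: "zrobił",
    GrammaticalGender.FEMININE: "zrobiła",
    GrammaticalGender.NEUTER: "zrobiło",
}


def has_done(subject: str, gender: GrammaticalGender) -> str:
    """Return 'SUBJECT has done' in Polish; the verb ending carries the gender."""
    return f"{subject} {ZROBIC_PAST[gender]}"


print(has_done("Anna", GrammaticalGender.FEMININE))
print(has_done("Jan", GrammaticalGender.MASCULINE))
```

No pronoun appears anywhere in the output, which is why swapping pronouns alone doesn't fix the sentence.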
No, because in some languages the forms of words used to describe people vary depending on whether the person is male or female. It's not just a case of picking the right pronouns; it's the gender agreement of the other parts of speech too.
Do you have an example? Most languages I've seen that do this seem to be deriving gendered word variants from the pronoun, at least as far as I can tell with my very limited experience.

Spanish has a lot of word variants based on feminine/masculine/plural subjects, but it's not really based on gender (otherwise inanimate objects wouldn't have those variants); as far as I can tell with my limited Spanish experience they're based on word agreement in the sentence with the pronouns or at most with masculine/feminine presentation.

Is there ever a scenario where "el gata" would be correct in Spanish? "Gato" agrees with the "el" pronoun; it's not based on the gender identity of the noun independently of the pronoun, is it? Are there other languages that work differently?

This also seems like a problem that doesn't really require knowledge of gender identity as much as "do you want us to use masculine/feminine variants of words when referring to you?" -- something that seems easy enough to guess based on the pronoun or (when loading up a language translation that needs more advanced logic) to just outright ask the user.

I kind of hate the software trend towards "we need to derive everything we're doing from first principles"; I feel like a lot of these problems could be solved by saying, "when we encounter an edge case we'll ask what to do, rather than doing data collection up-front that will be irrelevant for the majority of users."

> Do you have an example? Most languages I've seen that do this seem to be deriving gendered word variants from the pronoun, at least as far as I can tell with my very limited experience.

That's a weird way to put it. Words have grammatical gender. In some languages, like Spanish, there are articles like "el" or "la" that go with it, but other languages, like Russian, have no articles.

For things in Spanish the grammatical gender is fixed. Eg, a window is always a feminine word. For cats it of course depends on the cat.

You're not matching the word to the article, but the other way around. "ventana" is a feminine word, so there's always a "la" before it.

> Is there ever a scenario where, "el gata" would be correct in Spanish?

Not in that specific case, but there are rare nouns where both are valid, eg "el mar" and "la mar", and sometimes with a subtly different meaning used for poetic effect.

So genuine question, because I am in the middle of building a dialog system that I'd like to work with multiple languages -- it sounds like in the worst-case scenario we can encompass all of that behavior by just having a toggle in settings next to the pronouns for specifically those languages: "use masculine word variants / use feminine word variants".

If even that; if I know someone uses "el", then "el mar" isn't a problem, and I know to use masculine word variants in other locations. Is there any scenario where knowing that a user/player uses "el" to refer to themselves wouldn't allow you to derive what gender-variant of another word to use when referring to them?

I guess if someone is using completely agender pronouns (I don't know what that would be in Spanish) I'd need to ask about feminine/masculine word variants, but I'm still struggling to see why I need to know their actual gender.
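Roughly what I have in mind, as a minimal Python sketch: store the pronoun, derive the agreement class from it when possible, and only fall back to an explicit "masculine/feminine variants" setting when the pronoun doesn't determine it. Everything here (`PlayerSettings`, `agreement_for`, `welcome`, the lookup tables) is a hypothetical name for illustration, and the pronouns are spelled "él"/"ella" as an assumption about how the settings would store them.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from Spanish subject pronoun to agreement class.
PRONOUN_AGREEMENT = {"él": "masculine", "ella": "feminine"}

# One gendered string as an example: "welcome" agrees with the person addressed.
WELCOME = {"masculine": "Bienvenido", "feminine": "Bienvenida"}


@dataclass
class PlayerSettings:
    name: str
    pronoun: str
    # Explicit toggle, only asked for when the pronoun is not in the table.
    agreement_override: Optional[str] = None


def agreement_for(settings: PlayerSettings) -> Optional[str]:
    """Return 'masculine'/'feminine', or None if the user still has to be asked."""
    if settings.agreement_override is not None:
        return settings.agreement_override
    return PRONOUN_AGREEMENT.get(settings.pronoun)


def welcome(settings: PlayerSettings) -> str:
    agreement = agreement_for(settings)
    if agreement is None:
        # The edge case: prompt the user instead of guessing.
        raise LookupError("agreement unknown for this pronoun; ask the user")
    return f"{WELCOME[agreement]}, {settings.name}"


print(welcome(PlayerSettings("Lucía", "ella")))
print(welcome(PlayerSettings("Alex", "otro", agreement_override="masculine")))
```

Note that nothing in this model stores a gender identity; it only stores how the text should agree.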

Translation is actually a very tough problem, especially in games.

Take a sentence like "$PERSON picked up $ITEM".

Russian requires knowing the gender of $PERSON because the verb "to pick" is modified depending on the gender of the person. It also needs the accusative declension of the $ITEM. In Russian you don't just say "book" in every context possible like in English; the word "book" gets different endings depending on the context it's used in. A bit like how verbs vary in English: become, became.
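A minimal Python sketch of why plain string substitution isn't enough for that template. The verb forms are the past tense of "подобрать" (to pick up), and the tiny hand-written lexicon maps nominative to accusative; the names `PICKED_UP`, `ACCUSATIVE`, and `picked_up` are hypothetical, and a real game would need a proper morphology table from its translators rather than this toy dictionary.

```python
# Past-tense forms of "подобрать", keyed by the subject's grammatical gender.
PICKED_UP = {"m": "подобрал", "f": "подобрала", "n": "подобрало"}

# Toy lexicon: nominative form -> accusative form.
# "книга" (book) changes its ending; "меч" (sword) and "кольцо" (ring) do not.
ACCUSATIVE = {"книга": "книгу", "меч": "меч", "кольцо": "кольцо"}


def picked_up(person: str, gender: str, item: str) -> str:
    """Render '$PERSON picked up $ITEM' with verb agreement and item declension."""
    return f"{person} {PICKED_UP[gender]} {ACCUSATIVE[item]}"


print(picked_up("Анна", "f", "книга"))
print(picked_up("Иван", "m", "меч"))
```

The template needs two extra pieces of data per sentence (the person's grammatical gender and the item's declined form), neither of which the English string hints at.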

You also need to be very careful with things like word play -- it just doesn't translate right. Eg, there's a point in Monkey Island where an actual monkey is used as a wrench, because "monkey wrench". That just doesn't translate, at all.

And culture. Eg, things like honorifics and the general way people talk may not necessarily translate. For instance, apparently the famous Star Wars "Do not want" happened because in Mandarin shouting just "NO!" isn't a thing.

Point being, no, you can't translate simply and naively. Translating something like a game is a very serious job where you should actually talk to translators in advance, if possible, to figure out whether your intended design is going to be a huge pain or not, and whether something might not translate at all.

I think you're confused. In Spanish and other languages, both pronouns and other words depend on grammatical gender. Saying that you can “guess based on the pronoun” doesn't make sense, because the pronoun depends on gender too.

It's true that inanimate objects have kind of arbitrary genders, but for specific people, they're based on their actual gender.

> because the pronoun depends on gender too.

But you can ask for the pronoun. If you know the pronoun, you know what variants of words to use, don't you? You don't need to know if the person is transgender or what their gender identity is; if they use `el`, you use masculine word variants to refer to them.

I'm not sure what I'm missing here; the only reason why knowing the gender identity would matter is if gendered word variants are allowed to mismatch pronouns in the language.

And even in that case, does the specific gender identity matter, or do we really only need to know whether someone wants to use masculine/feminine word variants when referring to them?

Well, you could ask about the pronoun and use it to determine gender, but how is that different from asking for gender directly?
The easy example in English (though obviously borrowed) is the difference between fiancé and fiancée. Granted, this is a very domain-specific case, but it shows that yes, gender can modify words in English other than pronouns and titles.
Not just pronouns, but nouns and adjectives too, and in some languages verbs.
Technically yes, but for the vast majority of English speakers, pronouns supervene on gender.
Which means gender should just be three choices: he, she, they. That's it. If a singular pronoun is used, it's just changed to "the person".