Hacker News
by Tainnor 2141 days ago
You're confusing two separate things: linguistic complexity and ambiguity.

Linguistic complexity is hard to measure, but it's not hard to show that, at least morphologically, English is less complex than many other languages.

This doesn't necessarily mean that English is more ambiguous, though. Unlike German, English typically has very rigid word order, so in the context of a sentence, you'll know if a specific word is a noun or a verb.

The problem here is that many NLP models inadequately capture syntactic structure.


Sorry if it was confusing; I wanted to mention both a) lexical ambiguities and b) syntactic ambiguities as possible obstacles for NLP.

> Unlike German, English typically has very rigid word order, so in the context of a sentence, you'll know if a specific word is a noun or a verb.

So you say you are able to guess from the word order what part of speech a particular word is. But with German you hardly need all this guessing.

If you compare two marginal examples: - English "time flies like an arrow" - German "Wenn Fliegen fliegen hinter Fliegen..."

you'll find out the English one has way more possible interpretations.

> So you say you are able to guess from the word order what part of speech a particular word is. But with German you hardly need all this guessing.

Not really. It's not about guessing: in English, the part of speech really is mostly determined by its syntactic structure.

> If you compare two marginal examples: - English "time flies like an arrow" - German "Wenn Fliegen fliegen hinter Fliegen..."

Not sure what you're trying to say here. The English example is ambiguous, yes (though only strictly grammatically; semantically the meaning is clear, unless you're using it in the phrase "time flies like an arrow, fruit flies like a banana", which is meant as a linguistic joke). It's also very easy to come up with examples of phrases or sentences that are ambiguous in German, or in any language for that matter. Here are some fun examples:

"Er liest das Buch seiner Schwester vor" (could either mean "he's reading the book to his sister" or "he's reading his sister's book to someone")

"der weiße Schimmel" ("white mould", or "white horse")

"wilde Tiere jagen" ("to hunt wild animals", or "wild animals are hunting")

and don't even get me started on the ambiguity of compound words or phrases with a genitive, where there are often tons of potential interpretations depending on the intended relationship between head and dependent noun.

And also the German example you gave (fully: "Wenn Fliegen hinter Fliegen fliegen, fliegen Fliegen Fliegen nach", or "if flies fly behind flies, flies fly after flies") is a) another joke sentence nobody uses in practice, and b) is exactly a case where you can only distinguish the part of speech (and the grammatical case) of a word from the syntactic structure and not from its morphology, something you claimed doesn't happen in German, but here it clearly does.

Look, you may make a case that it's easier for English sentences to be ambiguous than for some other languages, but I would need to see some good data before I believed that claim, because it's just not something that is immediately obvious.

I still think you're missing my point, although I am impressed by your German skills ("der Schimmel" is BTW just a homonym, it's hardly related to the topic of syntactic ambiguity).

> is a) another joke sentence nobody uses in practice, and b) is exactly a case where you can only distinguish the part of speech (and the grammatical case) of a word from the syntactic structure and not from its morphology, something you claimed doesn't happen in German, but here it clearly does.

I didn't make such a strong claim. All I wanted to say is that in German, syntactic ambiguities are much less of a problem than in English. I brought two pieces of anecdotal evidence to let you compare the possible ambiguities in both languages; these two are indeed nothing but jokes.

But let's take a closer look at them once again.

a) "Time flies like an arrow": the word "time" can be 1) a noun 2) an adjective 3) a verb in declarative form 4) a verb in imperative form. This gives us a factor of 4 on the very first word of the sentence.

b) "Wenn Fliegen hinter Fliegen fliegen": ambiguity exists only between "fliegen" as a verb and "Fliegen" as a plural noun, so the "ambiguity factor" of the word "f/Fliegen" is just 2.
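
The per-word "ambiguity factor" used in a) and b) can be written down as a small sketch. The tag sets for "time" and "f/Fliegen" are the ones listed above; the tag sets for every other word, and the names `POS_TAGS`, `ambiguity_factor`, and `tag_sequence_upper_bound`, are assumptions added for illustration. Note that the product over a sentence is only an upper bound on tag sequences, most of which are not grammatical readings.

```python
from math import prod

POS_TAGS = {
    # Tag sets taken from the comment above.
    "time": {"noun", "adjective", "verb (declarative)", "verb (imperative)"},
    "fliegen": {"verb", "plural noun"},
    # Assumed tag sets for the remaining words (not from the thread).
    "flies": {"noun", "verb"},
    "like": {"verb", "preposition"},
    "an": {"determiner"},
    "arrow": {"noun"},
    "wenn": {"conjunction"},
    "hinter": {"preposition"},
}


def ambiguity_factor(word):
    """Number of parts of speech listed for `word`, ignoring capitalisation.

    Words missing from POS_TAGS are treated as unambiguous (factor 1),
    which is a simplifying assumption.
    """
    return len(POS_TAGS.get(word.lower(), {"unknown"}))


def tag_sequence_upper_bound(sentence):
    """Product of per-word factors: an upper bound on possible tag sequences."""
    return prod(ambiguity_factor(word) for word in sentence.split())


if __name__ == "__main__":
    for sentence in ("time flies like an arrow",
                     "Wenn Fliegen hinter Fliegen fliegen"):
        print(sentence, "->", tag_sequence_upper_bound(sentence))
```

With these assumed tag sets the English sentence gets an upper bound of 16 and the German clause 8; changing the lexicon changes both numbers, so this illustrates the arithmetic rather than settling the comparison.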

> but I would need to see some good data before I believed that claim, because it's just not something that is immediately obvious.

Fair enough.