Hacker News
by sidarape 3558 days ago
> There is nothing about: part of speech categories, relative clauses, morphology, affixes or compound words, the theta criterion, the content/function distinction, verb tenses, agreement, or anything else related to actual linguistic phenomena.

When you learned your native tongue, you didn't need to know all of these. You just learned. So, maybe the problem IS about math and algorithms instead of linguistics.

2 comments

"When you learned your native tongue, you didn't need to know all of these. You just learned."

Learning your native language is not at all the same task as translating between languages. People who do translation are usually quite knowledgeable about grammar. This is particularly true of people who pick up languages later in life.

Even if you know multiple languages "natively", it's often difficult to translate accurately without thinking about grammar. Or, for that matter, to speak/write your own language with any degree of competency -- we study grammar in grade school for a reason.

> we study grammar in grade school for a reason

Parsey McParseface is probably better at grammar than 99% of people.

You certainly didn't need to know the names for them. That doesn't necessarily disqualify them from a role in the learning of language.

That said, you still may be right. But at the moment, it's a question of dogma, not empiricism.

Who's to say that neural nets aren't implicitly learning these concepts as they train? In fact, I'd be very surprised if nothing about the networks' internal state corresponded to linguists' models of how language works.

Focusing on the math rather than the linguistics doesn't mean that these concepts play no role in how the models learn language. Just as nobody is told about parts of speech as they're acquiring their native tongue, we don't necessarily need to explicitly tell machine learning models about parts of speech in order for them to learn how to use that aspect of language correctly.