| HN Mirror

by p1esk 1839 days ago

"instantly tell whether an arbitrary sentence is grammatical or not"

You do realize we can train a neural network to perform this task? It is a binary classification problem. When I look at a grammatically incorrect sentence I don't do much symbolic reasoning - it just feels "wrong" to me. It does not match any patterns I have in my head for grammatically correct sentences. There's a lot of pattern matching in our thinking process.

What's missing in the current generation of neural networks is efficient information storage and ability to recall that information (e.g. lookup) or update it (direct write).

3 comments

make3 1838 days ago

"You do realize we can train a neural network to perform this task"

I'm doing a master's in deep learning for NLP and I'm not sure we can. Language modelling can't do this because grammatical yet semantically implausible combinations of words get very low probability (that is, very high perplexity), the classic example being Noam Chomsky's "Colorless green ideas sleep furiously".
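To make the perplexity point concrete, here is a minimal sketch of a bigram language model with add-one smoothing. The toy corpus, the function names, and the scrambled comparison sentence are all assumptions made up for illustration; a real language model is far larger, so this only shows the shape of the problem, not its size.

```python
import math
from collections import Counter

# Toy corpus: an assumption for illustration, not data from the thread.
CORPUS = [
    "the dog sleeps quietly",
    "the cat sleeps quietly",
    "green ideas are popular",
    "the dog barks loudly",
]


def train_bigram(sentences):
    """Count unigrams (as bigram contexts) and bigrams over padded sentences."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return unigrams, bigrams, vocab


def perplexity(sentence, model):
    """Per-token perplexity under an add-one smoothed bigram model."""
    unigrams, bigrams, vocab = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    v = len(vocab) + 1  # one extra slot for unseen words
    log_prob = 0.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))


model = train_bigram(CORPUS)
ppl_seen = perplexity("the dog sleeps quietly", model)
ppl_chomsky = perplexity("colorless green ideas sleep furiously", model)
ppl_scrambled = perplexity("furiously sleep ideas green colorless", model)
print(ppl_seen, ppl_chomsky, ppl_scrambled)
```

In this toy setup the grammatical Chomsky sentence scores much worse than a sentence that resembles the corpus, and only slightly better than its ungrammatical scramble, so a perplexity threshold alone would not separate "grammatical" from "ungrammatical".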

What would be a training set for this? I assume we would first try to do parsing to extract the grammatical role of each word. Then what would be the dataset? A massive attempt at generating the set of all possible trees that are grammatical?

I guess we could take massive text datasets from reputable sources, extract their grammatical role trees, and learn from those. Generating negative examples with sufficient coverage would be very hard. Strict generative modelling without negative examples with good coverage would hit the same problem as language modelling: acceptable but unlikely examples would get high perplexity (low probability) despite being good.

It seems to me that in order to generate negative examples with good coverage, you would need a hand-written program with a definition of what grammaticality means, which would make the neural network pointless to begin with.

Seems like the experts agree with my take: https://linguistics.stackexchange.com/a/1108

p1esk 1838 days ago

Constructing a training dataset is a separate problem. You could potentially crowdsource enough negative examples. Once you have the dataset, a neural network would most likely be able to learn to classify sentences with reasonably good accuracy.
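As a rough sketch of what "binary classification over labeled sentences" looks like in code, the example below trains a perceptron on word-bigram features. Everything here is an assumption for illustration: the eight labeled sentences are invented stand-ins for crowdsourced data, the function names are hypothetical, and a single-layer perceptron replaces the neural network to keep the sketch dependency-free. It shows the training loop only and says nothing about how well such a model would generalize.

```python
from collections import defaultdict

# Invented stand-in for a crowdsourced dataset: +1 = grammatical, -1 = not.
TRAIN = [
    ("the dog sleeps", 1),
    ("the cat sleeps", 1),
    ("a dog barks", 1),
    ("the cat barks", 1),
    ("dog the sleeps", -1),
    ("sleeps cat the", -1),
    ("barks a dog", -1),
    ("cat barks the", -1),
]


def bigram_features(sentence):
    """Set of adjacent word pairs, with sentence boundary markers."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return set(zip(tokens[:-1], tokens[1:]))


def score(sentence, weights, bias):
    """Linear score: bias plus the weights of the bigrams that are present."""
    return bias + sum(weights.get(f, 0.0) for f in bigram_features(sentence))


def predict(sentence, weights, bias):
    return 1 if score(sentence, weights, bias) > 0 else -1


def train_perceptron(examples, max_epochs=100):
    """Classic perceptron updates; stops early after an epoch with no mistakes."""
    weights, bias = defaultdict(float), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for sentence, label in examples:
            if predict(sentence, weights, bias) != label:
                for f in bigram_features(sentence):
                    weights[f] += label
                bias += label
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


weights, bias = train_perceptron(TRAIN)
train_accuracy = sum(predict(s, weights, bias) == y for s, y in TRAIN) / len(TRAIN)
print(train_accuracy)
```

Fitting the training set is the easy part. Whether the learned weights say anything sensible about sentences built from unseen word pairs depends entirely on the coverage of the negative examples, which is the objection raised above.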

Unlike current DL models, humans have a world model (common sense) which is formed through an ability to create/update/lookup explicit rules/facts. Once we figure out how to incorporate that into a learning algorithm and/or a model architecture, AI will become a lot smarter.

alecst 1837 days ago

If we can train a computer to classify sentences as grammatical or not please let me know where. You’ll save the linguistics department a lot of money as they’ll no longer have to contact native speakers for this research.

xapata 1838 days ago

Humans require fewer examples to learn language rules. It's not clear that humans use the same learning model as a "deep net."

p1esk 1838 days ago

Humans also require a lot of examples to learn a language - years of everyday practice for a young human. Learning algorithms are not the same, but you still need to train a large neural network - lots of neurons with lots of connections (weights) - whether it's in your head or in a datacenter.

alecst 1837 days ago

There’s some evidence that humans have a Universal Grammar and learn through deletion. Humans also cannot learn just any old language, only a restricted class, whereas there’s no reason to think that an ML model would have that restriction.

I’d encourage you to read a little more about the topic with an open mind. You might learn something.
