by foobarqux 453 days ago
This is just wrong. Languages follow certain inviolable rules, most notably hierarchical structure dependence. There are experiments (Moro, the subject "Chris") showing that humans don't process synthetic languages that violate these rules the same way as synthetic languages that obey them (specifically, it takes them longer to process and they use non-language parts of the brain to do so).

This does not mean that language in humans isn't probabilistic in nature. You seem to think that because there is structure then it must be rule based but that doesn't follow at all.

When a group of birds flies, each bird discovers/knows that flying just a little behind another will reduce the number of flaps it needs to fly. When nearly every bird does this, the flock forms an interesting shape.

'Birds fly in a V shape' is essentially what grammar is here - a useful fiction of the underlying reality. There is structure. There is meaning but there is no rule the birds are following to get there. No invisible V shape in the sky constraining bird flight.

First, there is no evidence of any probabilistic processing at the level of syntax in humans (it's irrelevant what computers can do).

Second, I didn't say that, in language, structure implies deterministic rules, I said that there is a deterministic rule that involves the structure of a sentence. Specifically, sentences are interpreted according to their parse tree, not the linear order of words.
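
A toy sketch of what "parse tree, not linear order" means, using the textbook auxiliary-fronting example for English yes/no questions. This is only an illustration, not the setup of the experiments mentioned above; the `Clause` type, the function names, and the example sentence are all made up here.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """Hypothetical, heavily simplified parse of a declarative sentence."""
    subject: list    # words; a nested list marks an embedded relative clause
    aux: str         # the main-clause auxiliary
    predicate: list  # words after the main-clause auxiliary


def flatten(items):
    """Turn a nested word list into the flat, linear word string."""
    out = []
    for item in items:
        if isinstance(item, list):
            out.extend(flatten(item))
        else:
            out.append(item)
    return out


def linear_question(words, aux="is"):
    """Linear rule: front the first occurrence of the auxiliary in the string."""
    i = words.index(aux)
    return [aux] + words[:i] + words[i + 1:]


def structural_question(clause):
    """Structure-dependent rule: front the main-clause auxiliary,
    no matter where it happens to sit in the linear order."""
    return [clause.aux] + flatten(clause.subject) + clause.predicate


sentence = Clause(
    subject=["the", "man", ["who", "is", "tall"]],
    aux="is",
    predicate=["happy"],
)
words = flatten(sentence.subject) + [sentence.aux] + sentence.predicate

linear = " ".join(linear_question(words))
structural = " ".join(structural_question(sentence))
print(linear)      # is the man who tall is happy   (not a sentence of English)
print(structural)  # is the man who is tall happy
```

The linear rule only sees a string of words, so it grabs the auxiliary inside the relative clause; the structural rule needs the tree to know which "is" belongs to the main clause.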

As for the birds analogy, the "rules" the birds follow actually do explain the V-shape that the flock forms. You make an observation ("V-shaped flock"), ask the question "why a V-shape and not some other shape?" and try to find an explanation (the relative bird positions make it easier to fly [because of XYZ]). In the case of language you observe that there is structure dependence, you ask why it's that way and not another (like linear order) and try to come up with an explanation. You are trying to suggest that the observation that language has structure dependence is like seeing an image of an object in a cloud formation: an imagined mental projection that doesn't have any meaningful underlying explanation. You could make the same argument for pretty much anything (e.g. the double-slit experiment is just projecting some mental patterns onto random behavior) and I don't think it's a serious argument in this case either.

>First, there is no evidence of any probabilistic processing at the level of syntax in humans (it's irrelevant what computers can do).

There is plenty of evidence to suggest this:

https://pubmed.ncbi.nlm.nih.gov/27135040/

https://pubmed.ncbi.nlm.nih.gov/25644408/

https://www.degruyter.com/document/doi/10.1515/9783110346916...

And research on syntactic surprisal—where more predictable syntactic structures are processed faster—shows a strong correlation between the probability of a syntactic continuation and reading times.
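
For concreteness, surprisal is usually defined as the negative log probability of a continuation given its context, so less likely continuations get higher values. A minimal sketch with invented numbers; the category names and probabilities are assumptions for illustration, not data from the papers linked above.

```python
import math

# Hypothetical probabilities for the next syntactic category after some
# sentence prefix. These numbers are invented, not taken from any corpus.
next_category_prob = {
    "main_verb": 0.80,
    "relative_clause": 0.15,
    "other": 0.05,
}


def surprisal_bits(p):
    """Surprisal of an event with probability p, in bits: -log2(p)."""
    return -math.log2(p)


surprisal = {cat: surprisal_bits(p) for cat, p in next_category_prob.items()}

# Print categories from least to most surprising.
for cat, s in sorted(surprisal.items(), key=lambda kv: kv[1]):
    print(f"{cat}: {s:.2f} bits")
```

The reading-time studies then ask whether per-word reading times track a quantity like this one.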

>In the case of language you observe that there is structure dependence, you ask why it's that way and not another (like linear order) and try to come up with an explanation. You are trying to suggest that the observation that language has structure dependence is like seeing an image of an object in a cloud formation: an imagined mental projection that doesn't have any meaningful underlying explanation.

No I'm suggesting that all you're doing here is cooking up some very nice fiction like Newton did when he proposed his model of gravity. Grammar does not even fit into rule based hierarchies all that well. That's why there are a million strange exceptions to almost every 'rule'. Exceptions that have no sensible explanations beyond, 'well this is just how it's used' because of course that's what happens when you try to break down an inherently probabilistic process into rigid rules.

> And research on syntactic surprisal—where more predictable syntactic structures are processed faster—shows a strong correlation between the probability of a syntactic continuation and reading times.

I'm not sure what this is supposed to show? If I can predict what you are going to say so what. I can predict you are going to pick something up too if you are looking at it and start moving your arm. So what?

The third paper looks like a similar argument. As far as I can tell, neither paper 1 nor paper 2 proposes a probabilistic model for language. Paper 1 talks about how certain language features are acquired faster with more exposure (that isn't inconsistent with a deterministic grammar). I believe paper 2 is the same.

> No I'm suggesting that all you're doing here is cooking up some very nice fiction like Newton did when he proposed his model of gravity.

Absolutely bonkers to describe Newton's model of gravity as "fiction". In that sense every scientific breakthrough is fiction: Bohr's model of the atom is fiction (because it didn't use quantum effects), Einstein's gravity will be fiction too when physics is unified with quantum gravity. No sane person uses the word "fiction" to describe any of this, it's just scientific refinement: we go from good models to better ones, patching up holes in our understanding, which is an unceasing process. It would be great if we could have a Newton-level "fictitious" breakthrough in language.

> Grammar does not even fit into rule based hierarchies all that well. That's why there are a million strange exceptions to almost every 'rule'. Exceptions that have no sensible explanations beyond, 'well this is just how it's used' because of course that's what happens when you try to break down an inherently probabilistic process into rigid rules.

No one is saying grammar has been solved, people are trying to figure out all the things that we don't understand.

>I'm not sure what this is supposed to show? If I can predict what you are going to say so what.

If the speed of your understanding varies with how frequent and predictable syntactic structures are then your understanding of syntax is a probabilistic process. A strictly non-probabilistic process would have a fixed, deterministic way of processing syntax, independent of how often a structure appears or how predictable it is.

>I can predict you are going to pick something up too if you are looking at it and start moving your arm. So what?

OK? This is very interesting. Do you seriously think this prediction right now isn't probabilistic? You estimate not from rigid rules but from past experience that it's likely I will pick it up. What if I push it off the table? You think that isn't possible? What if I grab the knife in my bag while you're distracted and stab you instead? Probability is the reason you picked that option over the myriad other options.

>Absolutely bonkers to describe Newton's model of gravity as "fiction". In that sense every scientific breakthrough is fiction: Bohr's model of the atom is fiction (because it didn't use quantum effects), Einstein's gravity will be fiction too when physics is unified with quantum gravity. No sane person uses the word "fiction" to describe any of this, it's just scientific refinement: we go from good models to better ones, patching up holes in our understanding, which is an unceasing process. It would be great if we could have a Newton-level "fictitious" breakthrough in language.

"All models are wrong. Some are useful" - George Box. There's nothing insane about calling a spade a spade. It is fiction and many academics do view it in such a light. It's useful fiction, but fiction nonetheless. And yes, Einstein's theory is more useful fiction. Grammar is a model of language. It is not language.

> If the speed of your understanding varies with how frequent and predictable syntactic structures are then your understanding of syntax is a probabilistic process.

In what sense? I don't see how it tells you anything if you have the sentence "The cat ___ " and you expect a verb like "went" but you could get a relative clause like "that caught the mouse". The sentence is interpreted deterministically, not by what might follow a fragment but by what actually does follow it. If you are more "surprised" by the latter, it doesn't tell you that the process is not deterministic.

> OK? This is very interesting. Do you seriously think this prediction right now isn't probabilistic? You estimate not from rigid rules but from past experience that it's likely I will pick it up. What if I push it off the table? You think that isn't possible. What if I grab the gun in my bag while you're distracted and shoot you instead?

I think you are confusing multiple things. I can predict actions and words, that doesn't mean sentence parsing/production is probabilistic (I'm not even sure exactly what a person might mean by that, especially with respect to production) nor does it mean arm movement is.

> "All models are wrong. Some are useful" - George Box. There's nothing insane about calling a spade a spade. It is fiction and many academics do view it in such a light. It's useful fiction, but fiction nonetheless. And yes, Einstein's theory is more useful fiction. Grammar is a model of language. It is not language.

I have no idea what you are saying: calling grammar a "fiction" was supposed to be a way to undermine it but now you are saying that it was some completely trivial statement that applies to the best science?

What exactly is wrong? The fact that grammars are very limited models of human languages? My key thesis is that human languages operate in a way that non-probabilistic models (i.e. grammars) can describe only in a very lossy way.

Sure, LLMs are also lossy but also much more scalable.

I've spent quite a lot of time with 90s/2000s papers on the topic, and I don't remember any model that generated human language better than "stochastic parrots" do.

As I said there are universal rules that human language processing follows (like hierarchical structure dependence); you can't have arbitrary syntax/grammars. It's true that science hasn't solved the main puzzles about how to characterize these rules.

The fact that statistical models are better predictors than the-"true"-characterization-that-we-haven't-figured-out-yet is completely irrelevant, just as it would be irrelevant if your deep-learning net was a better predictor of the weather: it wouldn't imply that the weather doesn't follow rules in physics, regardless of whether we knew what those rules were.

> As I said there are universal rules that human language processing follows (like hierarchical structure dependence); you can't have arbitrary syntax/grammars.

GP didn't say anything about grammars being arbitrary. In fact, his claim that grammars are models of languages would mean the complete opposite.

I don't think they have a consistent understanding of the word "grammar": they seem to use it in the grade-school sense (grammar for English, grammar for French) but then refer to Chomsky's universal grammar which is different (grammar rules that are common to all languages).

The main point of contention is their statement that "grammar follows language" which, in the Chomsky sense, is false: (universal) grammar/syntax describes the human language faculty (the internal language system) from which external languages (English, French, sign language) are derived, so (external) languages follow grammar.

Yes, I was a bit vague. If we are to be serious, then we would have to come up with definitions of grammar-based approaches vs stochastic approaches.

All I am saying is that grammars (as per Chomsky) or even high-school rule-based stuff are imperfect and narrow models of human languages. They might work locally, for a given sentence, but fall apart when applied to the problem at scale. They also (by definition) fail to capture both more subtle and more general complexities of languages.

And the universal grammar hypothesis is just that - a hypothesis. It might be convenient at times to think about languages in this way in certain contexts but that's about it.

Also, remember, this is Hacker News, and I am just a programmer who loves his programming/natural languages so I look at everything from a computational point of view.

All this comes down to is that language is not a solved problem. By the same logic why not just stop doing any research in physics and just put everything through a neural net which is going to give better predictions than the current best theories?

The fact that a deep-neural-net can predict the weather better than a physics-based model does not mean that the weather is not physics-based. Furthermore deep-neural-nets predict but don't explain while a physics-based model tries to explain (and consequently predict).

Moro is apparently a reference to Andrea Moro, but I can't find any writing of his titled 'The Subject "Chris"'.

It's a separate study done by someone else:

https://www.youtube.com/watch?v=Rgd8BnZ2-iw&t=6735s