Hacker News
by slurgfest 4848 days ago
As English, this stuff is totally incomprehensible and unusable. Absolutely nothing is conveyed to actual human English speakers by saying the word 'buffalo' 400 times in a row.

If it is 'grammatical' then it is grammatical by virtue of conforming to some idealized grammar. But when this grammar is so far off not just from anything people say, but anything they can actually understand, it really only means that the idea that this grammar models real English has been reduced to total absurdity.

7 comments

They're tricks, but they're far from incomprehensible. In my experience, both with this sentence and the buffalo one, there's a certain mental click when one "gets it", after which the sentence makes sense and one can "feel" its grammatical structure. It's a curious and rather Chomskyan experience. Before that, of course, the notion that such a string of words might mean anything is absurd. "Getting it" is much like those 3D visual puzzles where at first you see only a noise pattern, but when you hold it at the right distance and let your eyes refocus a certain way, a picture leaps out at you.
Don't take it too literally. It's meant as a kind of linguistic koan to illustrate the concept of prosody.

http://en.wikipedia.org/wiki/Prosody_(linguistics)

Consider the difference in how one says

    He ate that
vs

    He ate that?
That's prosody. The difference in meaning isn't a consequence of the presence of the question mark, although that's what people think of as "grammatical." The presence of the question mark and the difference in meaning are both consequences of the prosodic differences between the two sentences.

This shows one of many challenges inherent in computational linguistics: "the written word" only encapsulates a small part of what it means to "speak English" or "understand English."

As a nice benefit, it makes the sort of grammarian who obsesses over the written word look (rightly) like they're missing the forest for the trees.

And further to that, there is a difference between "He ate that?" (He did what to that?) and "He ate that?" (He ate what?)
While you have a point in there, some things are still missing. In my opinion the buffalo sentence doesn't have enough prosody to ever make the verbal version intelligible without explanation. The sentence in the OP does, but it's also missing mandatory punctuation. Punctuation conveys a solid fraction of the information prosody does, and sometimes even contains information prosody doesn't.
Actually, I find it perfectly comprehensible when said aloud with the right emphasis and pacing - it's only difficult to understand when written down with no punctuation.
You may also be interested in http://en.wikipedia.org/wiki/Garden_path_sentence, which has some sentences that are clearly grammatical but not what they seem at first glance.
I'm not convinced this one is even grammatical. At minimum, omitting the semicolon should make it a run-on sentence.
The trick to all of these sentences is to omit otherwise necessary punctuation. The rare exceptions are sentences describing recursive concepts. For instance, if you have a radar detector, the police will catch you because they have a radar detector detector, which makes it imperative that you own a radar detector detector detector.

Here's another example: who polices the police? If there were any one agency in charge of that, certainly we would call them the police police. But who polices the police police? Clearly, the police police police police the police police.
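
The recursive pattern in both examples is mechanical enough to sketch in a few lines of Python. This is only an illustration: the function names (`radar_device`, `agency`, `who_polices`) and the "level" numbering are made up here, not anything standard.

```python
def radar_device(level: int) -> str:
    """Level 0 is a radar detector; level n detects the level n-1 device."""
    return "radar detector" + " detector" * level


def agency(level: int) -> str:
    """Level 0 is 'police'; level n is the agency that polices level n-1."""
    return " ".join(["police"] * (level + 1))


def who_polices(level: int) -> str:
    """Sentence naming the agency that polices the level-`level` agency."""
    # Subject is one level up, then the verb 'police', then the object.
    return f"the {agency(level + 1)} police the {agency(level)}"


print(radar_device(2))
print(who_polices(1))
```

With `level=1`, `who_polices` rebuilds the sentence above word for word: three words of subject, one verb, two words of object.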

There may be a lack of pronunciation but written English has clear rules demanding punctuation. The use of quoted words without quotation marks in the OP 'sentence' is ridiculous.

Buffalo buffalo and radar detector detector detector are much more valid.

Ugh, I meant punctuation. (Edited my comment to reflect that as well, but let the record stand that it originally had "pronunciation".)

My brain puts the words "punctuation" and "pronunciation" in the same hash bucket so I can never get the right one out reliably.

Oh. Well the Buffalo sentence doesn't omit punctuation, yet manages to be perfectly confusing. :)
The article title is ungrammatical without correct punctuation. With correct punctuation, it's fine - and it is, after all, talking directly about an error in the very grammatical construct it is highlighting. It's a perfectly natural sentence that could easily come about in normal discussion.

The 'buffalo' one is just nonsensical - the word 'buffalo' just isn't used that way, and even with punctuation, needs to be separately explained for people to understand it - even if they are aware of the regional dialect that uses the word 'buffalo' as a verb.

If English was a workable language, English majors would have nothing to base their theses on. The 'had had had' exercise highlights the absurd nature of English. This philosophy on English is why it stopped evolving after its peak - that is, after the death of the world's greatest playwright, Shakespeare (who wrote phonetically, may I add).

On the other hand it does give an insight into syntax trees and parsing.

Every language has absurdities. Genders for non-gendered things is one example.

What's really absurd about English is the contempt for diacritical marks. Other languages give you a clue as to how the word is pronounced, whereas in English, if I write 'wind', you don't know if I'm talking about air blowing or charging a mechanical clock unless you have context - which may come later in the sentence.

> absurdities: Genders for non-gendered things is one example.

This. I never understood how e.g. Spanish speakers think that a door is female or a clock is male. I mean, it's not like there are any body parts you can examine for a definitive answer, or clothing and mannerisms which let you make a pretty good guess...I never really got a satisfactory answer other than "it's usually -o or -a, but not always; really, you just have to memorize it." Seriously...WTF?

> diacritical marks

Other European languages love them. To me as an English speaker, they look like misplaced inkspots or dirt on my monitor. I never had any class in school or college that taught what they mean [1]. I blithely type "fiancee," "naive" and "Geiger-Muller," since I don't want to get out Character Map or whatever the Linux equivalent is [2], and I'm not really sure which marks to use or where to put them. I pretty much pretend they don't exist, unless they cause compiler errors [3], in which case I terminate them with extreme prejudice.

[1] I did once learn that an overline (a line above a character; I don't know if that's actually what it's called) means a long vowel sound, and an upside-down e means schwa. I haven't seen either of these used outside dictionary pronunciation keys.

[2] In my current operating system, Linux Mint, I don't even know how to get those characters other than copy-and-pasting the Unicode text somebody else has put on a webpage, or spending an hour or two sitting down with the RFCs that specify UTF-8 encoding and a hex editor. The only reason I know on Windows is that I eventually stumbled on Character Map by curiously exploring all the menus. This may give you a clue how often I deal with international text.

[3] http://news.ycombinator.com/item?id=5316875

"I never understood how e.g. Spanish speakers think that a door is female or a clock is male. I mean, it's not like there are any body parts you can examine for a definitive answer..."

But "gender," as a term in grammar, just means an arbitrary classification of words for grammatical purposes. It's only a few languages which, absurdly, map these grammatical tags to biological or cultural sex distinctions. English compounds the absurdity by retaining this grammatical distinction only in this one bizarre case.

(My favourite example of the arbitrary nature of grammatical gender is Dutch. Dutch has two genders, common and neuter, which, if you wanted to map them to sex, would mean "either male or female" and "neither male nor female", respectively.)

English used to have diacritical marks, specifically the diaeresis, as can be seen in names like Zoë or in the surname Brontë (as in the family of English authors).

The New Yorker loves the diaeresis to this day, and frequently uses it in words like "coöperate".

http://en.wikipedia.org/wiki/Diaeresis_(diacritic)

I made the realisation when I went to Vietnam, where basically you write a sentence then shake a bagful of diacritics over it. I thought "Heh, English doesn't require any of that nonsense... hey... wait a minute..."
> What's really absurd about English is the contempt for diacritical marks.

Not just English, but also Chinese. Pinyin was originally specified as having marks over the vowels:

  āáǎà ēéěè īíǐì ōóǒò ūúǔù üǖǘǚǜ
But although you see pinyin used a lot in mainland China alongside Chinese characters, you virtually never see those diacritics, just:

  a e i o u v
where v is used instead of ü.
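
For anyone who wants to do that conversion programmatically, here is a rough Python sketch using the standard `unicodedata` module. It assumes the four tone marks (macron, acute, caron, grave) are the only diacritics you want to drop; the function name `strip_tones` is just something picked for this example.

```python
import unicodedata

# Combining marks for the four pinyin tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin: str) -> str:
    """Drop tone marks and write u-with-diaeresis as v (assumed convention)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", pinyin)
    no_tones = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # NFC recombines u + diaeresis into one character so it can be swapped.
    recomposed = unicodedata.normalize("NFC", no_tones)
    return recomposed.replace("\u00fc", "v").replace("\u00dc", "V")


print(strip_tones("n\u01d0 h\u01ceo"))  # ni hao
print(strip_tones("n\u01da"))           # nv
```

The diaeresis has to survive the tone-stripping step, which is why the tone marks are listed explicitly instead of removing every combining character.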