by gpvos 748 days ago
As far as I know it's not English or any Western entity that has grouped the Chinese languages together as one, but the Chinese government, for political reasons. Western linguists recognize the variants of "Chinese" as different languages.

There's a saying in linguistics that "a language is a dialect with an army and a navy" [0]. Linguists recognize that where you draw boundaries between languages is essentially arbitrary, even more so than boundaries between biological species. It tends to be that a language is a language if and only if some sovereign state declares it to be so, otherwise it's a dialect.

This is also how you get Portuguese as a distinct language from Spanish even though the two are more mutually intelligible than Scots (a "dialect") and American English are. Portugal has the sovereign government to back up its claim to having its own language where Scotland does not.

[0] https://en.wikipedia.org/wiki/A_language_is_a_dialect_with_a...

I like the expression, but the example of Portuguese/Spanish is absurd IMO. As a Portuguese speaker, the amount of effort required to communicate with Spanish speakers is very, very high, to the point where I avoid trying at all costs. Here in Texas, it is almost always more effective for my family to communicate with Spanish speakers using very broken English and hand gestures on both sides than trying to get any Portuguese-Spanish mutual intelligibility to work.
But the comparison was to Scots, which is (sometimes, not universally) considered a dialect of English rather than a separate language, but is hard for standard English speakers to understand. It's not just English with a Scottish accent. I have no idea how Portuguese feels to Spanish speakers or vice versa, but here's an example of modern Scots from Wikipedia. I'm curious how it compares.

(Edit: And here's a spoken example - https://youtu.be/am1MCJsEGYA)

> Noo the nativitie o' Jesus Christ was this gate: whan his mither Mary was mairry't till Joseph, 'or they cam thegither, she was fund wi' bairn o' the Holie Spirit. Than her guidman, Joseph, bein an upricht man, and no desirin her name sud be i' the mooth o' the public, was ettlin to pit her awa' hidlins.

>But as he had thir things in his mind, see! an Angel o' the Lord appear't to him by a dream, sayin, "Joseph, son o' Dauvid, binna feared to tak till ye yere wife, Mary; for that whilk is begotten in her is by the Holie Spirit.

> "And she sall bring forth a son, and ye sal ca' his name Jesus; for he sal save his folk frae their sins."

> Noo, a' this was dune, that it micht come to pass what was said by the Lord throwe the prophet,

>"Tak tent! a maiden sal be wi' bairn, and sal bring forth a son; and they wull ca' his name Emmanuel," whilk is translatit, "God wi' us." Sae Joseph, comin oot o' his sleep, did as the Angel had bidden him, and took till him his wife.

> And leev'd in continence wi' her till she had brocht forth her firstborn son; and ca'd his name Jesus.

As a native speaker of English with conversational ability in Spanish, I would describe both Scots and Portuguese as separate languages. Portuguese feels like it has as much in common with Spanish as Italian or French to me, and I can't remotely carry on a conversation in Portuguese. (Or Scots really, though with the somewhat mutual intelligibility I can speak English or Spanish and maybe that's workable, but I'm definitely not going to understand the Portuguese.)
Which dialect of Portuguese are you referring to?

I'm a fluent second-language Spanish speaker and have had a lot of success communicating with native Portuguese speakers, but only Portugal Portuguese and various African Portugueses. I can't understand Brazilian Portuguese at all.

I am Italian and whenever I go to Spain I usually don't really need to speak English because the languages are close enough that you can get by just by knowing a handful of basic words (and the Spaniards I meet usually prefer it that way). This is both a blessing and a curse; all Italians I met living in Spain (and vice versa, all Spanish speakers I met in Italy) tend to have a hard time learning the other language "properly" because the threshold for being understood is extremely low. If, perchance, you speak Italian with Spanish grammar, people will still understand you perfectly.

Given that Castilian and Portuguese are even closer (both Western Romance, part of a linguistic continuum, ...) I honestly find that very hard to believe. I am only familiar with the European variants, though. Maybe the issues you faced are due to how the Latin American variants have diverged significantly over the years?

The big difference between ES and PT is the accent/pronunciation of letters and matching words. Secondary is the differing vocabulary. But a lot of the differing words are still understood as archaic/uncommon alternatives.

(see shoen's post below.)

So, if you learn the accent of the other language, all of a sudden a large portion of the language is unlocked. This happened to me, almost like a light switch.

I don't have a lot of experience with Italian but it seems like the pronunciation is closer to Spanish.

> I don't have a lot of experience with Italian but it seems like the pronunciation is closer to Spanish.

Yeah, the phonetics are very close. Castilian has more fricative sounds like [ð], [θ], [x] and [β] and no open vowels, but that's it.

I speak fluent second-language Spanish and have had next to no difficulty communicating with native Portuguese speakers from Mozambique, Cabo Verde, and Portugal. What variety of Portuguese do you speak?

I'll concede that it's possible that I actually have an advantage as a second-language speaker, since my Spanish is probably slower than a native's and when I'm listening I'm already doing more work than a native is accustomed to.

Brazilian Portuguese has some phonological differences that I think confuse people in both directions more than other varieties of Portuguese, like the /tʃ/ and /dʒ/ for <t> and <d> in various contexts. For example a Spanish speaker would probably have a hard time recognizing that Brazilian Portuguese /dʒi'abu/ is cognate with Spanish <diablo>. A Brazilian Portuguese speaker who was less familiar with Spanish might similarly have a hard time recognizing /ˈdjablo/ as cognate with Portuguese <diabo> 'devil'.

Or Brazilian /'sedʒi/ is cognate with Spanish <sed> 'thirst'. A Spanish speaker will have to know to effectively ignore the /ʒi/ in order to recognize the word easily!

Maybe more extreme, Brazilian /'hedʒi/ (written <rede>) is cognate with Spanish <red> 'net, network'.

You might also be familiar with a greater variety of Spanish pronunciations as a non-native speaker... if you know Argentine /'ʃubja/ and /'ʃabe/, then you have a better chance to recognize Brazilian /'ʃuvɐ/ and /'ʃavi/ ('rain' and 'key', respectively).

Yeah, I suspect that OP speaks Brazilian Portuguese but I didn't want to assume.

I should have specified in my original post, but I only meant that Portugal Portuguese (and at least a few of the African varieties that are still very close to Portugal's) are mutually intelligible with Spanish. Which actually just further illustrates the complexity of categorizing speech into discrete languages...

Interestingly, it makes Brazilian much easier to understand for (many) Italians and Romanians.
a friend of mine who grew up in argentina went for an interview at a university in brasil.

they reported that on the first day, portuguese was just gibberish. on the second day they realized they could read a solid chunk of a portuguese newspaper. on the third day they felt they were beginning to understand what people were saying to them.

obviously, YMMV (and does).

This is really what "mutually intelligible" means, I'd say: that the languages are so close you can sort of work it out without explicit instruction. You still need some experience with the other language - and quite often there'll be a geographical and cultural proximity that means almost all speakers have that.

I grew up in a Scandinavian country and visited the others a lot when I was young, and I find I understand most of what I hear in the other languages, but it's quite common for my peers who don't have that experience to understand nothing.

Probably not as absurd as you think. I reckon if you dropped an American in a random town in Scotland (or even a northern English town, for that matter), they would also need to use very broken English and hand gestures to communicate. Glaswegian or Geordie is nearly incomprehensible to RP-speaking Brits, let alone to an American whose only exposure to Scottish is Mel Gibson as William Wallace.
> Here in Texas ...

I may be way off here, and happy to be corrected. My experience is that Texas-Spanish is difficult to use in Spain, and I would guess the inverse is true. From that I would deduce that Portuguese-Spanish is a non-starter in the state.

I know a very limited amount from having grown up and played soccer in the "Mexican" rec leagues in TX. While traveling in Spain, English was perfectly fine in cities. But on day trips to smaller towns/villages, people had more trouble understanding my attempts to communicate with the basic Texas-Spanish I had picked up than they did the hand gestures and a single English word here and there. I understood next to nothing in Portugal (it might as well have been Dutch to my ears; I had no idea until now that Portuguese and Spanish are kinda similar in the way Spanish and Italian are). Of course, this could be that I'm simply horrible at Spanish. But I have heard Texas-Spanish is weird even for Baja-California-Spanish speakers.

Spain Spanish and <pick-latam-country> Spanish are the same language with very different vocabulary.

(Well, not quite, because Spain Spanish has loismo and whatnot that <pick-latam-country> Spanish almost certainly does not, and there are other variations as well, like Argentine Spanish having very different imperatives, Argentine and Colombian voseo vs. tuteo everywhere else, etc.)

Given the context, you'd probably have an easier time talking Portuguese with someone from Vigo than, say, Juarez, but even then, that might depend on you not having a Brazilian dialect...

After all, Spanish & Brazilian speakers in the new world have their own dialects (not languages).

I don't think the intention was to paint Spanish and Portuguese as incredibly similar, only to say that they're more similar than Scots and English, which are still considered the same language.
A classic example of this is Hindi and Urdu -- the two languages are largely mutually intelligible when spoken, which is the main criterion for being the same language, but are written with different scripts and of course used in separate and adversarial states.
No, China literally has multiple dialect continuums[0]. It is a case of the "a language is..." saying, but the other way around.

0: https://en.wikipedia.org/wiki/Dialect_continuum#Chinese

I don't think we actually disagree.

I'm saying that because China has one single army and navy and at the same time a huge narrative wrapped up in the idea that it's all one China, those "dialects" don't get to be languages because the army and the navy say otherwise.

"A language is a dialect with an army and a navy" implies its corollary, which is that "a dialect is a language without an army or a navy".

(In fact, that's likely what was originally meant by the person who coined the phrase—he was a specialist in Yiddish linguistics writing during WW2.)

It's perhaps how it is seen and used in English. But in China, Chinese languages tend to be referred to as such, with Mandarin referred to as the "common language", etc., though the character used has an oral connotation.
I think our disagreement is over whether there can be fault lines of mutual intelligibility between dialects. If linguists and speakers of Chinese languages are to be believed (no particular reason not to), there are such fault lines in China.
I don't disagree that there can be fault lines of mutual intelligibility between dialects. I'm not even commenting on how we define dialects at all—all I'm saying is that the distinction between a dialect and a language is an arbitrary one that is made for political reasons more than linguistic ones, and that's something that even the sources for that Wikipedia page agree with me on. For example (emphasis added) [0]:

> The debate as to whether or not the varieties of speech used by the Chinese should be classified as separate languages or dialects of one language is a difficult one, with reasons on both sides. The main criterion according to which some scholars tend to use the English term 'language' for the varieties of Chinese, is the lack of mutual intelligibility between the various forms of speech, the fact that the "various 'Chinese dialects' are as diverse as the several Romance languages". On the other hand, since there are no extra-linguistic (political, historical, geographical, cultural) reasons to treat these dialects as individual languages, the tradition is to call them dialects of Chinese.

In the absence of nation states I suspect that we'd mostly talk about dialects and dialect continuums. Discrete languages are only really relevant as a concept because of the non-linguistic ties that bind a nation together.

[0] https://books.google.com/books?id=lCgnrA7Ke3QC&pg=PA1&source...

Reminds me of when I went on a business trip of several weeks to Sweden from California. The Swedes spoke English reasonably well. I then went to Scotland for vacation and had a much harder time understanding their dialect than I did with the Swedes.
For me as a non-native speaker of English and German, this is quite normal - I mostly have an easier time understanding other non-native speakers since they usually use "international" dialect/pidgin, speak slower and usually articulate more distinctly.
I could believe this is true if you’re only comparing languages that have the same root or parent language such as Latin languages, etc.

But I don’t see how anyone could describe the difference between Chinese and English as arbitrary or as two dialects even if the apocalyptic collapse of all major nations which spoke such languages occurred tomorrow.

My understanding is that there’s something called lexical similarity and if it’s over a certain percentage it’s a dialect.

What's arbitrary isn't that languages are different from each other, what's arbitrary is where you draw the line. When you take two languages on opposite sides of the world they're unquestionably different languages. But as you transition slowly from one language to another, how many languages you spin off and which dialects fall under which languages is arbitrary.

> My understanding is that there’s something called lexical similarity and if it’s over a certain percentage it’s a dialect.

Even if you tried to use a method like this to draw lines, it requires you to pick a "center" dialect that you compare all other prospective dialects/languages to. Which dialect you pick as your "center" dialect will determine which dialects end up under your umbrella language, and picking a different center would yield very different results. Which language you pick as your center is inherently a political question, one which would be settled by a sovereign state.
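
To make the "center" problem concrete, here's a toy sketch in Python. Everything in it is made up for illustration: the five dialects A through E, the rule that similarity drops 15 points per step along the chain, and the 70% cut-off are all assumptions, not real linguistic data or an actual method anyone uses.

```python
# Hypothetical dialect chain: neighbours are similar, distant ends are not.
DIALECTS = ["A", "B", "C", "D", "E"]


def similarity(i: int, j: int) -> int:
    """Made-up lexical similarity in percent: 15 points lost per step apart."""
    return 100 - 15 * abs(i - j)


def umbrella(center: str, threshold: int = 70) -> list[str]:
    """Dialects that count as 'the same language' as the chosen center."""
    c = DIALECTS.index(center)
    return [d for k, d in enumerate(DIALECTS) if similarity(c, k) >= threshold]


# Same data, same cut-off, different center: different "languages".
for center in ("A", "C", "E"):
    print(center, umbrella(center))
```

With C as the center, all five dialects land in one language. With A or E as the center, the chain splits, and A and E end up in different languages even though nothing about the speech itself changed.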

And aside from that problem, lexical similarity is not used to define languages. All it measures is how similar word sets are, and language variations are way more complicated than just vocabulary. No serious linguist would ever try to use a single metric like that to draw lines between languages (and again, most serious linguists aren't actually interested in drawing general-purpose lines because they understand that the lines are not real).

How does that work with e.g. French Creole, which has French, Caribbean, and English in it? What if this feels like a dialect but the percentage of any given parent is less than your cut-off percentage? You make the rule sound very easy to interpret, but I think the general principle is that language classification is nuanced, and the irony of the "navy and army" language requirement is that it kind of has nothing to do with the actual language spoken.
The "navy and army" argument is usually employed when the question arises whether something is a dialect or a separate language. IMHO such Creoles should also be classified as languages, with the caveat of dialect continuums.

Creole is a weird case IMHO because English itself is pretty much a creole between Old English, Norman French, Norse, and some Gaelic and Pictish languages.

On a recent holiday to Hong Kong, I noted the Metro announcements were in Cantonese, Mandarin and then English. I also noted that apologies and other minor social skirmishes and interactions, e.g. after bumping into someone, were mostly delivered in English.

When in Rome, speak English: it's the French language.

Hong Kong is a special case where English still has higher prestige to begin with. Locals sometimes code-switch to English even if everyone present would understand Cantonese.
That sounds like a carefully worded response and I will respect it as such.

Language is important in so many ways - it often defines who you are, regardless of where you are.

Franco-Germanic, por favor.
Here in China, linguists consider the different Western "languages" to be dialects, and believe that the Western governments, for political reasons, make people think they speak different languages than their neighbors, so that they cannot unite.

I'm just joking, but what you say is as absurd as my joke. Western linguists don't consider the dialects different languages. If they do, they do it for political reasons. Accept that there are different ways of thinking and the real world never has to submit to how you define concepts like "a language", and not everything China surprises you with has something to do with politics.

Edit: I realized that my joke was closer to reality than parent's comment: https://en.wikipedia.org/wiki/Serbo-Croatian

Open WALS or Glottolog or any other language catalog and you will see they categorize Chinese as a language family, consisting of multiple languages like Mandarin, Wu/Hui, Min, etc. You are free to disagree of course, but "Western linguists don't consider the dialects different languages" is simply not true to my knowledge.
> Western linguists don't consider the dialects different languages. If they do, they do it for political reasons.

Western linguists generally view the concept of a "language" as being a political one more than a linguistic one, and so rather than quibble about definitions they just use whichever word the people who speak the language/dialect would use. For example, from a book about Chinese dialects [0]:

> The debate as to whether or not the varieties of speech used by the Chinese should be classified as separate languages or dialects of one language is a difficult one, with reasons on both sides. The main criterion according to which some scholars tend to use the English term 'language' for the varieties of Chinese, is the lack of mutual intelligibility between the various forms of speech, the fact that the "various 'Chinese dialects' are as diverse as the several Romance languages". On the other hand, since there are no extra-linguistic (political, historical, geographical, cultural) reasons to treat these dialects as individual languages, the tradition is to call them dialects of Chinese.

The Chinese language varieties are dialects because it is politically expedient for them to be so. The Western Romance languages are languages because it is politically expedient for them to be. Linguists shrug and move on to more interesting (to them) questions.

[0] https://books.google.com/books?id=lCgnrA7Ke3QC&pg=PA1&source...

Serbo-Croatian is a special case. We all know that we speak the same language, but ‘the others’ are naming it wrongly.

Also, Slavic languages are very difficult for non-Slavs to learn, but very easy for other Slavs. And many are mutually intelligible.

They are different languages in the sense that they are not mutually intelligible, but the prestige of Mandarin is not a CCP one. Mandarin as it evolved has always been the language of government and of the plains. Other Sinitic languages routinely borrow readings and terms from Mandarin. There's more that I want to say about the topic, but it's less relevant.
They're written the same, and have basically the same grammar. The characters have hugely different readings, but people can communicate easily in writing. You can call that different languages, but that's certainly a different kind of different than people would expect when you say different.

If there were a version of English where all of the letters designated completely different sounds, but was written exactly the same way, would it be a different language? Would people who said that they were dialects of the same language have to be saying this for political reasons?

edit: I mean, Chinese is how you would expect it to be. How would two people living extremely far apart in China even know how each other would pronounce a particular character? How would they have communicated those sounds 500 years ago? The wide variance in the pronunciation of words even in English is also due to our dogshit orthography (largely imposed by the French), which often fails to give a decent hint for how to say something. Chinese characters are symbols of concepts that usually carry a hint of what they were meant to sound like in the northern dialect, 1500 years ago, by referring to another character whose sound there's no reason one would know.

> How would two people living extremely far apart in China even know how each other would pronounce a particular character?

China had an imperial bureaucracy for over 2000 years, which sent officials from one end of the country to the next. In fact, a predecessor of Standard Chinese (a.k.a. "Mandarin") was called "the language of officials" (官話).

The phonology differs. Vocabulary differs. Grammar differs. Speaking Cantonese and Mandarin natively, I have no idea what Hokkien or Sichuan people say, whether or not you write it down.

This is especially apparent when speaking to less educated people with less exposure to the standardised, official Chinese language, which is what people do actually write down when intending for a broader audience, of course. Diglossia is real.

Yep. Anybody who’s ever read written Cantonese or Shanghainese would realise they are often unintelligible unless you speak those languages and understand how they’re written, e.g. 「佢冇做乜嘢」

And yet the incorrect parent comment has been voted to the top of the thread by those who think it’s helped them.

> The wide variance in the pronunciation of words even in English is also due to our dogshit orthography (largely imposed by the French), which often fails to give a decent hint for how to say something.

Others have already corrected your other misunderstandings, but this is also false. Spanish has at least as much variance in pronunciation as English and has an orthography that is extremely regular. Brazilian Portuguese and Portugal Portuguese likewise have the same, highly regular orthography and are barely mutually intelligible.

To the best of my knowledge you actually have the causality mostly reversed: English's orthography is useless largely because the pronunciation changed but the spelling didn't, and English has a variety of pronunciations because the pronunciations changed differently in different regions. English has a messier orthography than other languages because of our complicated history of borrowing words, but the evidence shows that even people who start with a highly consistent orthography don't use it to keep their pronunciation static and shared.