Hacker News
by drivenextfunc 606 days ago
As a relatively well-educated Japanese native speaker, I too experience this problem when writing Japanese on paper - being unable to write many kanji characters by hand. I am no exception among Japanese native speakers. While the author seems to interpret this problem as something crucial, I question whether it truly is.

The orthography of Mandarin and Japanese includes an alphabet consisting of thousands of characters, the majority of which comprise dozens of strokes. Although East Asian people have higher IQ scores on average, we are not superhuman - our memory capacity is bound by human limits, and the decreased frequency of actually writing kanji on paper has naturally resulted in our forgetting how to write many of them. Is this surprising?

Furthermore, orthography is not part of language in a fundamental sense - it's merely a useful tool that accompanies a language. Therefore, I do not see the writing system becoming less stable as a significant issue. Consider Korea as an example: they used to use kanji in their orthography but have almost completely eliminated it with virtually no adverse effects. While laypeople often assume orthography is an integral part of a language, this is just not the case from the linguistic perspective.

8 comments

If you consider that a lot of people using the Latin alphabet do use cellphone autocomplete to check how to write an infrequently used word...

So I would say this text is biased by the "western" view of the writer, something that could be categorized as "Orientalism". A study about this phenomenon is valid, is important. But this post is not a good study.

But autocomplete even for basic words? My wife is Chinese. I'll never forget when she was helping her family write some formal letter in Chinese in Microsoft Word and she simply could not input the numbers 1, 2, and 3 in Chinese because she forgot how. And I know this may be apples and oranges because this is keyboard input versus writing on paper but as a programmer who can type at a moderate pace since I was a kid (~120wpm) this was perplexing for me! And similar to the article, she's an Ivy league grad. Similarly, when she's communicating with her family via WeChat half the time she simply sends audio messages instead of text messages. I'm pretty surprised this method is so popular instead of some voice-to-text google assistant type system.
I think there may be some confusion. The standard Chinese characters for 1, 2, and 3 (一, 二, 三) are among the simplest characters in Chinese: literally just one, two and three horizontal strokes. These would be extremely difficult to forget! What your wife was likely trying to write were the special variants (壹, 贰, 叁) that are used on checks, official documents, etc. These were specifically designed to be hard to alter or forge (think the difference between writing "100" versus "ONE HUNDRED" on a check). Even highly educated Chinese people might need to look these up since they are specialized characters not used in everyday writing.
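To make the relationship concrete, here is a minimal Python sketch of the mapping from everyday numerals to the anti-forgery variants. The pairs for 1, 2, and 3 are the ones named above; the entries for 4 through 10 are the standard Simplified Chinese financial forms, added here for completeness. The names `FINANCIAL_FORMS` and `to_financial` are made up for this illustration.

```python
# Everyday numeral -> financial ("anti-forgery") variant, Simplified Chinese.
# 1-3 come from the comment above; 4-10 are added for completeness.
FINANCIAL_FORMS = {
    "一": "壹", "二": "贰", "三": "叁", "四": "肆", "五": "伍",
    "六": "陆", "七": "柒", "八": "捌", "九": "玖", "十": "拾",
}


def to_financial(text: str) -> str:
    """Replace each everyday numeral with its financial variant.

    Characters that have no entry in the table are passed through unchanged.
    """
    return "".join(FINANCIAL_FORMS.get(ch, ch) for ch in text)


print(to_financial("一二三"))
```

The simple forms differ from each other by a single stroke, which is why they are easy to alter; the financial forms share almost nothing visually.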
That explains it. Yup these were some sort of official / govt documents. Thanks for the explanation!

Edit. I should have realized that. I just came back from China and my kids were watching a children's show with the following subtitles: "一二一二一二一二一二一二一二一二一二一二". Took me a while to realize the subtitles were not broken. The characters were marching chanting "one two one two..." :)

I think this is specifically more an IME (input method software) issue than a typing one. Japanese has similar "official" numbers (壱, 弐, 参, maybe some of the few cases where modern Japanese is more simplified than Simplified Chinese). These numbers couldn't be easier to type. I just type 1, 2, 3 (i.e. the digit keys on top of my keyboard), hit the convert key and select the right character (I also get offered 三, ③, 3⃣,³ and several other options to choose from). That's it.

I tried the same with Google's IME and I couldn't use digits as input, like the Japanese IMEs let you do. I could find the character for 叁 quickly enough, but 壹 was only on the second or third page. Still, I suck at Chinese and I found it.

Now, writing these characters is an entirely different story. I think any character that's rarely written and appears only in one common word runs the risk of being forgotten, even if that word is quite simple and used on a day-to-day basis. A word like 喷嚏 (sneeze) in Chinese or 薔薇 (rose) in Japanese fits the bill.

The Japanese fallback, in case you forgot the character is quite simple: you'd just use either Katakana or Hiragana with different connotations[1]. I'm not quite sure what the fallback would be in Chinese, but I guess that would often be picking another character with a close or same pronunciation, as Chinese speakers often do on purpose as a sort of pun.

I also expect there are still fewer cases of "character amnesia" in China than Japan, since the fallback mechanism is simpler and more standardized in Japan, and children are taught far less Kanji in school than their counterparts in Mainland China, Hong Kong or Taiwan.

[1] While Hiragana gives a familiar connotation, writing the word as バラ in Katakana is "more official", if anything, since names of flora and fauna are conventionally written using Katakana in official contexts, especially when you want to use the exact scientific name. This is the equivalent of using Latin names in Western countries, e.g. Rosa hirtula would be サンショウバラ.

>The standard Chinese characters for 1, 2, and 3 (一, 二, 三) are among the simplest characters in Chinese: literally just one, two and three horizontal strokes.

Does that work for larger numbers, keep adding strokes?

No, 4 is 四. Numbers are simple characters, but only 1,2,3 are made by adding strokes.
I am not from Asia, so I would trust what your wife has to say more than me. But I would argue that it is common for people living in a country with a different language from their native language to forget how to write or even say some simple words. There's a good deal of active effort involved in learning a new language.
It might be surprising but, in terms of written words, sneeze (喷嚏) is not "basic".
That's very much the impression I get. I've never seen pinyin used in Chinese writing, and the Chinese friends I've met have said they've never seen it either (they said they'd probably just look up the character or write a homonym instead, but even then it's pretty rare that it comes to that).

That's not to say it's never done, but it feels like an outlier. As if a friend found a word too hard to understand and drew a picture instead, and then the author wrote an article about how spelling is so difficult that it leads English speakers to draw words instead of writing them.

But the thing that struck me the most was just how confused people were when I asked them about it. It just didn't seem to be anything that was an actual issue for them.

> "This is such a gratifying experience, in fact, that I have actually kept a list of characters that I have observed Chinese people forget how to write. (A sick, obsessive activity, I know.) I have seen highly literate Chinese people forget how to write certain characters in common words like "tin can", "knee", "screwdriver", "snap" (as in "to snap one's fingers"), "elbow", "ginger", "cushion", "firecracker", and so on. And when I say "forget", I mean that they often cannot even put the first stroke down on the paper. Can you imagine a well-educated native English speaker totally forgetting how to write a word like "knee" or "tin can"? Or even a rarely-seen word like "scabbard" or "ragamuffin"? I was once at a luncheon with three Ph.D. students in the Chinese Department at Peking University, all native Chinese (one from Hong Kong). I happened to have a cold that day, and was trying to write a brief note to a friend canceling an appointment that day. I found that I couldn't remember how to write the character 嚔, as in da penti 打喷嚔 "to sneeze". I asked my three friends how to write the character, and to my surprise, all three of them simply shrugged in sheepish embarrassment. Not one of them could correctly produce the character. Now, Peking University is usually considered the "Harvard of China". Can you imagine three Ph.D. students in English at Harvard forgetting how to write the English word "sneeze"?? Yet this state of affairs is by no means uncommon in China. English is simply orders of magnitude easier to write and remember. No matter how low-frequency the word is, or how unorthodox the spelling, the English speaker can always come up with something, simply because there has to be some correspondence between sound and spelling. One might forget whether "abracadabra" is hyphenated or not, or get the last few letters wrong on "rhinoceros", but even the poorest of spellers can make a reasonable stab at almost anything. 
By contrast, often even the most well-educated Chinese have no recourse but to throw up their hands and ask someone else in the room how to write some particularly elusive character."

- https://pinyin.info/readings/texts/moser.html

Not at all - forgetting kanji just isn't similar to forgetting how to spell English words, as I think TFA made fairly clear. It's the simplest analogy to make, but it's not near enough to draw conclusions from.

The analogy I've used in the past is, you read kanji with your mind but you write them with your hand, so being unable to remember a kanji is more akin to forgetting a guitar chord or a keyboard shortcut - if your hands stop making the motions, you'll eventually forget them.

Most people cannot accurately draw a bicycle.
Yeah - the other analogy I've used is that everyone can recognize a Starbucks logo, but even if you went to the trouble of learning to accurately draw one, you'd forget if you didn't practice.
I am Italian and was taught cursive in elementary school and I can barely remember upper case cursive letters[0] thirty years later.

In my experience, most people of my generation have generally forgotten and usually just write "lower case letters but big" or block letters.

So yeah, I don't think there's anything inherently Chinese about forgetting how to write things you don't use.

[0] https://www.genitorialmente.it/2016/10/alfabeto-corsivo-maiu...

I'm studying Japanese at the moment and what struck me is how important context is, particularly in reading. You need to know where to read 1-3 letters ahead to read a word and interpret it. That's not really a thing in English - a word is a word, and the individual letters that it's composed of are almost always pronounced the same way.

I think digital is a big crutch for Japanese/Chinese because you have input methods that help you write what you want to say, so you don't actually need to remember how to write kanji as much in daily life.

> You need to know where to read 1-3 letters ahead to read a word and interpret it. That's not really a thing in English

It happens in English too, where you see a chunk of letters and mis-predict which word they represent in a way which affects its meaning [0], and sometimes that will also affect pronunciation. [1]

An example from the link:

> "The complex houses married and single soldiers and their families."

A reader linearly scanning along doesn't know whether "complex" is an adjective or a noun, and then whether "houses" is a noun or a verb. I'm pretty sure all human languages have similar problems where a certain amount of look-ahead or backtracking is necessary.

For another example to highlight pronunciation changes, consider the ambiguity of:

"I saw the rhino live in the zoo."

That could mean that the rhino was doing the verb of living, in which it rhymes with "give", or it could also mean that the speaker was seeing it in-person, in which case it rhymes with "drive".

[0] https://en.wikipedia.org/wiki/Garden-path_sentence

[1] https://en.wikipedia.org/wiki/Heteronym_(linguistics)

seems like an opportune time to also talk about buffalo buffalo buffalo buffalo buffalo buffalo buffalo buffalo.

https://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffal...

The Chinese equivalent would be the "The story of Mr. Shi Eating Lions":

https://www.yellowbridge.com/onlinelit/stonelion.php

Both rely on intonation (in addition to volume and pauses) for disambiguation, but the fun trick is that in the Chinese version the intonation is an integral part of the lexeme (i.e. it distinguishes between "words").

But I have to say, these kinds of sentences (and full-fledged poems) are quite a different beast from simple cases of garden path sentences or syntactic ambiguity[1]. The lion-eating poet poem and the "buffalo buffalo buffalo..." sentence are both highly contrived and unlikely to be understood correctly on the first few goes even with perfect prosody. They are cool "language hacks", but they do not occur in daily language and I personally believe (although I guess die-hard generative linguists would disagree) that they don't teach us very much about the language itself (except for the cool artistic possibilities it opens up).

[1] https://en.wikipedia.org/wiki/Syntactic_ambiguity

Incorrect capitalization.
When this happens in English, teachers will label this as "bad English" and ask you to rewrite. That's how the formal language deals with this problem.
If anything, isn't that an informal solution? It relies on other people to complain that they dislike the sentence, without being able to point to any hard-and-fast rule.
The hard and fast rule is that repeating a word right next to itself is generally frowned upon. It comes up with “that” a lot, like “he said that, that led to something else”. Sometimes people are doing something clever with the words, but it’s usually just poor English.
Honestly, it rarely happens in English other than in contrived examples used to demonstrate the concept.
Yes, this happens in English too, but to find examples like this you have to go to Wikipedia, or wrack your brain and see if you remember one. In Japanese, almost every other word is like this.

I went to the first link in your comment ( https://en.wikipedia.org/wiki/Garden-path_sentence ), selected the Japanese version of the article, and took the first sentence:

> 袋小路文(ふくろこうじぶん)とは、文法的には正しいけれども、誤読が生じやすい書き出しで始まる文のことである。

As is usual for Japanese, this sentence contains a mix of Chinese(-origin) ("kanji", e.g. 袋 小 路 文 法 的) as well as Japanese phonetic ("kana", e.g. ふくろこうじぶん) characters. Usually, when in a multi-kanji word, kanji are pronounced with (a time-changed version of) Chinese pronunciation. For example, 文法 is "bun-pou", not "fumi-nori" or something else. However, the first character of the article title (fukurokoujibun), 袋, is "fukuro" here despite being in a four-kanji word. Further, 小 is "kou" here, which is nonstandard enough that its dictionary entry does not even list it as a possible pronunciation! [1] Then 路文 are both in Chinese pronunciation (ji-bun), but this does not necessarily make sense because the word is not split in two down the middle, but instead as 袋-小路-文 (bag-lane-sentence, where bag-lane is English cul-de-sac / blind alley). [2]

Now fukurokoujibun is a bit of a specialised word, so it might not be a great example. But in the rest of the sentence, we find 文, which is always pronounced "bun" (sentence) here, even when appearing separately, but could also (though more rarely) have been "fumi" (letter); nothing but semantic context helps distinguish. Then we have 正しい "tada-shi-i", where 正 could have been "sei" as in 正確 "sei-kaku" (accurate) or "shou" as in 正直 "shou-jiki" (honest), but it isn't just because しい comes after. Similarly, 生 in 生じやすい is "shou"(-ji-ya-su-i), which is conjugated from the base form 生じる "shou-ji-ru" and could have been "u" (生まれる "u-ma-re-ru") or "sei" (先生 "sen-sei") or "i" (生きる "i-ki-ru") or more (生 is somewhat infamous for having many readings). And I could go on: 書 could be "syo" (文書 "bun-syo") but is "ka" (書き出して "ka-ki-da-shi-te" conjugated from 書く "ka-ku").

This is a bit like the comments elsewhere here noting that the Chinese word for "sneeze" is a bad example because it happens to have such uncommon characters in it, and then people point to examples like "onomatopoeia" and "diarrhoea" as similar tricky examples in English. I can't comment on Chinese, but existence does not necessarily say much about frequency.

[1]: https://jisho.org/search/%E5%B0%8F%20%23kanji — Kun are the Japanese readings (chiisai, ko, o, sa), and On are the Chinese readings (only "shou" in this case)

[2]: This analysis of 袋小路文 is not completely etymologically honest. By the etymology ( https://en.wiktionary.org/wiki/%E5%B0%8F%E8%B7%AF#Etymology_... ), we see that the "kouji" pronunciation of 小路 is really a corruption of ancient "ko-michi", which is a consistent Japanese-Japanese reading of the two characters. However, because "ji" is also an (uncommon) Chinese reading of 路, if you don't know the etymology of the word, the re-analysis is appropriate in the context of how hard it is to read the written language.

> However, because "ji" is also an (uncommon) Chinese reading of 路,

It's not a Chinese reading at all (as you can tell because it's ... wildly out of place with the actual Chinese-derived readings ろ・る, onyomi are supposed to have semi-regular correspondences with each other and with Chinese Chinese readings). It's really just rendaku of ち, the basic root of fossilized compound みち (with still-salient prefix "honorific" み).

But most importantly, you never really see either 袋 or 小路 and expect them to have any other readings; maybe you'd expect しょうろ if you don't know the latter, but unless you're already literate in a Chinese language or are blindly memorizing kanji tables, the other reading of 袋 (たい) probably isn't even salient, because it's one of those kanji that almost always takes its kunyomi even in compounds.

Side note, the line about u-onbin kind of buries the implication that this is a loanword from western Japanese, which is the culprit of several quasi-systematic but unevenly distributed divergences from regular sound changes.

I stand corrected, you clearly know more about this than I do. :) (I'm only an intermediate learner.)

So perhaps my analysis of 袋小路文 wasn't very accurate at all. Yet I hope my point about 正, 生, 書, etc. stands.

It's only, oh, just about the worst writing system since the Hittites or so, yeah.
> "The complex houses married and single soldiers and their families."

Wow.. I had to read that sentence three times before I got it right.

Maybe because I've seen a similar example used before, but I immediately read it correctly the first time. Honestly these sort of 'problems' only ever seem to occur when specifically created to demonstrate this problem and almost never happen in regular writing.
"I saw the rhino live in the zoo"

Might also mean; "Noted native-American zoologist 'I Saw The Rhino' lives at the zoo"

No it couldn't.
Given shenanigans like "thee stallion" as part of a name... sure it could.
Completely irrelevant. It couldn't because "live" and "lives" are different words.
I agree that Chinese/Japanese has it worse, but any language where "Spelling Bee" is a thing cannot be considered phonetic in a conventional sense.
And yet, given the definition and language of origin, most high-level spelling bee participants can make a pretty good guess at spelling a word they may have never seen before.

English is phonetic, it just borrows its pronunciation rules from many differing (and sometimes directly opposed) other languages.

Very true - and every demonstration of “English is hard to spell/pronounce” focuses directly on the exceptions which exaggerates the problem. One analysis I’ve seen puts it that with a single set of rules, 59% of a sample corpus of 5000 English words can be pronounced perfectly from the spelling (of course, there will be regional accent and dialect differences so that percentage will be a bit different for each one) and up to 85% can be pretty close with only slight errors.

Then there’s a percentage where they’re just direct borrowings from other languages and you need to have an idea of how that language pronounces words (especially French), so really only 10-15% or so of English words end up being true exceptions.

1. https://www.zompist.com/spell.html

> a single set of rules, 59% of a sample corpus of 5000 English words can be pronounced perfectly from the spelling

To do this you need to know 56(!) rules.

I think this actually demonstrates how complex English pronunciation actually is.

And you still only get 59% of the way to the correct pronunciation.

As a non native speaker of English, and a native speaker of a phonetic language, I strongly object to the notion that it's easy to guess English word pronunciation by just reading it.

Those numbers are very bad, given that proper phonemic orthographies can give you a 90+% confidence with far fewer rules.

There's a simple and consistent way to compare languages in this way, too: train a neural net to map spelling to pronunciation on one half of the dictionary, then test it on the other half. The more complicated and less consistent the orthography is, the more mistakes it'll make. People have in fact done this exact experiment, and English scores extremely poorly in it; for spelling, closer to Chinese, in fact, than many other European languages: https://aclanthology.org/2021.sigtyp-1.1/
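The experiment described above can be sketched in miniature. This is not the method from the linked paper: instead of a neural net it uses a much cruder baseline (pick the most frequent sound seen for each spelling unit), and the two tiny word lists are hand-made toy data with a hypothetical pre-segmentation, not real dictionaries. It only illustrates the idea of learning on one part of a lexicon and scoring on the held-out part.

```python
from collections import Counter, defaultdict


def train(lexicon):
    """Learn the most frequent phoneme for each grapheme in the training words."""
    counts = defaultdict(Counter)
    for pairs in lexicon:
        for grapheme, phoneme in pairs:
            counts[grapheme][phoneme] += 1
    return {g: c.most_common(1)[0][0] for g, c in counts.items()}


def word_accuracy(model, lexicon):
    """Fraction of held-out words in which every grapheme is predicted correctly."""
    correct = sum(
        all(model.get(g) == p for g, p in pairs) for pairs in lexicon
    )
    return correct / len(lexicon)


# Toy "regular" orthography (hypothetical data): one sound per letter.
regular_train = [
    [("c", "k"), ("a", "a"), ("s", "s"), ("a", "a")],   # casa
    [("m", "m"), ("e", "e"), ("s", "s"), ("a", "a")],   # mesa
]
regular_test = [
    [("m", "m"), ("a", "a"), ("s", "s"), ("a", "a")],   # masa
    [("s", "s"), ("e", "e"), ("c", "k"), ("a", "a")],   # seca
]

# Toy "irregular" orthography (hypothetical segmentation of English words).
irregular_train = [
    [("t", "t"), ("ough", "ʌf")],                       # tough
    [("r", "r"), ("ough", "ʌf")],                       # rough
    [("d", "d"), ("ough", "oʊ")],                       # dough
    [("b", "b"), ("e", "ɛ"), ("t", "t")],               # bet
]
irregular_test = [
    [("b", "b"), ("ough", "aʊ")],                       # bough
    [("t", "t"), ("e", "ɛ"), ("d", "d")],               # ted
]

regular_score = word_accuracy(train(regular_train), regular_test)
irregular_score = word_accuracy(train(irregular_train), irregular_test)
print(regular_score, irregular_score)
```

On this toy data the regular word list scores higher than the irregular one, which is the direction of the effect the comment describes; the sizes of real-world gaps would need a real lexicon and a real model.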

Maybe it's the right time to once again quote this poem :

https://jochenenglish.de/misc/dearest_creature.pdf

The joy of English pronunciation

George Nolst Trenité (1870–1946)

1 The text

Dearest creature in creation

Studying English pronunciation,

I will teach you in my verse

Sounds like corpse, corps, horse and worse.

I will keep you, Susy, busy,

Make your head with heat grow dizzy;

Tear in eye, your dress you’ll tear;

Queer, fair seer, hear my prayer.

Pray, console your loving poet,

Make my coat look new, dear, sew it!

Just compare heart, hear and heard,

Dies and diet, lord and word.

Sword and sward, retain and Britain

(Mind the latter how it’s written).

Made has not the sound of bade,

Say—said, pay—paid, laid but plaid.

Now I surely will not plague you

With such words as vague and ague,

But be careful how you speak,

Say: gush, bush, steak, streak, break, bleak,

Previous, precious, fuchsia, via,

Recipe, pipe, studding-sail, choir;

Woven, oven, how and low,

Script, receipt, shoe, poem, toe.

Say, expecting fraud and trickery:

Daughter, laughter and Terpsichore,

Branch, ranch, measles, topsails, aisles,

Missiles, similes, reviles.

... (7 pages of pain follow) ...

and then the Oxford and US pronunciations (at the time, it has changed since) in phonetic transcription.

Huge difference is: English is pretty much THE language that you can butcher and still have people perfectly understand (and hopefully politely correct) you. Even other European (stay mad) languages don't hold up to just how flexible English is in this regard.
Oh hurrah, I think that link is what I've been looking for for nearly a decade. I ran across it, or something like it, a long time ago and could never find it again. I don't remember all the special syntax, I think the one I found was written more in plain English with more examples (and I don't think the one I found back then mentioned ghoti either), but can't be sure it's been so long - maybe it was just that page and I don't remember it. It does have around the same number of rules I remember though.
This is satire, right? 56 rules to get 59% correct pronunciation on a corpus of 5000 words? And these rules don't even include the base sounds - it doesn't tell you how to actually pronounce "m", or "e". So in fact there are more than 70 rules required to get to a base pronunciation (you need to add at least one rule for each letter).
"ough" has at least 9 different possible pronunciations, how is that phonetic?
>"ough" has at least 9 different possible pronunciations, how is that phonetic?

Does a language stop being phonetic when you have to include other information provided by the rest of the word? I'm not a linguist by any means, but "ough" being pronounced a couple different ways depending how it's used doesn't seem like it'd preclude the language from being considered phonetic in general.

9 is not a couple, unless you're in a very open relationship - which English words might be - but a language stops being phonetic at the point that the mappings between symbols and sounds are no longer clear and reliable. The most phonetic languages have one-to-one mappings with very few exceptions e.g. Japanese, Spanish, Italian, Finnish.

English, on the other hand, has silent letters, inconsistent mappings even within the same word, exceptions, irregularities, and sounds that are represented by multiple letters and spellings.

English is not a phonetic language except in the sense that it does have mappings between sounds and characters, which would make sense if one were to compare it to a wholly written language like Python, but not any human language.

Fruit flies like a banana. English has its own ambiguity, so it isn’t really that different.

I can only write Chinese via an IME these days. For one, I’m left handed so writing characters was always a struggle since stroke order worked against me, but mostly it’s because that’s the only way I use Chinese anyway.

I told my wife our kid should learn to write via an IME as well and she was just horrified about that, though. None of the teaching material really supports it.

Time flies. I can't, they're too fast.
I've been (very) casually learning Japanese for a couple years, and almost every time I think I find something "weird" that Japanese does, I almost immediately think of a very similar example in English.

The alphabet is a pretty awesome invention (alphabet > kana-style syllabary > kanji-style logography) but English writing is at least as complex as JP writing, just in different dimensions.

JP's phonetics, for example, are dead simple compared to English's, but they do a good job making up for it by having a few thousand Kanji.

> JP's phonetics, for example, are dead simple compared to English's

I'm not so sure about that. Do you know about pitch accent?

https://en.wikipedia.org/wiki/Japanese_pitch_accent

I'm not a native English speaker, so I don't really know why, or if, there's a problem for native English speakers to learn or "get" pitch accent. For speakers of many other European languages Japanese pitch accent is not tricky. You listen, and then you speak. Just as you would listen to English, and repeat it the same way.

Japanese, despite being extremely logical and so beautiful in so many ways, is still hard to learn for me, and of course learning the writing system is not done in the blink of an eye (unlike the Latin-based writing system we use), but pitch accent isn't really the problem here.

Is that any more complicated than English stress, though? And regardless, Japanese has a very small number of phonemes (compared to English) and extremely restricted phonotactics.
Yeah, but I don't expect this to be substantively harder than learning most regional accents (could be wrong), and afaik it's also not critical for legibility.
In English you have to know a word in order to pronounce it.

The “ou” diphthong in “hound” and “double” or “would” is pronounced differently. Or “ieu” in “lieutenant” vs “lieu”. Or “oo” in “poor” vs “root”. Or “berry” in “berry” vs “strawberry”.

I could go on forever. There’s no other western language I know of that behaves like that.

English is a quasi-phonetic language in that most words can be mostly pronounced how they're written, but in some cases it inherits the pronunciation of the language the word came from. I'd imagine many English speakers would consider this an undesirable quirk, though.

Indeed, there has been a tendency over the centuries, particularly in the US, to move towards writing words how they sound or pronouncing words how they're written. Lieutenant is an interesting example, since in the UK we pronounce that "lef-tenant" traditionally, but the US moved to the (IMO superior) "lieu-tenant". Nowadays, most young people would probably use the US pronunciation.

I do take some slight umbrage with the implication that some people seem to be making in this thread that language features can't be criticised or that one language can't be better than another. I don't see why this would necessarily be true. Even with spoken languages. There are a ton of annoying aspects to English that simply aren't issues in other languages, and I think it's fair to criticise other languages for their failings too. This is especially true of writing systems, which are human inventions rather than something we learn intuitively.

Logographic/logo-syllabic orthographies are harder to learn and remain proficient at than alphabets/abjads, for native speakers and second language learners alike. Alphabets are an innovation that improved on ancient orthographies and enabled a wider range of people to be able to communicate as easily by writing as they do by speaking. Besides the issue mentioned in the article, the writing systems in China/Japan are associated with other issues we rarely see here. Even dictionaries are a non-obvious challenge with logographic languages, which has resulted in several competing ways to sort words.

I don't think one can reasonably claim that in English "words are mostly pronounced how they're written". I mean, "i" can stand for /i/, /ɪ/, or /aɪ/, for example (and also for /ə/ if you don't count "ir" as a distinct grapheme). Although vowels at least (mostly) follow some predictable patterns based on syllables - but e.g. it's impossible to say whether "ch" stands for /k/, /tʃ/, /ʃ/, or /x/ without knowing the etymology of the word.
Americans pronounce "lieutenant" closer to the native French pronunciation.
> “ieu” in “lieutenant” vs “lieu”

> “berry” in “berry” vs “strawberry”

Am I misunderstanding the point you are making or is my pronunciation just off? I would pronounce both parts of both examples the same.

Strawberry is often pronounced as Strawbry, so the 'e' becomes silent. And Lieutenant as Lutenant (or leftenant in Britain).
>Strawberry is often pronounced as Strawbry.

Only in some dialects, not in the standard form.

French can be pretty bad. Not as bad as English for reading, but it's much worse for writing because there are so many spelling options for the same thing.
You are right, but you can read French words without knowing the language, because a written word has a unique correct pronunciation.
You have the right idea on “ou” but your other examples don’t make sense.
>That's not really a thing in English - a word is a word, and the individual letters that it's composed of are almost always pronounced the same way.

https://en.m.wikipedia.org/wiki/Ough_(orthography)

> in English - a word is a word, and the individual letters that it's composed of are almost always pronounced the same way

Are you sure about that?

https://en.wikipedia.org/wiki/Ghoti

Posted up above, here's a collection of English pronunciation rules that English speakers have internalized so well they can't generally explain them: https://www.zompist.com/spell.html

"Ghoti" is mentioned a few times there, but basically "fish" is a nonsensical pronunciation that breaks several rules. There's a reason (well, a few reasons) why if you ask English speakers how to pronounce "ghoti" and they've never seen it before, they'll probably all guess some variation of "go-tee" or "go-tie".

That's such a dumb example because it claims to follow English rules for those letters while ignoring the actual rules. It makes a somewhat humorous joke, but people pretending that it means anything linguistically are either ignorant or intentionally trying to confuse people.
shure!
reads like it would be pronounced with an aspirated -s- not sh.
Not so much in terms of meaning but in terms of pronunciation, sometimes you also need to read ahead in English to know how a certain word is pronounced. For example: "I read a book yesterday." and "I read a book every night." Depending on the context that follows, "read" is pronounced differently. The same thing happens for "present" and "record". Admittedly, these are exceptions to the rule.
When teaching reading and English, learning about context clues is one of the ways students are taught to figure out the meaning of words.
> in English - a word is a word, and the individual letters that it's composed of are almost always pronounced the same way

Some context-dependent examples: "read": /ɹid/ vs. /ɹɛd/; "lead": /lid/ vs. /lɛd/ (plumbum); "desert": /ˈdɛz.ɚt/ vs. /dɪˈzɝt/.

I think you’ll find all of those things are true of English too.
People find value in the tradition of writing. If Japanese were to ditch kanji as Korea did, I think there would be some complaints.
People love to complain about how much work other people can do in order to slightly convenience themselves. And the media loves to air their complaints, because they are snappy, and photogenic, and easy to pitch as "feel good stories about how much I care for the old ways unlike those lazy sloppy people over there"... even if "I" also find myself forgetting how to write kanji.

It doesn't matter. It won't be a top-down decision. It'll just be a long, slow progression of people slowly realizing that writing in kanji for this character is annoying, so maybe I'll just write it phonetically, and then that character, and then there will be a year or two where there's a phase change and suddenly it's everywhere, even though nobody decided.

And people will complain and whine and moan about the "beauty" of the kanji disappearing. And even though they have a point, it won't matter because the kanji will still be there as much as they ever were, and all one has to do is go study them... but they won't. Because complaining about how other people should keep doing something hard is easy, but actually doing the hard thing yourself is hard, and the vast, vast majority of the complainers won't actually do anything about it other than complain, but take the easier options themselves, just maybe a year or two later than others.

I have no beef with the people taking the easier option. Life is full of things to spend effort on and we can't give maximum effort to all of them. I am annoyed at people who complain about how other people can do vast, vast quantities of work so they can briefly feel slightly better about themselves in some way.

You'd be surprised how much staying power these things have. The nation and its language are two concepts inherently intertwined. Take the case of Welsh in Wales. It was an almost dead language that no one spoke, but as soon as the Welsh got the ability to self-govern, they enacted laws to mandate all documents and road signs were available in Welsh, required it to be taught in schools, etc. It's very difficult to kill a language in a democratic state because it's a very bad look to oppose laws that "protect the nation's culture". The people who want these things are endlessly pandered to as a result.
Welsh is generally highlighted as the example of a successful language revitalization movement, but it's also one of the rare examples of such movements succeeding. By contrast, you can look at Irish--where the need for a language that wasn't English was seen as absolutely essential as part of the (successful) revolution and independence movement--and see that the language revitalization there is more or less a failure. A century after independence, the number of L1 speakers of Irish has gone down, and I believe the Irish government still conducts most of its business using English (despite English officially being the lesser of the official languages) since so few members of government are sufficiently proficient in Irish.
Kanji is never going away. I struggle to believe anyone who fully mastered kanji would say something like this.

Kanji is not just “harder”. It’s better.

I've studied kanji to some degree. I'm not a "master", but I am aware of the way it resolves a lot of ambiguities in Japanese.

But that does not on its own mean that Japanese couldn't evolve out of Kanji. It is not the case that if Kanji goes away, the entire rest of the language MUST stay static. It in fact would not. It would begin a multi-decade process of adjustment to the new issues.

It has happened before in other contexts, and it will happen again. There's a lot of signs that Chinese is on the verge of such a change (on a decadal time scale), which carries somewhat different baggage, but roughly the same amount of it.

What really throws the wrench into the whole thing is computers, and I don't just mean that it will simply speed up or slow down such a change, but that it could send all of this flying out in an entirely new direction. If we're all wearing augmented reality goggles full time in 20 years, what will happen to ideograms if every ideogram you see comes with floating pronunciation guides, and your goggles can also translate phonetic spellings transparently in real time back into kanji/ideograms? Could languages like English start growing something like ideograms, presumably descended from modern-day emoji, if computers erase the disadvantages of emoji that caused languages to largely go alphabetic thousands of years ago?

What I absolutely do know is this: In 50 years, no language will be the same as it is today. Guessing what the changes will be, especially in a rapidly evolving novel landscape, is really hard. I don't think kanji/ideograms being seriously diminished is off the table.

Why is this if you don't mind me asking? I thought that hiragana could already write all the words. What makes kanji so much better than that?
In addition to the phoneme problem, it's about readability. Yes, really. The first time I saw わたし written as 私 I just about instantly remembered the latter (it is, after all, used constantly in writing). That kanji is much easier and faster to read than the corresponding hiragana, and it was like that from way back when I had just started learning Japanese. I still have a way to go.. learning a language at my age turns out to be quite slower than when I was younger.. but everything is just easier to read, as soon as one's able to read something in kanji instead of hiragana. The latter is hard and slow to read, even though it's such a simple character system to learn.
Nah as someone that learnt it for 3 years, did a 6 month exchange and then stopped after that I totally disagree.

Not only are kanji needlessly complex because of history, there's also extra work like stroke order (another needlessly "important" thing).

Hira/kata is so much easier, but I ended up giving up the language after I both realised that I wouldn't live there and that they're just romanising so much anyways.

Japanese is very syllable-poor and so there are a colossal number of homonyms and homophones. In speech a lot of these are distinguished by tone and pronunciation, but in writing kanji is the only way to tell them apart. Reading kana-only Japanese is not impossible, but it's a fast path to a headache and leads to huge numbers of ambiguities even in the best case.
This just indicates that kana orthography is not phonemic enough; but there's no reason why it couldn't be improved to cover tones etc.
I can't help but feel these languages are just silly, or at least very badly designed. Maybe in the future, when AI is good enough to translate everything in real time, we will just find a language that is best and teach children that instead. It would save a lot of headaches, and probably also cure dyslexia.
This is not a reason for Japanese people to keep Kanji, but Chinese tourists can read Japanese at about 50% comprehension level just due to Kanji without knowing at all how the words are pronounced in Japanese.
Another commenter pointed out the ambiguity in Japanese phonetics which is very true.

Imo, the biggest efficiency gain from kanji comes from reading. Meaning is grasped instantly because you don’t need to worry about phonetics. Pronunciation follows a general set of rules, such that even when encountering new words you can guess at how they’re pronounced, while grasping meaning at a glance.

To compare it to latin languages, the difference is like going from reading everything out loud to reading silently.

How does pronunciation follow any rules? There are none that I know of where a given kanji can have several meanings completely independent of one another, there is no structure there.

I'd agree with you if you'd said Korean, where the makeup of the character has direct rules for pronouncing it, if you learn the simple rules then you can read any Korean character - this is the middle ground they should drop kanji for, imo

Sure, and there were complaints in Korea, too. Lest we forget, Hangul was developed in the 15th century, and was promptly condemned by the educated elites while being enthusiastically adopted by the underclasses. But the elite pushback, going as far as outright bans in some periods, meant that it wouldn't become the standard orthography until the 1900s.

I don't think anyone today would seriously argue that Hanja is preferable, though. In retrospect, it's clear that the benefits of easily accessible universal literacy are too substantial to ignore for the sake of tradition.

> I don't think anyone today would seriously argue that Hanja is preferable

It's necessary to use Hanja today in educated contexts because Hangul has too many homophones, and most educated (technical, literary, scientific) vocabulary has a Sinitic origin and is therefore more homophonic than typical Korean words.

Sure, and lawyers in English-speaking countries similarly use Latin and Old French jargon to reduce ambiguity. But this is a fairly narrow use case that is really more of a specialized notation - it's not used day-to-day even by people who regularly use Hanja professionally.
Hanja still get used in some contexts --- had to memorize ~500 of them when I was studying Korean.
AFAIK (maybe someone can correct or confirm) it is essential for studying law in Korea. To avoid ambiguity with identically sounding words, Chinese characters are used in law.
This is the reason Chinese characters are not going away. They are essential to comprehending written documents, because the Chinese language (and similar ones) has too many sounds that are the same or very similar for different words. So, if they abolish the characters and use something purely phonetic they'll have to reinvent the whole language to be understandable, especially for anything that is not colloquial.
This is not a problem in other languages. The word "set" in English has 7 different meanings, yet you rarely struggle to tell which is intended. If the language can be understood when spoken, it can be understood when written phonetically.
Other languages are not Chinese. In Chinese a lot of the meaning in the spoken language is conveyed through tones and other conversational cues.
>This is not a problem in other languages.

I don't have a dog in this fight one way or the other, but it really is surprising that all these pro-kanji comments seem to ignore the concept of context altogether. It's very circular reasoning being used to try and explain why kanji are necessary.

> they'll have to reinvent the whole language to be understandable

Frankly, the whole language seems like such a mess that maybe they should?

Good luck convincing 1.5 billion people that they need to reinvent a language they have used for thousands of years in order to satisfy somebody else...
It seems there's room for "legal innovation" there, by providing definitions early on in various texts to disambiguate, and then sticking to them throughout the text!?

I assume it's already done anyway for some terms. Why isn't this more widespread?

Innovation is quite often resisted by those who have mastered the hard way of doing things, though I have no idea whether this is the case here.
I suppose for the same reason that law in English-speaking countries still uses so much Latin?
But that's the opposite of innovation? Basically, instead of describing things in detail drafters opt to use shortcuts, but that's how people end up getting fucked in court by some "technicality".

Innovation would be to just put in the verbiage, precisely define terms, fuck tradition.

The Latin (and Old French) words don't require a complicated arrangement to type them on a keyboard.
Yeah, because Japanese without Kanji is at least 2x harder to read.
"Although East Asian people have higher IQ scores on average, we are not superhuman" weird explicit racism as the highest voted comment
Before calling it racist we could ask him for a reference. After all, it might be a truthful fact.

But yes, it stood out to me too, and I'm confused how you're the only person commenting on it.

I wonder if these people justify having a shitty writing mechanism by being smart. "It's so needlessly complex, but we're smart so we can afford it" when in reality if you're smart you want it as simple as possible.

And then he comes on HN and rationalizes the fact that he can't spell. Ironic.

IQ has been disproved as an accurate evaluation of intelligence even for problem solving, so asking for a scientific reference for this is like asking for a scientific reference for how a certain locally dominant ethnicity has better chakras than another one. They don't, it's just racist.
The absolute state of orange reddit
I found the "we are not superhuman" bit annoying and condescending - replacing with different groups:

"While men are stronger than women on average, we are not superhuman" <- To me, this doesn't seem condescending. Not 100% sure why - perhaps because the difference in the trait (between men and women) is much larger?

"While left-leaning people are smarter than right-leaning on average, we're not superhuman" <- This definitely does seem condescending - perhaps because the skill in question is intelligence, and the statement reads as "You're stupider than me, but please strain your brain to understand".

"While rich people have higher IQ scores on average, we're not superhuman" <- I find this a bit less condescending than the previous one, not sure why. But still annoying for the same reason as previous.

"While whites are stronger than east asians on average, we're not superhuman" <- Again condescending and annoying, but I still think the intelligence statements are more grating.

My conclusion - the statement is irritating because it carries with it an implication that people would be surprised Asian people can forget things too. Additionally, it gives the impression that because other groups are stupider, it needs spelling out in simple terms that they're not godly intellects.

Maybe it's less of a factor since the standardization of mandarin, but the difference between kanji and an alphabet like Korean and Vietnamese has moved to is that writing with the alphabet leaves an artifact that is only understood by speakers of the same language, whereas kanji can have the same meaning but different spoken words entirely, such that cultures can communicate through written edicts without totally erasing linguistic differences through standardization. So you're right that the individual language/culture doesn't suffer from alphabetization or pinyinification, but I would submit there is change on the level of multicultural interactions, decreasing the mutual intelligibility between cultures for better or worse
> Although East Asian people have higher IQ scores on average

Citation needed.

Maybe parent was referring to the studies by Lynn, a self-declared "scientific racist".

I think that despite lower IQ scores on average South Korea has been consistently beating Japan in go in recent years, and more importantly they got rid of hanja (Korean version of kanji) from their writing system.

Here is one citation: https://www.sciencedirect.com/science/article/abs/pii/S10416...

But also in most other IQ tests, Ashkenazi Jewish and East Asian people tend to score the highest.

a) Bad science. b) Unsupported claim. c) Who cares, and why?

As an Ashkenazi Jew with an IQ 3 SD above the mean myself, I focus my attention on that question and have a good grasp of the answer. I also have particular insight into why some Jews have high scores, and how the people who care so much about the average IQs of various populations draw all the wrong conclusions from them because of their ideology. (I would also note that many of those who care so much have lower IQs than a very large fraction of the populations they disparage.)

"While the author seems to interpret this problem as something crucial"

Does he? Read his last paragraph.

My wife is college educated and native Korean, so these are just my observations of her and her friend group's engagement with Chinese characters (Hanja).

Hanja, in daily life, has largely disappeared from colloquial Korean for those under 40 or so. It's still preserved in some formal settings like medicine and law, and is used to appeal to older generations. I've been with my wife long enough to remember when Hanja was still very common to see in newspapers.

There are some small vestigial problems with eliminating it from daily life: the large number of monosyllabic Chinese-origin loan words in modern Korean can sometimes create ambiguity when written in Hangul. Native Korean speakers will sometimes disambiguate these words by referring to the Hanja, but that's largely disappearing as a habit as well.

Younger Korean generations still learn it in K-12, but it's mostly wasted class time in an already overly crammed education. The kids who focus on it are really geared towards becoming lawyers, and certain kinds of doctors (mostly traditional medicine). STEM focused kids will focus on English instead. As a result there's an active linguistic process occurring where English loan words are slowly replacing Chinese-origin words and concepts in active and modern Korean.

I don't know too much about Japanese, but I do have a sense from native speakers that writing the same words in the four major writing systems offers some sense of nuance to how close a reader might be to a concept, or how they might consider it in various ways. From visits there, I did notice the expectation that native speakers could seamlessly read and jump between the systems, often within the same sentence. But I also understand that the pronunciation of Kanji is somewhat nonstandard, and it's not immediately clear how to say something written purely in Kanji (sometimes this is supported by providing explanatory superscripts in another system next to the Kanji). Why persist with this? I suppose it's the nuance that's being conveyed, and this nuance is still prized among native Japanese speakers.

I do get the sense that China has no particular plans on moving away from the system, as it's a unifying source of national identity (and has been for centuries). And they really have very few other options. The main problem is that China is a highly linguistically diverse country, and Chinese offers the ability to transmit ideas instead of sounds which allows speakers of non-mutually-intelligible "dialects" to communicate. Moving to a Latinate system or even to Zhuyin Fuhao (Bopomofo) encodes sounds, not ideas, and risks fracturing the state. It would only become possible if there was a concerted effort, maybe over a couple generations, to Mandarinize and discourage the use of local dialects, but that would also be highly disruptive. Koreans, Japanese (and other adjacent non-Sino languages like Vietnamese, etc.) escaped this either through a higher level of linguistic uniformity, or strong efforts to standardize or teach a national dialect that the writing system (Hangul, Chữ Quốc ngữ, Hiragana, etc.) could amplify.