Hacker News
by klodolph 3 days ago
That’s just false: “si” is in the hiragana table as し. The romanization “si” is /si/, which is pronounced [ɕi] (or [ɕi̥] or some other possibility). This is basic Japanese phonetics.

If you fix all the errors that are in the article, at best there is an argument buried here that Hepburn romanization should not be used to teach Japanese to English speakers—but I think that point is really my own argument that I’m making with the fragments of the article that make sense.

Romanization can be more consistent with Japanese phonetics or it can be more consistent with English phonetics. Hepburn romanization is more consistent with English phonetics, which is why it’s a good choice for English speakers who don’t know Japanese but a bad choice for English speakers who are trying to learn Japanese.

Okay, we’re fighting over definitions here. There is no “si” in Hepburn romanization. I am intentionally using Hepburn romanization in the article. Therefore, in my article “si” is a compile error.

You may argue with my choice, or maybe you can argue that referring to cells in Hiragana table solely by my chosen romanization is somehow bad, and I should instead be inconsistent and give the same mora two different romanizations within a single article. Is that what you’re suggesting?

First: as a basic part of Japanese language education, students are expected to be familiar with different romanization systems. If you ask a student where “si” is in the table, they should be able to find it. If a student says “it’s not in the table” then they’ve failed the lesson or there is something wrong with the teaching material.

Second: am I arguing that the choice of using Hepburn here is somehow bad? Yes, that’s correct. I think Hepburn is a bad choice here. A good choice is Nihon-shiki. JSL romanization is also fine.

We’re talking about the same thing but you insist that there is only one angle under which things aren’t confusing. I disagree. That’s fine. The two systems are isomorphic, and I genuinely believe that, given I’ve described every single caveat of Hepburn in the article, I’ve paid my dues for using it. YMMV. I even include the “finding in the table” part.

I think I agree that Nihon-shiki and explaining it upfront would’ve made the article more elegant. One constraint I wanted to hit is that a person should be able to read this article with zero knowledge of Japanese, and walk away able to conjugate almost every verb to every suffix correctly. This is more of a challenge to myself as a writer than any practical need, but I hope it sheds some light on the choices and the framing. I liked Hepburn because it’s closer to how it sounds. You can imagine I’m using IPA instead if you want.

The systems are obviously not isomorphic—Japanese kana are not entirely phonetic (they are just mostly so) and the different romanization systems choose differently whether to follow orthography or phonetics more closely.

> hanas* + (i)masu = hanasimasu (wrong!)

I cannot wrap my head around how this line in the article could be defensible. Like, if I don’t understand how Japanese is pronounced or written, and I just rely on Hepburn, I guess pasting these fragments of Hepburn together doesn’t produce the right Hepburn in the end?

YMMV indeed, but I think the lesson here is “this is why you don’t use Hepburn when you’re writing an article about Japanese verb conjugations”.

Hepburn does make sense for somebody with zero knowledge of Japanese but it just gets in the way when you are trying to explain how Japanese works. So lesson zero is “don’t rely on Hepburn” and IMO if you are interested in pronunciation and listening you should be using audio as your primary source.
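To make the concatenation point concrete, here is a minimal Python sketch. The three-verb lexicon and the names `naive_polite` and `mismatches` are made up for illustration; all it does is check whether gluing a consonant stem onto "imasu" gives the accepted spelling in each system.

```python
# Hypothetical mini-lexicon: consonant stem -> accepted polite form,
# spelled in each romanization (hanasu, matsu/matu, kaku).
NIHON_SHIKI = {"hanas": "hanasimasu", "mat": "matimasu", "kak": "kakimasu"}
HEPBURN = {"hanas": "hanashimasu", "mat": "machimasu", "kak": "kakimasu"}


def naive_polite(stem):
    """Paste the stem and the suffix together with no respelling."""
    return stem + "imasu"


def mismatches(lexicon):
    """Stems whose naive concatenation differs from the accepted spelling."""
    return [stem for stem, correct in lexicon.items()
            if naive_polite(stem) != correct]


print(mismatches(NIHON_SHIKI))  # no respelling needed for these verbs
print(mismatches(HEPBURN))      # stems ending in s and t need respelling
```

With these three verbs, plain concatenation is always right in Nihon-shiki, while Hepburn needs an extra respelling step for the stems ending in "s" and "t".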

I’m saying that Hepburn is isomorphic to Nihon-shiki since each is an encoding of kana. Each of them is a bijection to kana (actually that’s wrong; see EDIT below), therefore there’s a bijection between them. Obviously I’m not saying that arbitrary Latin characters are isomorphic to kana; that would make zero sense.

I sympathise with your point about the benefits of Nihon-shiki romanization here. It might’ve been a better choice for this article.

> I cannot wrap my head around how this line in the article could be defensible

I think the reader would just read the next section where I use your argument to critique my own approach? And then make up their own mind whether it’s defensible to do something in the article, to raise pros/cons for why I did it, and then to keep on with the choice.

I wanted to illustrate this confusing point, and that’s how I chose to illustrate it. I think it’s confusing either way. I trust that a reader who actually wants to learn, and isn’t just being a pedant, would carry away the right set of conclusions, and would understand the isomorphism (again — see EDIT below) after those two sections.

> Like, if I don’t understand how Japanese is pronounced or written, and I just rely on Hepburn, I guess pasting these fragments of Hepburn together doesn’t produce the right Hepburn in the end?

Yeah. So that’s a learning opportunity that kana row shifting doesn’t quite follow rules you might expect from many other languages. Maybe that’s a clunky way to introduce it. I personally like this framing. As I noted somewhere else, you could imagine that I’ve chosen IPA notation instead.

EDIT: Actually wait, Hepburn is not bijective for zu and ji. I haven’t thought about that. It’s not relevant to any of the conjugations so it doesn’t break the article, but that may be a good argument that it’s not worth the effort rescuing Hepburn.

Because of this incomplete bijectivity, some applications, e.g. dictionaries, use a modified Hepburn system that is bijective, e.g. by writing づ as "dzu" and ず as "zu" instead of Hepburn "zu" for both (and likewise ぢ as "dji" and じ as "ji").
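A small Python sketch of that collision, restricted to the four kana involved. The table names and helper functions are hypothetical, and the "modified" table simply encodes the dzu/dji convention described above rather than any particular dictionary's scheme.

```python
# Illustrative subset only: the four kana where Hepburn collapses pairs.
HEPBURN = {"じ": "ji", "ぢ": "ji", "ず": "zu", "づ": "zu"}
NIHON_SHIKI = {"じ": "zi", "ぢ": "di", "ず": "zu", "づ": "du"}
# Assumption: a modified Hepburn using the "dji"/"dzu" spellings.
MODIFIED_HEPBURN = {"じ": "ji", "ぢ": "dji", "ず": "zu", "づ": "dzu"}


def is_injective(table):
    """True if no two kana share a romanization, so the map can be inverted."""
    return len(set(table.values())) == len(table)


def collisions(table):
    """Group kana by romanization; keep groups with more than one kana."""
    groups = {}
    for kana, roman in table.items():
        groups.setdefault(roman, []).append(kana)
    return {roman: kana for roman, kana in groups.items() if len(kana) > 1}


for name, table in [("Hepburn", HEPBURN),
                    ("Nihon-shiki", NIHON_SHIKI),
                    ("modified Hepburn", MODIFIED_HEPBURN)]:
    print(name, is_injective(table), collisions(table))
```

Only the plain Hepburn table fails the check, which is why going from "ji" or "zu" back to kana needs extra information such as a dictionary lookup.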

Hiragana also has its problems, because the hiragana used before WWII corresponded with an ancient pronunciation of Japanese, from many centuries ago, which no longer matched the modern Japanese pronunciation.

After WWII, under the American occupation, there was a reform of the writing system, which replaced many kanji used before WWII and also changed the hiragana spelling of many words.

In general the modern hiragana spelling has been changed to match the modern pronunciation, but there are a few survivals of the older spelling that lead to inconsistencies.

As an example, the hiragana syllable now romanized as "ha" was pronounced for some time several centuries ago as "fa-" in initial syllables and as "-va-" in internal syllables. Then the pronunciation shifted to "ha-" in initial syllables and to "-wa-" in internal syllables. After WWII the "ha" hiragana character was replaced by the "wa" character in most internal syllables, to match the new pronunciation, except in the "-wa" postposed particle, where the "ha" hiragana character was retained, despite the pronunciation. The particle is now romanized as "wa", so going backwards to hiragana would produce the wrong hiragana character, another example of non-bijectivity, besides "zu" and "ji". Yet another non-bijectivity example is that the postposed particle normally romanized as "-o" actually uses the hiragana character "wo".

The changes in hiragana spelling after WWII are also responsible for the fact that many Japanese words reproduced in old books written in English, e.g. from the 19th century, appear quite different from how they are written today in the modern Hepburn romanization.

> I think the reader would just read the next section where I use your argument to critique my own approach? And then make up their own mind whether it’s defensible to do something in the article, to raise pros/cons for why I did it, and then to keep on with the choice.

I think that’s a long wait; I don’t want to rely too heavily on analogies but it is like teaching somebody arithmetic with Roman numerals and then explaining in a parenthetical that there are other ways to do arithmetic (but not naming them). Maybe the reader can make up their own mind, but I don’t think the pros and cons are raised in the article, or if they are raised, I couldn’t find them.

I don’t want to pile on here but it sounds like you are, in this conversation, learning about why the different romanizations exist and what the pros and cons are. Or if you already knew, you are getting what they call an object lesson. (Like you noted—in Hepburn, ji and zu correspond to two different kana each.)

> As I noted somewhere else, you could imagine that I’ve chosen IPA notation instead.

This just resurfaces a similar problem with different symbols—if you put your IPA notation in slashes // you get phonemes, which will get you something mostly equivalent to Kunrei-shiki romanization. If you put your IPA in brackets [] then you get something sort of equivalent to Hepburn (in that it’s designed to show pronunciation). Both choices will on some level obscure a regular pattern that could be revealed with kana or romaji. Orthography is funny like that; in both Japanese and English it can show the origin of words even when the pronunciation changes.

I think the other lesson here is that students will mostly learn morphophonology intuitively by absorbing examples with some light explanations of the rules, and if you overexplain the rules you end up with too much “scaffolding” which gets in the way. Like when people use mnemonics or try to memorize kanji by thinking pictorially.

The problem is that your explanation confuses phonemes with letters.

A spoken language is described by decomposing the spoken words into phonemes, where phonemes are sounds that distinguish words, in the sense that replacing one phoneme in a word with another phoneme will produce a different word.

While ideally each phoneme should be recognized by a distinct pronunciation, in the majority of the languages of the world a phoneme does not have a single pronunciation, but it is pronounced in different ways, depending on the context.

It does not matter at all how one chooses to write a Japanese word, with hiragana or with one of the various methods of romanization. For any writing system, you must know the correspondence between phonemes and how they are written. Very few writing systems have a one-to-one mapping between phonemes and letters.

The Hepburn romanization does not attempt to be a phonemic writing system, but it attempts to be close to a phonetic writing system from the point of view of an English speaker. The Kunrei-shiki romanization attempts to be closer to a phonemic writing system than to a phonetic writing system. In my opinion a phonemic writing system is superior to a phonetic writing system, but it appears that for most English speakers it has been too difficult to understand the difference between such writing systems, so the Japanese government eventually gave up and switched to Hepburn, to please the less sharp-witted English-speaking visitors.

Japanese has an "s" phoneme, which happens to be pronounced differently before the vowel "i" than before the other vowels, and before "i" it is pronounced similarly to an English "sh".

In the same way, the Japanese phoneme "t" is pronounced before "i" similarly to an English "ch".

Once you know these two rules, and the few other rules about the other Japanese phonemes whose pronunciation depends on the context, like "n" becoming "m" before "b", there is no point in mentioning them again.
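As a rough illustration, the three context rules just named can be written down once and applied to any phonemic spelling. This is a toy Python sketch, not a full account of Japanese phonology: `RULES` and `to_surface` are invented names, the input is assumed to be a Kunrei-style spelling, and only the rules mentioned above are covered.

```python
import re

# Each rule rewrites a phoneme only in the stated context (lookahead),
# leaving the following letter untouched.
RULES = [
    (re.compile(r"s(?=i)"), "sh"),  # /s/ before i: similar to English "sh"
    (re.compile(r"t(?=i)"), "ch"),  # /t/ before i: similar to English "ch"
    (re.compile(r"n(?=b)"), "m"),   # /n/ before b: pronounced as "m"
]


def to_surface(phonemic):
    """Apply every context rule in order to a phonemic spelling."""
    out = phonemic
    for pattern, replacement in RULES:
        out = pattern.sub(replacement, out)
    return out


for word in ["hanasimasu", "matimasu", "sinbun", "kakimasu"]:
    print(word, "->", to_surface(word))
```

The conjugation itself never has to mention these rules; they run as a separate pass over whatever word the conjugation produced.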

In your discussion about conjugation there is nothing exceptional about the variations in pronunciation that are reflected in the Hepburn romanization. They are just the general rules of Japanese pronunciation, the same as for any other words.

So any discussion about these spelling variations is misplaced in the discussion about conjugation, where it occupies a space without contributing anything to the understanding of the conjugation rules.

Otherwise, I think that your article is fine.

I get that. I’ve compressed this as an aside in an article about something else. It’s a choice; like an article about React could spend a bit of time on arrow functions vs function declarations as a choice. Or even let/var. It all depends on audience’s prerequisite knowledge and what you choose to assume.

I strongly suspect that if I were using Kunrei-shiki, there would be just as many comments here saying my article is wrong because “si” is pronounced closer to English “shi”, but my article makes it seem like it doesn’t — so this is why you should learn kana bla bla bla.

I assume my reader (1) has zero prerequisites and (2) wants words to sound right the first time they see them. Those are the constraints that motivated my approach. You could argue that it’s a strange set of constraints to pick when teaching but I wanted it to be fun.

I've looked at over a dozen hiragana tables and they all use Hepburn romanization.

Obsessing over romanization, something that a student ought to outgrow, is a surefire way for a student to get overwhelmed by irrelevant details that discourage learning. The hard part is putting in the work, not learning fewer than a dozen exceptions.

Hepburn is the official romanization chosen by the Japanese government (it's a relatively recent change). Kunrei-shiki has been deprecated, and all the signs etc. are in the process of being converted to Hepburn.