Hacker News
by danabramov 3 days ago
The article literally says:

> there is no "si" in the hiragana table, so s_ + (i) = shi. […] this is why it's important that you don't actually "think in" romaji. […] i'm using romaji as a convenient way to refer to phonetics in text. however, your "mental algebra" should match the hiragana table.

That’s just false, “si” is in the hiragana table as し. The romanization “si” is /si/ which is pronounced [ɕi] (or [ɕi̥] or some other possibility). This is basic Japanese phonetics.

If you fix all the errors that are in the article, at best there is an argument buried here that Hepburn romanization should not be used to teach Japanese to English speakers—but I think that point is really my own argument that I’m making with the fragments of the article that make sense.

Romanization can be more consistent with Japanese phonetics or it can be more consistent with English phonetics, and the Hepburn romanization is more consistent with English phonetics, which is why it’s a good choice for English speakers that don’t know Japanese, but a bad choice for English speakers who are trying to learn Japanese.

Okay, we’re fighting over definitions here. There is no “si” in Hepburn romanization. I am intentionally using Hepburn romanization in the article. Therefore, in my article “si” is a compile error.

You may argue with my choice, or maybe you can argue that referring to cells in the hiragana table solely by my chosen romanization is somehow bad, and I should instead be inconsistent and give the same mora two different romanizations within a single article. Is that what you’re suggesting?

First: as a basic part of Japanese language education, students are expected to be familiar with different romanization systems. If you ask a student where “si” is in the table, they should be able to find it. If a student says “it’s not in the table” then they’ve failed the lesson or there is something wrong with the teaching material.

Second: am I arguing that the choice of using Hepburn here is somehow bad? Yes, that’s correct. I think Hepburn is a bad choice here. A good choice is Nihon-shiki. JSL romanization is also fine.

We’re talking about the same thing but you insist that there is only one angle under which things aren’t confusing. I disagree. That’s fine. The two systems are isomorphic, and I genuinely believe that, given I’ve described every single caveat of Hepburn in the article, I’ve paid my dues for using it. YMMV. I even include the “finding in the table” part.

I think I agree that Nihon-shiki and explaining it upfront would’ve made the article more elegant. One constraint I wanted to hit is that a person should be able to read this article with zero knowledge of Japanese, and walk away being able to conjugate almost every verb to every suffix correctly. This is more of a challenge to myself as a writer than any practical need, but I hope it sheds some light on the choices and the framing. I liked Hepburn because it’s closer to how it sounds. You can imagine I’m using IPA instead if you want.

The systems are obviously not isomorphic—Japanese kana are not entirely phonetic (they are just mostly so) and the different romanization systems choose differently whether to follow orthography or phonetics more closely.

> hanas* + (i)masu = hanasimasu (wrong!)

I cannot wrap my head around how this line in the article could be defensible. Like, if I don’t understand how Japanese is pronounced or written, and I just rely on Hepburn, I guess pasting these fragments of Hepburn together doesn’t produce the right Hepburn in the end?
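To make the point concrete, here is a rough Python sketch, not anything from the article: glue stem and suffix together in Nihon-shiki, where plain concatenation works, and only convert to Hepburn afterwards. The names `NIHON_TO_HEPBURN`, `to_hepburn` and `masu_form` are made up for illustration, and the table is deliberately partial (it only lists a few syllables where the two systems disagree).

```python
# Partial, illustrative table: only some syllables that Nihon-shiki and
# Hepburn spell differently. All names here are hypothetical.
NIHON_TO_HEPBURN = {
    "si": "shi",
    "ti": "chi",
    "tu": "tsu",
    "hu": "fu",
    "zi": "ji",
}

def to_hepburn(word: str) -> str:
    """Rewrite a Nihon-shiki spelling as Hepburn, scanning left to right."""
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in NIHON_TO_HEPBURN:
            out.append(NIHON_TO_HEPBURN[pair])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return "".join(out)

def masu_form(stem: str) -> str:
    """Attach (i)masu to a consonant stem written in Nihon-shiki."""
    return stem + "imasu"

print(masu_form("hanas"))              # hanasimasu
print(to_hepburn(masu_form("hanas")))  # hanashimasu
print(to_hepburn(masu_form("mat")))    # machimasu
```

The conjugation step is pure concatenation; the "sh" and "ch" only show up in the final respelling.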

YMMV indeed, but I think the lesson here is “this is why you don’t use Hepburn when you’re writing an article about Japanese verb conjugations”.

Hepburn does make sense for somebody with zero knowledge of Japanese but it just gets in the way when you are trying to explain how Japanese works. So lesson zero is “don’t rely on Hepburn” and IMO if you are interested in pronunciation and listening you should be using audio as your primary source.

I’m saying that Hepburn is isomorphic to Nihon-shiki since each is an encoding of kana. Each of them is a bijection to kana (actually that’s wrong; see EDIT below), therefore there’s a bijection between them. Obviously I’m not saying that arbitrary latin characters are isomorphic to kana, that would make zero sense.

I sympathise with your point about the benefits of Nihon-shiki romanization here. It might’ve been a better choice for this article.

> I cannot wrap my head around how this line in the article could be defensible

I think the reader would just read the next section where I use your argument to critique my own approach? And then make up their own mind whether it’s defensible to do something in the article, to raise pros/cons for why I did it, and then to keep on with the choice.

I wanted to illustrate this confusing point, and that’s how I chose to illustrate it. I think it’s confusing either way. I trust that a reader who actually wants to learn, and isn’t just being a pedant, would carry away the right set of conclusions, and would understand the isomorphism (again — see EDIT below) after those two sections.

> Like, if I don’t understand how Japanese is pronounced or written, and I just rely on Hepburn, I guess pasting these fragments of Hepburn together don’t produce the right Hepburn in the end?

Yeah. So that’s a learning opportunity that kana row shifting doesn’t quite follow rules you might expect from many other languages. Maybe that’s a clunky way to introduce it. I personally like this framing. As I noted somewhere else, you could imagine that I’ve chosen IPA notation instead.

EDIT: Actually wait, Hepburn is not bijective for zu and ji. I haven’t thought about that. It’s not relevant to any of the conjugations so it doesn’t break the article, but that may be a good argument that it’s not worth the effort rescuing Hepburn.
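A quick way to see the collision, as a sketch: map the four kana in question under each system and look for spellings that are shared. The function name `collisions` is just something I made up here, and the tables cover only these four kana.

```python
from collections import defaultdict

# Only the four kana relevant to the zu/ji collision.
HEPBURN = {"じ": "ji", "ぢ": "ji", "ず": "zu", "づ": "zu"}
NIHON_SHIKI = {"じ": "zi", "ぢ": "di", "ず": "zu", "づ": "du"}

def collisions(table: dict) -> dict:
    """Return romanizations that more than one kana maps to."""
    by_spelling = defaultdict(list)
    for kana, roma in table.items():
        by_spelling[roma].append(kana)
    return {roma: kana for roma, kana in by_spelling.items() if len(kana) > 1}

print(collisions(HEPBURN))      # {'ji': ['じ', 'ぢ'], 'zu': ['ず', 'づ']}
print(collisions(NIHON_SHIKI))  # {}
```

So for these kana Hepburn is not invertible, while Nihon-shiki keeps all four distinct.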

The problem is that your explanation confuses phonemes with letters.

A spoken language is described by decomposing the spoken words into phonemes, where phonemes are sounds that distinguish words, in the sense that replacing one phoneme in a word with another phoneme will produce a different word.

While ideally each phoneme should be recognized by a distinct pronunciation, in the majority of the languages of the world a phoneme does not have a single pronunciation, but it is pronounced in different ways, depending on the context.

It does not matter at all how one chooses to write a Japanese word, with hiragana or with one of the various methods of romanization. For any writing system, you must know the correspondence between phonemes and how they are written. For very few writing systems there is a one-to-one mapping between phonemes and letters.

The Hepburn romanization does not attempt to be a phonemic writing system, but it attempts to be close to a phonetic writing system from the point of view of an English speaker. The Kunrei-shiki romanization attempts to be closer to a phonemic writing system than to a phonetic writing system. In my opinion a phonemic writing system is superior to a phonetic writing system, but it appears that for most English speakers it has been too difficult to understand the difference between such writing systems, so the Japanese government eventually gave up and they switched to Hepburn, to please the less sharp-witted English-speaking visitors.

Japanese has an "s" phoneme, which happens to be pronounced differently before the vowel "i" than before the other vowels, and before "i" it is pronounced similarly to an English "sh".

In the same way, the Japanese phoneme "t" is pronounced before "i" similarly to an English "ch".

Once you know these two rules, and the few other rules about the other Japanese phonemes whose pronunciation depends on the context, like "n" becoming "m" before "b", there is no point in mentioning them again.
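As a minimal sketch of what "knowing the rules once" means, the three context rules named above can be written as rewrite rules over a phonemic spelling. This is only an illustration: it covers just these three rules, the output is an English-flavoured approximation rather than real phonetics, and `RULES` and `pronounce` are hypothetical names.

```python
import re

# Each rule rewrites a phoneme only in the given context (lookahead).
RULES = [
    (re.compile(r"s(?=i)"), "sh"),  # "s" before "i" sounds like English "sh"
    (re.compile(r"t(?=i)"), "ch"),  # "t" before "i" sounds like English "ch"
    (re.compile(r"n(?=b)"), "m"),   # "n" before "b" becomes "m"
]

def pronounce(phonemic: str) -> str:
    """Apply the context rules, in order, to a phonemic spelling."""
    for pattern, replacement in RULES:
        phonemic = pattern.sub(replacement, phonemic)
    return phonemic

print(pronounce("sinbun"))  # shimbun
print(pronounce("kati"))    # kachi
print(pronounce("sasu"))    # sasu (no rule applies)
```

The rules are stated once and apply to every word, conjugated or not.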

In your discussion about conjugation there is nothing exceptional about the variations in pronunciation that are reflected in the Hepburn romanization. They are just the general rules of Japanese pronunciation, as for any other words.

So any discussion about these spelling variations is misplaced in the discussion about conjugation, where it occupies a space without contributing anything to the understanding of the conjugation rules.

Otherwise, I think that your article is fine.

I get that. I’ve compressed this as an aside in an article about something else. It’s a choice; like an article about React could spend a bit of time on arrow functions vs function declarations as a choice. Or even let/var. It all depends on the audience’s prerequisite knowledge and what you choose to assume.

I strongly suspect that if I were using Kunrei-shiki, there would be just as many comments here saying my article is wrong because “si” is pronounced closer to English “shi”, but my article makes it seem like it doesn’t — so this is why you should learn kana bla bla bla.

I assume my reader (1) has zero prerequisites and (2) wants words to sound correct when seeing them for the first time. Those are the constraints that motivated my approach. You could argue that it’s a strange set of constraints to pick when teaching but I wanted it to be fun.

I've looked at over a dozen hiragana tables and they all use Hepburn romanization.

Obsessing over romanization, something that a student ought to outgrow, is a surefire way for a student to get overwhelmed by irrelevant details that discourage learning. The hard part is putting in the work, not learning fewer than a dozen exceptions.

Hepburn is the official romanization chosen by the Japanese government (it's a relatively recent change). Kunrei-shiki has been deprecated, and all the signs etc. are in the process of being converted to Hepburn.