by klodolph 3 days ago
The systems are obviously not isomorphic—Japanese kana are not entirely phonetic (they are just mostly so) and the different romanization systems choose differently whether to follow orthography or phonetics more closely.

> hanas* + (i)masu = hanasimasu (wrong!)

I cannot wrap my head around how this line in the article could be defensible. Like, if I don’t understand how Japanese is pronounced or written, and I just rely on Hepburn, I guess pasting these fragments of Hepburn together doesn’t produce the right Hepburn in the end?

YMMV indeed, but I think the lesson here is “this is why you don’t use Hepburn when you’re writing an article about Japanese verb conjugations”.

Hepburn does make sense for somebody with zero knowledge of Japanese but it just gets in the way when you are trying to explain how Japanese works. So lesson zero is “don’t rely on Hepburn” and IMO if you are interested in pronunciation and listening you should be using audio as your primary source.


I’m saying that Hepburn is isomorphic to Nihon-shiki since each is an encoding of kana. Each of them is a bijection to kana (actually that’s wrong; see EDIT below), therefore there’s a bijection between them. Obviously I’m not saying that arbitrary Latin characters are isomorphic to kana; that would make zero sense.

I sympathise with your point about the benefits of Nihon-shiki romanization here. It might’ve been a better choice for this article.

> I cannot wrap my head around how this line in the article could be defensible

I think the reader would just read the next section where I use your argument to critique my own approach? And then make up their own mind whether it’s defensible to do something in the article, to raise pros/cons for why I did it, and then to keep on with the choice.

I wanted to illustrate this confusing point, and that’s how I chose to illustrate it. I think it’s confusing either way. I trust that a reader who actually wants to learn, and isn’t just being a pedant, would carry away the right set of conclusions, and would understand the isomorphism (again — see EDIT below) after those two sections.

> Like, if I don’t understand how Japanese is pronounced or written, and I just rely on Hepburn, I guess pasting these fragments of Hepburn together doesn’t produce the right Hepburn in the end?

Yeah. So that’s a learning opportunity that kana row shifting doesn’t quite follow rules you might expect from many other languages. Maybe that’s a clunky way to introduce it. I personally like this framing. As I noted somewhere else, you could imagine that I’ve chosen IPA notation instead.
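
For what it's worth, the row-shifting idea can be sketched in a few lines of Python. This is only an illustration, not something taken from the article: `ROWS`, `VOWELS`, and `shift` are made-up names, the table covers just a handful of consonant rows, and the spellings are Hepburn. The last consonant of the stem picks a kana row and the vowel picks the column, which is why the s-row yields "shi" rather than "si".

```python
# Hypothetical, partial kana-row table in Hepburn spelling (columns: a i u e o).
VOWELS = "aiueo"
ROWS = {
    "k": ["ka", "ki", "ku", "ke", "ko"],
    "s": ["sa", "shi", "su", "se", "so"],
    "t": ["ta", "chi", "tsu", "te", "to"],
    "n": ["na", "ni", "nu", "ne", "no"],
    "m": ["ma", "mi", "mu", "me", "mo"],
    "r": ["ra", "ri", "ru", "re", "ro"],
}

def shift(stem: str, vowel: str) -> str:
    """Swap the stem's final consonant for the kana in its row at the given vowel column."""
    row = ROWS[stem[-1]]
    return stem[:-1] + row[VOWELS.index(vowel)]

print(shift("hanas", "i") + "masu")  # hanashimasu
print(shift("mat", "i") + "masu")    # machimasu
print(shift("kak", "i") + "masu")    # kakimasu
```

Looking up the row instead of gluing letters together is what keeps the irregular-looking Hepburn spellings out of the rule itself.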

EDIT: Actually wait, Hepburn is not bijective for zu and ji. I haven’t thought about that. It’s not relevant to any of the conjugations so it doesn’t break the article, but that may be a good argument that it’s not worth the effort rescuing Hepburn.

Because of this incomplete bijectivity, some applications, e.g. dictionaries, use a modified Hepburn system, which is bijective, e.g. by using "dzu" and "zu", instead of Hepburn "zu" (and "dji" and "ji").
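
To make the zu/ji collision concrete, here is a minimal Python sketch that checks a romanization table for spellings shared by more than one kana. The tables are a small hand-picked sample rather than full charts, and `collisions` is a hypothetical helper name.

```python
from collections import defaultdict

# Small hand-picked sample of kana; not a complete table.
HEPBURN = {"し": "shi", "ち": "chi", "つ": "tsu",
           "じ": "ji", "ぢ": "ji", "ず": "zu", "づ": "zu"}
NIHON_SHIKI = {"し": "si", "ち": "ti", "つ": "tu",
               "じ": "zi", "ぢ": "di", "ず": "zu", "づ": "du"}

def collisions(table: dict) -> dict:
    """Return every romaji spelling that more than one kana maps to."""
    by_romaji = defaultdict(list)
    for kana, romaji in table.items():
        by_romaji[romaji].append(kana)
    return {r: ks for r, ks in by_romaji.items() if len(ks) > 1}

print(collisions(HEPBURN))      # {'ji': ['じ', 'ぢ'], 'zu': ['ず', 'づ']}
print(collisions(NIHON_SHIKI))  # {}
```

An empty result means the sampled mapping is injective, so it can be inverted back to kana; a non-empty one lists exactly the spellings where the inverse is ambiguous.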

Hiragana also has its problems, because the hiragana used before WWII corresponded with an ancient pronunciation of Japanese, from many centuries ago, which no longer matched the modern Japanese pronunciation.

After WWII, under the American occupation, there was a reform of the writing system, which replaced many kanji used before WWII and it also changed the spelling in hiragana of many words.

In general the modern hiragana spelling has been changed to match the modern pronunciation, but there are a few survivals of the older spelling that lead to inconsistencies.

As an example, the hiragana syllable now romanized as "ha" was pronounced for some time several centuries ago as "fa-" in initial syllables and as "-va-" in internal syllables. Then the pronunciation shifted to "ha-" in initial syllables and to "-wa-" in internal syllables. After WWII the "ha" hiragana character was replaced by the "wa" character in most internal syllables, to match the new pronunciation, except in the "-wa" postposed particle, where the "ha" hiragana character was retained, despite the pronunciation. The particle is now romanized as "wa", so going backwards to hiragana would produce the wrong hiragana character, another example of non-bijectivity, besides "zu" and "ji". Yet another non-bijectivity example is that the postposed particle normally romanized as "-o" actually uses the hiragana character "wo".
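
The particle cases can be sketched as a failed round trip. This is a toy Python illustration under stated assumptions: the per-character table is tiny, the `is_particle` flag stands in for grammatical knowledge that a real converter would have to get from somewhere, and all names are hypothetical.

```python
# Tiny illustrative tables; not a full kana chart.
ROMAJI = {"わ": "wa", "は": "ha", "お": "o", "を": "wo"}
# Assumption: particles are romanized by sound, as described above.
PARTICLE_ROMAJI = {"は": "wa", "を": "o"}
# Naive inverse that knows nothing about particles.
KANA = {romaji: kana for kana, romaji in ROMAJI.items()}

def romanize(kana: str, is_particle: bool = False) -> str:
    """Romanize one kana, using the spoken form when it is a particle."""
    if is_particle and kana in PARTICLE_ROMAJI:
        return PARTICLE_ROMAJI[kana]
    return ROMAJI[kana]

def round_trip(kana: str, is_particle: bool = False) -> str:
    """Romanize, then map back to kana with the naive inverse table."""
    return KANA[romanize(kana, is_particle)]

print(round_trip("は"))                    # は
print(round_trip("は", is_particle=True))  # わ
print(round_trip("を", is_particle=True))  # お
```

The last two lines come back as the wrong character, which is the non-bijectivity being described: the romanized text alone no longer says which kana was written.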

The changes in hiragana spelling after WWII are also responsible for the fact that many Japanese words reproduced in old books written in English, e.g. from the 19th century, appear quite different from how they are written today in the modern Hepburn romanization.

> I think the reader would just read the next section where I use your argument to critique my own approach? And then make up their own mind whether it’s defensible to do something in the article, to raise pros/cons for why I did it, and then to keep on with the choice.

I think that’s a long wait; I don’t want to rely too heavily on analogies, but it is like teaching somebody arithmetic with Roman numerals and then explaining in a parenthetical that there are other ways to do arithmetic (but not naming them). Maybe the reader can make up their own mind, but I don’t think the pros and cons are raised in the article, or if they are raised, I couldn’t find them.

I don’t want to pile on here but it sounds like you are, in this conversation, learning about why the different romanizations exist and what the pros and cons are. Or if you already knew, you are getting what they call an object lesson. (Like you noted—in Hepburn, ji and zu correspond to two different kana each.)

> As I noted somewhere else, you could imagine that I’ve chosen IPA notation instead.

This just resurfaces a similar problem with different symbols—if you put your IPA notation in slashes // you get phonemes, which will get you something mostly equivalent to Kunrei-shiki romanization. If you put your IPA in brackets [] then you get something sort of equivalent to Hepburn (in that it’s designed to show pronunciation). Both choices will on some level obscure a regular pattern that could be revealed with kana or romaji. Orthography is funny like that; in both Japanese and English it can show the origin of words even when the pronunciation changes.

I think the other lesson here is that students will mostly learn morphophonology intuitively by absorbing examples with some light explanations of the rules, and if you overexplain the rules you end up with too much “scaffolding” which gets in the way. Like when people use mnemonics or try to memorize kanji by thinking pictorially.

I genuinely hadn’t thought about zu/ji here (conceded!). It’s not relevant to conjugation, though.

In general, I find your attitude a bit condescending. This is what I wrote about my choice:

> note i could also have used a different romanization that renders し as "si", つ as "tu", and ち as "ti" for this article. i decided to not because everyone else uses romaji, and once you understand this point once, you shouldn't have a difficulty doing this in your head

My main mistake seems to be meaning “[Hepburn] romaji” by writing “romaji”. I was obviously aware of other systems because that is what the sentence says but I thought it’s acceptable to refer to Hepburn as just “romaji” as a sort of the default one. Maybe that’s wrong.

Other than this terminology nit, I think I’ve made myself quite clear there. I genuinely don’t think it’s a big deal. Maybe I overestimate my readers’ intelligence but I don’t find this difficult to live with at all once you get it.

Roman numerals are a funny parallel, but it doesn’t hold very well. The cost of using Hepburn is an O(1) adjustment: for conjugation, you only have to “remember” three special cases, and they’re always applied just-in-time. They’re just substitutions, and arguably phonetically inherent ones. Arithmetic with Roman numerals requires many stacked adjustments where you have to match pairs of things. And the lack of orders of magnitude really screws with the ability to do multiplication. This just isn’t an intellectually honest comparison.
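
A minimal Python sketch of the "three substitutions" claim: glue the stem and suffix together as plain letters, then patch the three spellings. The names `HEPBURN_FIXUPS` and `attach` are hypothetical, and the global string replace is a simplifying assumption that happens to work for these examples; it is not a general romanization converter.

```python
# The three special cases (si, ti, tu), applied after plain concatenation.
HEPBURN_FIXUPS = (("si", "shi"), ("ti", "chi"), ("tu", "tsu"))

def attach(stem: str, suffix: str) -> str:
    """Concatenate a consonant stem and a vowel-initial suffix, then patch to Hepburn spelling."""
    word = stem + suffix
    for plain, hepburn in HEPBURN_FIXUPS:
        # Naive global replace; assumed safe only for simple examples like the ones below.
        word = word.replace(plain, hepburn)
    return word

print(attach("hanas", "imasu"))  # hanashimasu
print(attach("mat", "imasu"))    # machimasu
print(attach("mat", "u"))        # matsu
print(attach("kak", "imasu"))    # kakimasu
```

The fix-up list has a fixed length no matter how many verbs or suffixes are involved, which is the sense in which the extra work is constant.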

Re: your last point I actually kind of agree. I’m that annoying student who likes to un-extrapolate backwards from examples to the rules, knowing which gives me a warm fuzzy feeling, after which I can go back to examples. My article is for people like me. Maybe there’s a few more of them.

> In general, I find your attitude a bit condescending.

Yeah—I can understand why I’d come across as condescending. There’s a balance here—I want to be clear when I say that I have problems with the article, but I don’t want to be hurtful and I don’t want to make criticisms that are not supported by the text.

Rather than defend my comments as “correct”, let’s say that I failed in my goal of not coming across as condescending. The reason I want to frame it this way is that, similarly, I think the article failed in its goal of coming across (to me) as “look at this neat thing about Japanese”.

It is just kind of the nature of written communication that it takes a lot of editing and polish to make it clear, correct, and concise. I had the good fortune to sign up for Japanese 101 when my professor was in the middle of writing a new Japanese textbook—it was pretty exciting, with the changing lesson plans, the flock of master’s students hanging around, revisions and drafts to teaching materials, and those endless hours of classroom observation. The teachers occasionally gave us a “peek behind the curtain” and explained why they chose to teach things a certain way or another. I’ve rarely gotten that kind of explanation in any class that I’ve taken so I thought it was pretty special.

I don’t expect you to put textbook-level polish into your article, but there is a kind of verbosity (the article is long, which makes it kind of hard to respond to because there is just so much to sift through), there are some problems with clarity (the issue of romanization and orthography is mixed in with the conjugation, and maybe it would be better to separate those issues), some problems with correctness (various), and some problems with completeness (the patterns omit some conjugations that I think you don’t know, and I don’t think they follow the pattern).

I have certainly put effort into articles that have gotten brutal negative feedback; I think it was right for me to write the article, and then feel like shit from the feedback, and then maybe retract and revise it. If there is one actual error here, a true error, I think the error is fighting out criticism in the HN comments.

Maybe it helps to explain the purpose of the article. My purpose is to help people who have fallen off originally with traditional approaches by showing an alternative way to build up the intuition. This precise order of layering is what I found most helpful, so that’s why I wrote it in that order. The constraints I’ve chosen (assume the reader has zero knowledge; write things as they sound; give an almost complete system in one evening) are maybe strange. And yes, my style is verbose and you could compress that by a lot if you don’t mind people stumbling. I tried to hand-hold every transition pretty closely.

So, on verbosity: that’s a stylistic choice. Not for everyone. For romanization: point taken and I agree frontloading it would’ve been more elegant. Though I kind of don’t like that it sounds wrong for an unprepared speaker.

For correctness: please provide specific issues. I’ll try to fix them. This is the part I actually care about.

For completeness: yes, some things I put out of scope break the pattern (or rather extend it — the mechanism of concatenation is the same but it actually may be easier to hard-split it by godan/ichidan). I genuinely think that by the point you learn those, you don’t need the scaffolding anyway, and the model has done its job.

I don’t feel like shit from the feedback. This is not my first rodeo. Where I have correctness issues, I would like them pointed out so I can fix. The handwringing about it being a weird way to teach — not so much. I know it’s weird; I wrote it because that’s what worked for me.

And fighting out the criticism in HN comments is half of the fun, isn’t it? :)

Again, the conceit of the article is you can learn almost the entire conjugation system in a single evening with no prior knowledge of the language. I invite you to step back for a moment, to accept that conceit as valid, and then to judge the article based on that conceit. For a serious learner, think of it as a fever dream that helps the concepts click next time you see them “properly”. For a tinkerer, think of it as a spark that gets you curious about the language.

Actually let me just try to explain my pedagogical approach and philosophy here. Maybe that makes it clearer.

I assume no prerequisites at first. So my reader has never seen a kana table and doesn’t know which syllables exist.

I choose to teach conjugation first. That’s an unorthodox choice but I like it! That’s what I set out to do. So we get far enough until it breaks down. And it breaks down when a rule (which worked so far) doesn’t help with “s” because saying “si” would sound wrong.

That’s the moment I use to teach the kana table and its importance. This “you made a mistake” is a pedagogical vehicle for introducing kana rows. And we go over the exact ones that you’d make a mistake with. So each special case is walked through.

At this point we could discard Hepburn but I choose to keep going because if you know special cases, there’s no issue. And at some point you’ll learn kana anyway.

So that’s how I chose to layer it. Maybe it’s a bit unholy but I like it. It is definitely self-consistent.

I understand why you wrote the article this way, but I think the lesson here is “we have learned why Japanese textbooks do not teach the content in this order” and there are a couple of reasons why this order is not good:

1. It relies on people not understanding certain things. In general, you cannot expect people to have exactly the right misunderstanding necessary for a lesson.

2. Spending extra time with Hepburn reinforces it, and it shouldn’t be reinforced.

I am in general extremely skeptical of lessons which try to engineer a way for the students to make mistakes. What I have seen in real classrooms and in informal teaching is that the mistakes are habit-forming and the outcomes of this kind of engineering are unpredictable.

Mistakes are appealing to the developers on HN because we understand things more by seeing them fail. But this does not mean that you can engineer somebody to experience the same moment of enlightenment that you did, because it requires constructing the same (incorrect) mental model that you had when you made the mistake that led to a useful insight, and it is both difficult and counterproductive to try and make that happen to students. Give people the best chances to learn by giving them the best chances to avoid mistakes, and the mistakes and insight will happen organically on their own, in unique ways for each student.

Okay I see where you’re coming from, sure. We assume different things about the reader and the context of the article. This is an article for people like me who like this approach — maybe engineers or people with an engineering mindset.

I also genuinely think it’s not that deep and that there’s no complex mistake being engineered here. I don’t believe you that the “mistake” of “sa with a replaced by i must be si” is an unusual one for someone who hasn’t yet internalized kana. If we test this on random people on the street, I’m highly confident an overwhelming majority will make this exact mistake.

I agree with your broader point that “teaching via mistakes” is a risky path not worth it when the mistakes start getting combinatorial. I also think it’s absolutely fine when everyone does the same exact mistake, and there’s exactly one way to avoid it.

> My main mistake seems to be meaning “[Hepburn] romaji” by writing “romaji”. I was obviously aware of other systems because that is what the sentence says but I thought it’s acceptable to refer to Hepburn as just “romaji” as a sort of the default one.

"Romaji" does not (in English) mean "romanisation", as most people who've studied Japanese to at least beginner level know.

“Romaji” in Wikipedia literally redirects to “Romanization of Japanese” which says “This method of writing is sometimes referred to in Japanese as rōmaji”. It sounds like your effort might be better spent fixing those problems in Wikipedia before continuing with the rest of the discussion here.

> "Romaji" does not (in English) mean "romanisation"

> This method of writing is sometimes referred to in Japanese as rōmaji

See how there's not actually a contradiction there?