by darklajid 5290 days ago
That's a fascinating topic for me, for two reasons:

1) At the local Hebrew lessons I met a minister of the embassy of South Korea. He told me that Korean is a praised language all over the world (it was news to me - make of it what you want) for its simplicity and therefore speed for typists. He elaborated and said that both the layout (keyboard, I assume) would be very sensible and every 'character' is actually a combination of consonant-vowel-consonant and thereby simple (triplets, always) and carrying a lot of information. Since then I'd like to learn more about this idea and confirm or bust that claim.

2) Learning Hebrew is hard. A real quote from a coworker was "It's an easy language! We only have 22 letters, after all". Reduce your alphabet (alephbet?) from 26 to 22. Note that of these letters, 5 are only special versions of other letters and replace those in the last position of the word. Which leaves 17 letters for most words/the meat of the language. And most words are rather short (okay, okay.. I'm not comparing to German here, that would be pointless. Even compared to English it seems to be the same or shorter to me).

Bottom line: I still have a bet going that I can generate Hebrew line noise (following the rules of going with the 17 letters and adding the required sofit/end letter if required. Gibberish ending in נ would be 'fixed' to end in ן) and will hit word after word. On my list of possible weekend projects I have an entry 'Hebrew or not' to crowd-source this.
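A rough sketch of the line-noise generator, for anyone who wants to try the bet. It assumes the 22 base letters plus the five final forms (the count that gets corrected further down the thread), and the names `BASE_LETTERS`, `SOFIT`, `fix_final` and `gibberish` are made up for illustration. Checking the output against a real dictionary is left out entirely.

```python
import random

# The 22 base letters of the Hebrew alphabet (final forms excluded).
BASE_LETTERS = "אבגדהוזחטיכלמנסעפצקרשת"

# Letters that take a special final (sofit) form at the end of a word.
SOFIT = {"כ": "ך", "מ": "ם", "נ": "ן", "פ": "ף", "צ": "ץ"}


def fix_final(word):
    """Swap the last letter for its sofit form when it has one."""
    if word and word[-1] in SOFIT:
        return word[:-1] + SOFIT[word[-1]]
    return word


def gibberish(length, rng=random):
    """Return `length` random base letters with the final-letter rule applied."""
    raw = "".join(rng.choice(BASE_LETTERS) for _ in range(length))
    return fix_final(raw)


if __name__ == "__main__":
    # Print a few candidates to show to a Hebrew speaker.
    for _ in range(5):
        print(gibberish(random.randint(2, 4)))
```

Passing a seeded `random.Random` as `rng` makes a run repeatable, which would help if several people are supposed to rate the same candidates.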

3 comments

"[Hebrew] only has 22 letters, after all. [...] Reduce your alphabet (alephbet?) from 26 to 22. Note that of these letters, 5 are only special versions of other letters and replace those in the last position of the word"

That's incorrect. All 22 letters are actually letters, the special versions of letters for the ends of words are not counted towards the full 22.

Also, I think Hebrew words tend to be shorter because they lack vowels. There are ways to add something similar to vowels to words, by adding pronunciation guides to each letter. These are usually not included in most Hebrew writing, but this trusts that the reader already knows how to pronounce the word.

I'm not a linguist, so I'm not sure this vowel thing matters, but that's my guess as to why Hebrew words are shorter.

Ugh. Thanks for pointing out my counting mistakes. Mea culpa.

Vowels: You're right, of course. My bet originated during lunch talks. My intuition (in other words: more stupid mistakes ahead, maybe..) says that by leaving out the vowels and overloading letters (b or v? f or p? u, o or v? etc. pp.) the language loses a lot of error correction margin [1] and leads to more collisions/a denser field of 'actual words' [2].

1: That refers to the ability of taking western languages and removing all vowels there. Or stripping out random letters etc. I'm certainly _far_ _far_ from an adept reader here, so I'm musing about things that interest me although I lack the required experience.

2: Which leads to my 'Hebrew or not' idea. My gut says that randomly pounding the keyboard results in 'real words' a lot more often.

Not bashing Hebrew. I even like the script by now (in the beginning hand-written text looked especially random to me).
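To make the "denser field" intuition concrete, here is a toy sketch that strips the vowels from a handful of English words and groups the ones that collide. The word list and the helper names `strip_vowels` and `collisions` are invented for illustration, and a sample this small proves nothing about either language; it only shows the kind of measurement one could run on a real word list.

```python
from collections import defaultdict

VOWELS = set("aeiou")


def strip_vowels(word):
    """Lowercase the word and drop a, e, i, o, u."""
    return "".join(ch for ch in word.lower() if ch not in VOWELS)


def collisions(words):
    """Map each vowel-less skeleton to the words sharing it (2 or more only)."""
    groups = defaultdict(set)
    for w in words:
        groups[strip_vowels(w)].add(w.lower())
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}


sample = ["bad", "bed", "bid", "bud", "cat", "cut", "dog"]
print(collisions(sample))
# {'bd': ['bad', 'bed', 'bid', 'bud'], 'ct': ['cat', 'cut']}
```

The more words end up sharing a skeleton, the more the reader has to lean on context, which is the lost "error correction margin" in other words.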

I hope a linguist will chime in here, it's an interesting subject and I'm wondering if my guesses on the vowel-less nature of Hebrew are correct.

I'm not sure how much this will change your "random pounding on keyboard" idea, but English has a much larger vocabulary than Hebrew.

P.S. I see you're in Israel now, hope you're enjoying your time in Tel Aviv.

Tel Aviv: My first winter wearing t-shirts. Cannot complain. :)

vowel-less: (I don't need to repeat the 'I suck at Hebrew' disclaimer, right?) Kind of. From what I know:

There are no explicit letters for 'a', 'e'

You can represent an 'o' or a 'u' with a ו (otherwise used as consonant, 'v'). I know for a fact that words with 'o' can be written without a 'vowel' letter (לא for example: no). I don't know if 'u' is always represented as a letter of its own.

'i' is often used as 'ij' or 'ji' and teams up with 'י = j' in that case. It can be represented without a letter just as well though.

Bottom line: Except for 'u' (no idea about that one? maybe just as well?) you can have all vowels 'hidden' in plain sight.

Would love to have someone from IL chime in here though and correct all my mistakes.

I'm from IL :) [1]

You're right about the י and ו replacing vowels much of the time. Inspired by this discussion, I went and read a little more about the history of Modern Hebrew, and stumbled on this page in Wikipedia: http://en.wikipedia.org/wiki/Ktiv_male

The idea of writing vowel-like signs into the letters is called Nikkud. But apparently, since most people don't write Nikkud, the Academy of the Hebrew Language wrote a set of rules explaining how to exchange Nikkud for letters that will serve as vowels. I had no idea it was so deliberate, but this explains why there are many words which people here write differently.

I still think there is more ambiguity in Hebrew. Or, as you put it, less "error correction margin". Even with "Ktiv Male".

[1] I lived abroad for a few years, so I wasn't in Israel from the 2nd grade to the 7th. This means I missed a lot of the "traditional" learning process of learning about the language, grammar, etc. So I'm a native Hebrew speaker, but I have some gaps in my knowledge about the correct way to do things, etc.

I learned a bit of Korean script and could touch type in it on a US keyboard. It's a fairly nice system, although there seems to be a fair bit of non-rational pride in it, too. Woe to anyone that points out any similarities to previous writing systems...

But it's not just triplets. You can have 2, 3, or 4 components per character (Wikipedia says 5, but I'm not sure how that works). The first component must be a consonant, but there's a null consonant too, allowing you to create syllables with just a vowel sound.
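For the curious, the way Unicode lays out precomposed Hangul syllables shows the structure nicely: 19 initial consonants, 21 vowels, and 27 finals plus "no final", combined arithmetically starting at U+AC00. A minimal sketch follows; the function name `compose` and the table names are my own. My guess is that the higher component counts come from compound vowels and double finals being typed as separate keys, but that is an assumption.

```python
# Jamo tables in Unicode order (compatibility jamo used for readability).
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
          "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
          "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]  # 27 finals + none

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable


def compose(initial, vowel, final=""):
    """Combine jamo into one precomposed Hangul syllable character."""
    i = INITIALS.index(initial)
    v = VOWELS.index(vowel)
    f = FINALS.index(final)
    return chr(HANGUL_BASE + (i * len(VOWELS) + v) * len(FINALS) + f)


print(compose("ㅎ", "ㅏ", "ㄴ"))  # consonant-vowel-consonant
print(compose("ㅇ", "ㅏ"))        # null initial plus vowel: a bare vowel sound
```

The second call uses ㅇ as the null consonant mentioned above, so the resulting syllable is just the vowel sound with a placeholder in front.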

I don't see how the writing system itself helps typists. If words are shorter, that's a function of shorter words, not the writing system. Each syllable still requires several keystrokes.

That said, it's very simple to learn. I learned it in a week. But, I also learned the Japanese scripts (hiragana/katakana) in a week as well using James Heisig's awesome book[1]. So, perhaps learning alphabet scripts is just not an overly difficult task in general?

1: http://www.amazon.com/Remembering-Kana-Hiragana-James-Heisig...

From what I understand, the Korean written language was pretty much invented wholesale by one man (the king at the time, I believe). As such it's a lot better designed and internally consistent than other written languages (which accumulated naturally over time).

That said, while it may be easier to learn, I imagine people still read it at about the same speed they would read English or Chinese. I don't remember where but a while back I read an article saying that basically all spoken languages impart information at roughly the same rate (e.g. languages with higher information density are spoken slower, languages with lower information density are spoken faster). And from the OP's article it sounds like this applies to written languages as well.