by monodeldiablo 1282 days ago
A small correction to your correction: The letter 'nj' is actually a single character in the Croatian/Slovenian and Latinized Serbian/Montenegrin/Bosnian alphabets (go ahead and select it with your cursor!), as are the letters 'lj' and 'dž'.

While they are certainly digraphs, they are regarded as a single letter and not a 2-character sequence. They have their own sound, sort order, Unicode designation, and orthography. For example, in advertising where a word is written vertically, the 'l' and 'j' of 'lj' stay together rather than being stacked on separate lines, and a hyphen never separates the 'l' from the 'j'.
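
To make the "Unicode designation" point concrete, here is a minimal Python sketch. It assumes the precomposed lowercase code points U+01C6, U+01C9 and U+01CC from the Latin Extended-B block; the `DIGRAPHS` table and the `describe` helper are names made up for illustration.

```python
import unicodedata

# Typed two-character form -> precomposed single code point (lowercase only).
DIGRAPHS = {
    "d\u017e": "\u01c6",  # dž
    "lj": "\u01c9",
    "nj": "\u01cc",
}

def describe(ch):
    """Return (code point, Unicode name, NFKC expansion) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.normalize("NFKC", ch),
    )

for typed, single in DIGRAPHS.items():
    code, name, expanded = describe(single)
    # Each precomposed letter is one code point; NFKC compatibility
    # normalization expands it to the two-character form people usually type.
    print(code, name, len(single), len(expanded), expanded == typed)
```

Each of these is one code point with its own name, and compatibility normalization (NFKC) turns it into the two-character sequence, which is roughly what happens by hand when people type the letters separately.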

However, since the letters 'n' and 'j' already exist on a keyboard, it's easier in this electronic era for people to type them separately than to hunt for an 'nj' key. The two-character sequences you see representing those letters are a compromise between modern electronics and expediency, not something innate to the alphabet itself.

TL;DR: Each letter in Serbian Cyrillic maps 1:1 to a single letter in Gaj's Latin alphabet, as each alphabet was specifically designed such that each character represents exactly one phoneme.
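
A rough sketch of that 1:1 mapping in Python, lowercase only. The names `CYR`, `LAT`, `TABLE` and `to_latin` are my own for illustration, and the Latin side is written with ordinary typed sequences ("lj", "nj", "dž") rather than the single-character forms.

```python
# The 30 lowercase letters of Serbian Cyrillic, in alphabet order.
CYR = "абвгдђежзијклљмнњопрстћуфхцчџш"

# The corresponding letters of Gaj's Latin alphabet, in the same order.
LAT = [
    "a", "b", "v", "g", "d", "đ", "e", "ž", "z", "i",
    "j", "k", "l", "lj", "m", "n", "nj", "o", "p", "r",
    "s", "t", "ć", "u", "f", "h", "c", "č", "dž", "š",
]

# One Cyrillic letter maps to exactly one Latin letter, even when that
# Latin letter is typed as two characters.
TABLE = str.maketrans(dict(zip(CYR, LAT)))

def to_latin(text):
    """Transliterate lowercase Serbian Cyrillic to Gaj's Latin alphabet."""
    return text.translate(TABLE)

print(to_latin("љубав"))
print(to_latin("њива"))
```

Going the other way from typed text is where the two-character compromise bites: a naive reverse lookup cannot tell the letter 'nj' from an 'n' that happens to be followed by a 'j' (as, I believe, in "injekcija").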

3 comments

This is not a unique development, of course; the letter W/w was originally a digraph written VV or uu (where V/u is of course a single letter).
This reminds me of being taught, back when I was in school, that "ch" and "ll" were each a single letter in Spanish, before they were re-digraphed in the late '90s.

Adapting alphabets to languages has a long history: just look at how the Greeks butchered the Phoenician writing system with weird concepts like "vowels" and "F". This is something that makes languages unique, and it should be chosen over having digraphs or diacritics.

A small correction - lj and nj are separate characters in the Slovene alphabet.
I stand corrected³!