by evmar 11 days ago
One thing I sometimes think about when I think about text layout problems is how the text we use also has a bunch of complexities that we can take for granted.

Think of variable-width characters and kerning and ligatures and hyphenation and justification. Imagine computers had been won by the CJK languages, which have none of these problems. You could imagine a similar article about how exotic and difficult English layout is.

Both Latin and Chinese have been modified by the technology used to write them.

When carved in stone, the lines were much straighter. When written with brush or pen, they became semi-cursive. When printing was introduced, they became grid-like and regular.

What westerners who are passingly familiar would think of as the standard Chinese typeface - the strict square grid with straight-line characters - arises in part from printing technology. Easy to carve that into wood blocks, and easy to line up the slots into a grid.

Latin was similarly morphed to fit into the realities of printing in the 1500s. And is still being morphed. Notice how numbers 123... are in-line and at the same height as the letters. That's a very modern convention, typewriter and computer influence on our orthography. Traditionally digits were more likely to appear as subscript, off-centre.

what selective pressures against oldstyle numerals with ascenders/descenders existed that wouldn't have equally applied to letterforms with those same features?

(aha i have found the answer to my own question: miniaturization for fractions in phototypesetting)

Not really. The selective pressure comes from well before that: tabular presentation of numbers. Whether for log/trig tables or railroad timetables, there was a preference for uniform-width, regular-height characters in those contexts (this is also why there is a number-width parameter in TT typography that lets a designer make digits variable-width in text but still allow tabular setting if desired).
The other part is that numbers and symbols were very much not the priority. The printing press was for books, magazines, etc. Math remained handwritten until the computer.
Nope, not at all. Monotype had a special system for doing math in hot metal typesetting. With handset type it was possible, but very time-consuming. You can find typeset mathematics going back centuries before the computer. There were also (somewhat impractical) systems for setting music with metal type although engraving was more common because of the interactions of lines and symbols.
Conversely, English has a joined form (cursive) that is nearly dead because mechanical text assistance devices (first typewriters, now computers) work much better with the block form. While sad in a cultural-loss sort of way, the joined form only really makes sense when the text is handwritten.

I am not familiar with the history of Arabic typography, but I sort of assume there was an archaic block form and that their current joined form is the result of many centuries of encoding handwriting practice, advanced enough that falling back to a block form is impossible, with the side effect of making simple mechanical text formatting also impossible.

As for Chinese-derived characters, we currently are able to jam them awkwardly into our alphabet-optimized structures (one code per character), but I wonder if a Chinese-native encoding would look different. Would it make sense to try to represent the sub-characters present in each Chinese character in the encoding? I suspect not. Chinese works, but it also does not appear amenable to simple mechanical assistance.

There's the https://en.wikipedia.org/wiki/Ideographic_Description_Charac... that kind of does that. The problem is that there's character divergence (see all the brouhaha about Unicode Han unification), so there needs to be something else to select variants too.

As a reference, I don't believe any of the pre-Unicode CJK&c encodings attempted that.
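
To make the idea concrete, here is a minimal Python sketch of how an Ideographic Description Sequence could be parsed into a component tree. It assumes only the original twelve operators U+2FF0 to U+2FFB, with U+2FF2 and U+2FF3 taking three operands and the rest two. The name `parse_ids` is made up for illustration, and the sketch does nothing about the variant-selection problem mentioned above.

```python
import unicodedata

# Ideographic Description Characters U+2FF0..U+2FFB (the original twelve).
# U+2FF2 and U+2FF3 take three operands; the others take two.
TERNARY = {"\u2ff2", "\u2ff3"}
IDC = {chr(cp) for cp in range(0x2FF0, 0x2FFC)}


def parse_ids(seq):
    """Parse an IDS string into a nested (operator, [operands]) tree.

    parse_ids is a hypothetical helper, not part of any library.
    """
    def parse(pos):
        if pos >= len(seq):
            raise ValueError("sequence ended while an operand was expected")
        ch = seq[pos]
        if ch not in IDC:
            return ch, pos + 1  # a plain component character
        arity = 3 if ch in TERNARY else 2
        operands = []
        pos += 1
        for _ in range(arity):
            node, pos = parse(pos)
            operands.append(node)
        return (ch, operands), pos

    tree, end = parse(0)
    if end != len(seq):
        raise ValueError("trailing characters after a complete sequence")
    return tree


# 好 can be described as 女 and 子 placed side by side.
print(unicodedata.name("\u2ff0"))
print(parse_ids("\u2ff0女子"))
```

The operators are prefix, so nesting needs no brackets: a sequence such as ⿱艹⿰日月 parses as "艹 above (日 beside 月)".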

Another wrinkle with Arabic is linguistic conservatism. Due to Islamism and the idea that Arabic is the language of God (the Quran was written in Arabic by the supposedly illiterate prophet), Arabic has lagged behind other languages in terms of innovation.

Hebrew is a closely related Semitic language that simply adopted a block and a cursive form. It has also been greatly simplified and is friendlier towards loanwords, which has made it far easier to learn.

Muslims don’t believe Arabic is the language of God. They believe that the Quran was revealed in Arabic (true). Thinking the creator of the heavens and earth only speaks one language is absurd. It also kind of implies that Muslims believe in a superiority of Arabs which is also not true.

Weird to say Arabic hasn’t innovated or evolved considering the wild variety of dialects spoken in the modern world.

Conflating the language with the script is also bizarre. In terms of adapting Arabic to technology, look into romanized Arabic which was used before Unicode was common.

I didn't write "God only speaks Arabic" in Islam. That's your interpretation of my post. All I meant was that Arabic has special status in Islam.

> Weird to say Arabic hasn’t innovated or evolved considering the wild variety of dialects spoken in the modern world.

I didn't say Arabic has not innovated or evolved; only that it "has lagged behind other languages in terms of innovation". My belief is that that is due to linguistic conservatism, and linked to Islamism (or, at minimum, the centrality of Islam in Arab culture). Also related to this is the existence of Fusha, its place in Arab culture, and its branding as "modern standard Arabic".

I didn't conflate anything. While a script and a language are not the same, it's not a coincidence that Arabic is often written today in a script that is very close to Quranic script. And -- to really kick the hornet's nest -- it's also not a coincidence that there have been so few outstanding Arab writers (in Arabic) in the past 100 years. One novelist and a couple poets.

> And -- to really kick the hornet's nest -- it's also not a coincidence that there have been so few outstanding Arab writers (in Arabic) in the past 100 years. One novelist and a couple poets.

Now, reading that point one might ask the question if writing has been properly funded, or if the priority of cultural funding in the Arab world has been lower than, say, the funding of architecture and other forms of art. And on top of that, I'd also have a serious look at the market size, especially when compared with English-language writing.

With all due respect, your comment comes off as a bit ignorant and rude. A few points:

Firstly, the Qur'an wasn't written by the Prophet, he would dictate it and it would be written by his scribes.

Secondly, it's hard to argue that Islam has had a negative effect on Arabic or caused it to lag behind. In fact, it's easy to argue for the opposite. It's a historical fact that the Arabic language developed and proliferated rapidly due to the rise and spread of Islam. This is when its script and grammar were standardized, and when more and more works started being composed. And shortly thereafter the Islamic Golden Age began.

I don't have any issue with Hebrew, and maybe it is easier to learn. But this is because it was a dead language which was revived, resulting in a simplified language. Almost every other major language on Earth will have the same amount of "innovation" as Arabic. In fact, Arabic has many colloquial dialects which are used in day to day conversations, and these do consist of a simplified version with many loanwords. So I really don't know what you mean by a lack of innovation.

I don’t think anybody said that Arabic has suffered a complete standstill, and it has doubtlessly evolved significantly.

But if you compare it with basically any other major language, it’s clearly much, much more conservative. If you are a native English speaker, understanding English from 1,000 years ago is like learning a completely different language. If you are a native speaker of Italian, you cannot understand a text in Latin without significant training. This is true for all European languages other than Icelandic.

Chinese is pretty similar, even though the written language is slightly more stable.

So in comparison, Arabic is incredibly conservative.

There is no one "Arabic". Yes, formal modern Arabic (fusha) is based on (but not identical to) the classical Arabic of the Quran, but nobody speaks this in real life. The actual Arabics are the 20-odd spoken languages, many of which are effectively different languages at this point:

https://en.wikipedia.org/wiki/Varieties_of_Arabic

A rough equivalent in both time and space is how the Vatican continues to use Latin, but the rest of the Roman Empire has splintered into Italian, French, Spanish, Romanian, etc.

> but nobody speaks this in real life

They speak it on TV and it's written in newspapers. They learn it in schools. Educated Arabs code switch into Fusha all the time. Islamist leaders (e.g. Nasrallah) speak Fusha in their broadcast speeches.

It's also pretty hard for foreigners to learn an ammiyya (outside of immersion). "Studying Arabic" almost always means Fusha.

I agree with you that "the actual Arabics are the 20-odd spoken languages". In a healthier culture, Fusha wouldn't exist or would have the same cultural place as Latin in the Western world.

Also worth noting that unlike Arabic and Islam, the Jewish tradition is that Hebrew is in fact the language of God and was the pre-Babel language.
Twitter trolls are on HN now?
This is such a bad take on the issue.
There is no unjoined form of Arabic. The Arabic script became Arabic when Nabataean script started developing joined letter forms. Unjoined Nabataean is as foreign to Arabic as Phoenician is to Greek.
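
You can see the same thing in how the script is encoded: Unicode gives each Arabic letter a single code point and leaves the joining to the renderer. A small Python sketch, using the legacy Arabic Presentation Forms-B code points as a stand-in for the shapes a shaping engine would pick (that stand-in is an assumption for illustration; real layout selects glyphs from the font, not these code points):

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the one code point used in normal text

# Legacy compatibility code points for the four contextual shapes of BEH.
# Assumption for illustration: these stand in for what a shaping engine
# selects from the font; modern text does not store them.
FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]

for form in FORMS:
    # NFKC normalization folds each presentation form back to the base letter.
    base = unicodedata.normalize("NFKC", form)
    print(f"U+{ord(form):04X} {unicodedata.name(form)} -> U+{ord(base):04X}")
```

All four shapes fold back to the same letter, so the isolated shape is just one contextual variant among the others rather than a separate block alphabet.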
Looking at dictionaries and printing presses from China before the invention of computers reveals that they probably would have done something similar to ASCII, just with more bits to encompass all the characters.
CJK languages can include vertical and RtL stretches too, to complicate matters. Here's some lyrics I made as a test:

https://codepen.io/kingcharlesone/pen/GgRXLoM

Japanese magazines usually mix three different script types on a majority of the pages like this:

https://imgur.com/a/x61XbIV

(In another quirk some Japanese mags open right-bound, others open left-bound)

Chinese apparently was originally always written vertically top-to-bottom. (And then columns would be right-to-left.) Modern Chinese just rotates everything except the characters themselves 90 degrees to the Latin order.

I also read that a few Chinese texts only make sense in vertical order: one had a pun where the characters read one way as separated characters, but as stacked was also a single character pun for something like a "crumbly cookie".