by cryptonector 1306 days ago
A modern solution is to abandon old, no-longer-relevant typographic language rules, or to make typographic language rules context-specific.

But I agree that we need to make several alternative space characters easy to type:

  - non-breaking space (for this French rule)
  - wide space (for disambiguating sentence-
    ending periods from non-sentence-ending
    periods)
  - zero-width non-breaking space (for
    preventing word-splitting?)
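For reference, here is a minimal Python sketch that prints the Unicode code points I'd associate with the items in that list. Mapping "wide space" to EM SPACE is my assumption, since no specific character is named above. The `SPACES` table and the `describe` helper are illustrative names, not anything standard.

```python
import unicodedata

# Candidate code points for the kinds of space listed above.
# "wide space" -> EM SPACE is an assumption; the list names no character.
SPACES = {
    "non-breaking space": "\u00a0",
    "narrow non-breaking space": "\u202f",
    "wide space (assumed: em space)": "\u2003",
    "zero-width non-breaking space": "\ufeff",
    "word joiner": "\u2060",
}

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for label, ch in SPACES.items():
    print(f"{label:32} {describe(ch)}")
```

Note that Unicode deprecates U+FEFF for the "glue these together" role in favor of U+2060 WORD JOINER, which is why both are listed.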

What makes the French rules old, no-longer-relevant, invalid? Why should we change language to appease lazy software developers?

Not "rules", just this rule. It has to do with typographic considerations that apply to old typesetting technologies that are no longer in use.

> Why should we change language to appease lazy software developers?

It's been done.

For example, in Spanish it is no longer the rule that "ch" and "ll" sort as if they were distinct letters (this change was made in 2010) precisely because that was such a difficult rule to implement. And that was a 256-year-old rule, per Wikipedia:

  The digraphs "ch" and "ll" were
  considered single letters of the
  alphabet from 1754 to 2010 (and
  sorted separately from "c" and "l"
  from 1803 to 1994).
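To make the implementation cost concrete, here is a rough Python sketch of a sort key that treats "ch" and "ll" as single letters. It is only an illustration: `ALPHABET`, `RANK`, and `traditional_key` are names I made up, accented vowels are not handled, and real collation libraries do this with tailored rules rather than a hand-rolled tokenizer.

```python
# Traditional Spanish alphabet with "ch" and "ll" as separate letters.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
            "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
            "v", "w", "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}

def traditional_key(word):
    """Split a word into letters, treating 'ch' and 'll' as one letter each."""
    w = word.lower()
    key = []
    i = 0
    while i < len(w):
        pair = w[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            # Characters outside the table (e.g. accented vowels) sort last.
            key.append(RANK.get(w[i], len(ALPHABET)))
            i += 1
    return key

words = ["chico", "cuna", "llave", "luna", "dedo", "mano"]
print(sorted(words))                       # plain code point order
print(sorted(words, key=traditional_key))  # "ch" after "c", "ll" after "l"
```

The point is that comparison can no longer go character by character: the key function has to tokenize first, and every piece of software that sorts has to agree on that.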

For another example, in Spanish capital letters were once required to not carry accents, but now they are allowed to carry them. This was due to the use of overstriking on typewriters working to accent lower-case letters but not upper-case letters (the apostrophe would collide with the glyphs for upper-case vowels). But the technology to resolve this has existed in the Spanish-speaking world for a long time now, so the rule was finally dropped. (Not accenting upper-case letters can lead to ambiguities that are annoying.)

It's not just precedent. It's that the original reason for some typographic (not even orthographic) rule is simply not relevant in 2022.

And it's not unreasonable for either French people, non-French French speakers, or just non-French French-non-users to propose the ditching of hard-to-implement French rules. Now, this particular rule is decidedly not difficult to implement, but it is an annoying rule to apply as a user -- I should know, since I speak and write French (though I am not French).

Also, it doesn't matter what the French Academy says, or what the Spanish Royal Academy says, or what Webster's dictionary says, or whatever. Language evolves, even to their consternation. Moreover, developers don't have to care that much -- I18N/G11N is fun enough, and employers have to care for legal reasons, but rules like the Spanish ch/ll rule can be much too hard even for non-lazy developers, and the Royal Spanish Academy can and did have to change, and it was for the better.

In contrast, the Hungarian "cs", "dz", "dzs", "gy", "ny", "sz", "ty", and "zs" all remain distinct, as do their accented vowels. Polish is hybrid, considering digraphs as being 'composed' of single letters (i.e. 'sz' = 's+z'), rather than being distinct, but the accented characters are considered distinct from their unaccented cousins.

This problem has primarily come about because the Catholic church enforced the Latin alphabet on languages where a different alphabet might have been more appropriate. Spanish, although more closely related to Latin, still has a few sounds which there's not a good Latin character for. There's no (particular) reason (as far as I'm aware) that 'ch' and 'll' became digraphs, while ñ acquired an accent.

Why should traditions change just because it's a bit more difficult to do things the 'old way'? why do we still bother with capital letters at the beginning of sentences? or speling things with two leters when one wil do, or riting silent leters wen you cant tell the difrens? & i dont think we need apostrofees n e more.

Indeed, there's no reason for 'ch' and 'll' to have been distinct digraphs in Spanish and ñ not to have been 'gn' as in other Romance languages. I'm not familiar with the whys of that.

I didn't know that Hungarian had a similar issue.

> Why should traditions change just because it's a bit more difficult to do things the 'old way'? why do we still bother with capital letters at the beginning of sentences? or speling things with two leters when one wil do, or riting silent leters wen you cant tell the difrens? & i dont think we need apostrofees n e more.

I distinguish typographic and orthographic rules. The non-breaking, thin space before punctuation rule is typographic and outdated (i.e., motivated by outdated typographic technology).
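As a sketch of how software could apply that rule so the user doesn't have to, here is a small Python function that puts a narrow no-break space before `;`, `:`, `!` and `?`. Assumptions: it uses U+202F for all four marks, it only touches a mark that follows a word character and precedes whitespace or end of text (so URLs and times like 12:30 are left alone), and `french_punctuation_spacing` is a name I invented.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

# A mark preceded by a word character (plus optional spaces) and followed
# by whitespace or end of text. This is a heuristic, not a full rule set.
_PATTERN = re.compile(r"(?<=\w)[ \u00a0\u202f]*([;:!?])(?=\s|$)")

def french_punctuation_spacing(text):
    """Normalize spacing before ; : ! ? to a single narrow no-break space."""
    return _PATTERN.sub(NNBSP + r"\1", text)

print(repr(french_punctuation_spacing("Bonjour ! Comment allez-vous?")))
```

It deliberately ignores guillemets and stacked marks like "?!", which a real implementation would need to cover.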

I do want some orthographic rules reformed too, but I'm more interested in the ones that are just hard. In particular I'm interested in collation reform because we do often have to collate multi-language text items but with one collation (this is especially true in databases), so having collations for Latin-script-using languages be similar is rather useful. This is also true given that I'm not going to be switching locales when I switch languages -- I speak, read, and write multiple languages, but I never ever change locales.
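To illustrate what "one collation for everything" can look like, here is a crude Python sketch of a locale-independent sort key: decompose with NFD, drop combining marks, casefold. This is my simplification, not a real collation (it is not the Unicode Collation Algorithm, and it folds ñ into n, which Spanish collation would not do). `shared_key` is a hypothetical name.

```python
import unicodedata

def shared_key(s):
    """Accent- and case-insensitive key, with the original as tie-breaker."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), s)

words = ["zebra", "Éclair", "eclair", "ñandú", "nube"]
print(sorted(words))                  # code point order scatters accented words
print(sorted(words, key=shared_key))  # accented words sort next to base letters
```

The trade-off is exactly the one argued above: every language loses some of its own ordering rules, but mixed-language data sorts the same way everywhere without switching locales.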