Hacker News
by stuartjmoore 4661 days ago
A larger space (em-quad) is not the same as two spaces.

We're talking about visuals here, so how isn't it?
An em space is nominally the width of a capital M (in practice, one em equals the font's point size). IIRC, a normal space in most proportional fonts is closer to the width of the letter n.

It has always been my understanding that historically, typographers tended to prefer em spaces between sentences (vs. after intra-sentence punctuation like 'Mr.' or a comma). And so, once typewriters with their fixed-width fonts came out¹, people used a double-space at the end of a sentence to approximate an em space.

The frustrating thing, typographically speaking, is that the HTML approach doesn't map to the manual typesetting process either: it has no semantic knowledge of "end of sentence" vs. "random intra-sentence punctuation", so it treats them all the same.
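To make that concrete, here is a rough sketch of the collapsing behaviour in Python. It only approximates what a browser does by default (runs of spaces, tabs and line breaks render as one space), and the function name `collapse_like_html` is made up for illustration:

```python
import re

EM_SPACE = "\u2003"  # Unicode EM SPACE

def collapse_like_html(text: str) -> str:
    """Approximate default HTML whitespace collapsing.

    Runs of ordinary spaces, tabs and line breaks become a single space.
    Other space characters, such as U+2003, are left alone.
    """
    return re.sub(r"[ \t\r\n\f]+", " ", text)

# Two typed spaces after the sentence: the distinction is lost,
# and "Mr." and "Jones." end up spaced identically.
typed = "Ask Mr. Jones.  He knows."
collapsed = collapse_like_html(typed)

# An explicit em space is not an ordinary space, so it survives.
with_em = "Ask Mr. Jones." + EM_SPACE + "He knows."
kept = collapse_like_html(with_em)
```

The point is only that the double space carries no meaning by the time it is rendered; if you want a wider sentence gap in HTML, something upstream has to decide where sentences end.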

(Note that this is also where we get em dashes and en dashes from. And just as an em space is approximated by two spaces, an em dash is transliterated to '--' in fixed-width fonts.)

¹ For all I know, the double-space trick was used with fixed-width letterpress before the advent of typewriters. The problem seems to have more to do with fixed-width than with typewriters.

The thing is that the period on a typewriter will usually be aligned towards the previous letter, so a "period space" sequence on a typewriter will have nearly the same amount of space as would be seen in a "period space space" sequence on a software word processor when not using a fixed-width font. In essence, this furthers the idea that the double space is an emulation of the output of a typewriter, rather than supporting the idea that it's influenced by typesetting (where in the past the space between words might have been 1/3 or 2/3 the width of the space after a period, if no other space was added throughout the sentence for alignment).

For the most part, the article justifies simply blaming publishers for becoming lazy. It seems to me, though, that the choice of spacing around sentences is largely determined by the market for which something is being published. Most publishers would have invested at some point in a decent lexer that can handle enough of the burden of finding the ends of sentences to make the process largely hands-off (and to let the appearance of the output change fairly easily if they want to print a special or mass-market edition later).
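I have no idea what any publisher actually runs, but a toy version of that kind of pass might look like the sketch below. The abbreviation list, the `space_sentences` name and the "next word is capitalised" heuristic are all my own assumptions, and a real tool would need to handle quotes, parentheses and far more abbreviations:

```python
# Hypothetical abbreviation list; a real lexer would need a much larger one.
ABBREVIATIONS = {"Mr.", "Mrs.", "Ms.", "Dr.", "St.", "vs.", "etc."}

EM_SPACE = "\u2003"  # Unicode EM SPACE

def space_sentences(text: str, sentence_gap: str = EM_SPACE) -> str:
    """Rejoin words with one space, but use sentence_gap after a sentence end.

    A word ends a sentence here if it ends in . ! or ?, is not a known
    abbreviation, and the following word starts with a capital letter.
    """
    words = text.split()
    out = []
    for i, word in enumerate(words):
        out.append(word)
        if i == len(words) - 1:
            break  # no gap after the final word
        nxt = words[i + 1]
        ends_sentence = (
            word[-1] in ".!?"
            and word not in ABBREVIATIONS
            and nxt[0].isupper()
        )
        out.append(sentence_gap if ends_sentence else " ")
    return "".join(out)

# Same input, two "editions": em-spaced and plain single-spaced.
source = "Mr. Smith left.  He was late, vs. early."
wide = space_sentences(source)
plain = space_sentences(source, sentence_gap=" ")
```

Because the gap is a parameter, switching the whole book between wide and single sentence spacing is a one-argument change, which is the "change the output later" part.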

It's conceptually a single space, but the difference between 'two spaces' and 'a space twice as wide as normal spaces' isn't very important. Since lines won't wrap mid-space-block, it's effectively an encoding issue.