by eesmith 941 days ago
The comments point out conversion issues with EBCDIC. You can't rely on ASCII characters like @, which are not represented identically in all versions of EBCDIC.

https://datatracker.ietf.org/doc/html/rfc2045#section-6.8 says:

   This subset has the important property that it is represented
   identically in all versions of ISO 646, including US-ASCII, and all
   characters in the subset are also represented identically in all
   versions of EBCDIC. Other popular encodings, such as the encoding
   used by the uuencode utility, Macintosh binhex 4.0 [RFC-1741], and
   the base85 encoding specified as part of Level 2 PostScript, do not
   share these properties, and thus do not fulfill the portability
   requirements a binary transport encoding for mail must meet.
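
The portability property in that passage can be spot-checked with a minimal sketch. It assumes the three EBCDIC code pages that ship with Python (cp037, cp500, cp1140) are a fair sample, which is far short of "all versions of EBCDIC"; the helper name `varying_chars` is made up for illustration.

```python
import string

# Base64 alphabet from RFC 2045 section 6.8, plus the '=' pad character.
BASE64_CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/="

# Assumption: a small sample of EBCDIC code pages bundled with Python,
# not every EBCDIC variant that exists.
EBCDIC_CODECS = ("cp037", "cp500", "cp1140")

# Classic uuencode draws on the 64 characters from space (0x20) to '_' (0x5F).
UUENCODE_CHARS = "".join(chr(i) for i in range(0x20, 0x60))


def varying_chars(chars, codecs=EBCDIC_CODECS):
    """Return the characters whose encoded byte differs between the codecs."""
    return [c for c in chars if len({c.encode(codec) for codec in codecs}) > 1]


print(varying_chars(BASE64_CHARS))    # characters that move around, if any
print(varying_chars(UUENCODE_CHARS))  # includes '!' (0x5A in cp037, 0x4F in cp500)
```

In this sample none of the base64 characters change position, while several of the uuencode characters do.
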
If you want to learn why ASCII is the way it is, try "The Evolution of Character Codes, 1874-1968" at https://archive.org/details/enf-ascii/mode/2up by Eric Fischer (an HN'er). My reading is that the contiguous A-Z was meant for better compatibility with 6-bit use.

I thought making ASCII upper-case <-> lower-case conversion a single bit operation was clever.
> I thought making ASCII upper-case <-> lower-case conversion a single bit operation was clever.

"Things Every Hacker Once Knew" (2017) has an entire section on ASCII and the clever bit-fiddling it allows:

* http://www.catb.org/~esr/faqs/things-every-hacker-once-knew/...

* Discussion from ~2 months ago: https://news.ycombinator.com/item?id=37701117
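
For reference, the case trick discussed above can be sketched in a few lines. Upper- and lower-case ASCII letters differ only in bit 0x20 ('A' is 0x41, 'a' is 0x61), so XOR with that bit flips the case. The name `toggle_case` is just an illustrative helper.

```python
CASE_BIT = 0x20  # the single bit separating 'A' (0x41) from 'a' (0x61)


def toggle_case(ch: str) -> str:
    """Flip the case of an ASCII letter; return anything else unchanged."""
    if ch.isascii() and ch.isalpha():
        return chr(ord(ch) ^ CASE_BIT)
    return ch


print("".join(toggle_case(c) for c in "Hacker News"))  # hACKER nEWS
```

The guard matters: applying the XOR blindly to punctuation or digits produces unrelated characters.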

In the context of a terminal, the Control key is also a bitwise operation.
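
A rough sketch of that, assuming the classic behaviour where holding Control clears the top two bits of the 7-bit code (real terminals have exceptions, such as Ctrl-? producing DEL). The function name `ctrl` is hypothetical.

```python
CTRL_MASK = 0x1F  # keep only the low five bits of the 7-bit code


def ctrl(ch: str) -> str:
    """Return the control character produced by Control plus the given key."""
    return chr(ord(ch) & CTRL_MASK)


assert ctrl("G") == "\a"     # BEL, 0x07
assert ctrl("I") == "\t"     # HT,  0x09
assert ctrl("M") == "\r"     # CR,  0x0D
assert ctrl("[") == "\x1b"   # ESC, 0x1B
assert ctrl("@") == "\x00"   # NUL, 0x00
assert ctrl("g") == ctrl("G")  # case does not matter once the top bits are gone
```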

Shifted numerals were nearly a bitwise operation as well, but we didn't end up using that keyboard layout.

Yep! There's even a term for it -- this is called a bit-paired keyboard.
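
As a sketch of what "bit-paired" means: on such a layout, Shift on a digit key flips bit 0x10, mapping the digits in ASCII column 3 to the punctuation in column 2. The helper name `bit_paired_shift` is made up for illustration.

```python
SHIFT_BIT = 0x10  # distance between '1' (0x31) and '!' (0x21)


def bit_paired_shift(digit: str) -> str:
    """Return the character a bit-paired keyboard pairs with a digit key."""
    if digit not in "123456789":
        raise ValueError("expected a digit from 1 to 9")
    return chr(ord(digit) ^ SHIFT_BIT)


print("".join(bit_paired_shift(d) for d in "123456789"))  # !"#$%&'()
```

Compare with a modern US layout, where Shift-2 gives @ and Shift-6 gives ^, neither of which is one bit away from its digit.
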
Yes, though in principle you could interleave AaBbCc and so on, which would also be a single-bit difference, and the naive collation would be more like what people expect.
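
To make the collation point concrete, here is a small sketch comparing a naive sort by ASCII code with a sort under a hypothetical interleaved layout (A=0, a=1, B=2, b=3, ...). The layout and the helper names are assumptions for illustration only, and the code handles ASCII letters only.

```python
import string


def interleaved_code(ch: str) -> int:
    """Code in a hypothetical AaBbCc... layout: A=0, a=1, B=2, b=3, ..."""
    index = string.ascii_lowercase.index(ch.lower())
    return 2 * index + (1 if ch.islower() else 0)


def interleaved_key(word: str) -> list:
    """Sort key: the word spelled out in the hypothetical interleaved codes."""
    return [interleaved_code(c) for c in word]


words = ["apple", "Zebra", "Banana", "cherry"]
print(sorted(words))                       # ASCII order: all capitals first
print(sorted(words, key=interleaved_key))  # interleaved order: closer to a dictionary

# The two cases of a letter still differ in exactly one bit (the lowest).
assert interleaved_code("A") ^ interleaved_code("a") == 1
```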

The design considerations at https://ia800606.us.archive.org/17/items/enf-ascii-1972-1975... show that 6-bit support was more important than naive collation support:

> A6.4 It is expected that devices having the capability of printing only 64 graphic symbols will continue to be important. It may be desirable to arrange these devices to print one symbol for the bit pattern of both upper and lower case of a given alphabetic letter. To facilitate this, there should be a single-bit difference between the upper and lower case representations of any given letter. Combined with the requirement that a given case of the alphabet be contiguous, this dictated the assignment of the alphabet, as shown in columns 4 through 7.
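
A minimal sketch of the folding A6.4 describes, assuming a 64-graphic device simply clears bit 0x20 for anything in columns 6 and 7 so that it prints the symbol from columns 4 and 5. How real devices treated the non-letters in those columns is not stated in the quote; `fold_to_64` is a hypothetical name.

```python
CASE_BIT = 0x20


def fold_to_64(code: int) -> int:
    """Map a 7-bit code in columns 2-7 onto the 64 graphics in columns 2-5."""
    if not 0x20 <= code <= 0x7F:
        raise ValueError("expected a code in columns 2 through 7")
    # Columns 6-7 (0x60-0x7F) collapse onto columns 4-5 (0x40-0x5F).
    return code & ~CASE_BIT if code >= 0x60 else code


def print_on_64_graphic_device(text: str) -> str:
    """Render text the way the assumed 64-graphic device would print it."""
    return "".join(chr(fold_to_64(ord(c))) for c in text)


print(print_on_64_graphic_device("Hello, World"))  # HELLO, WORLD
```

Because each case of the alphabet is contiguous and the two cases are one bit apart, the fold is a single mask rather than a lookup table.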

I just found and skimmed Bob Bemer's "A Story of ASCII", which includes personal recollections of the history. It seems that the 6-bit subset was firmed up first. From https://archive.org/details/ascii-bemer/page/n17/mode/2up?q=... :

> This is reflected in the set I proposed to X3 on 1961 September 18 (Table 3, column 3), and these three characters remained in the set from that time on. The lower case alphabet was also shown, but for some time this was resisted, lest the communications people find a need for more than the two columns then allocated for control functions.

but serious discussion of lower case wasn't taken up until later. From https://archive.org/details/ascii-bemer/page/n25/mode/2up?q=... :

> ISO/TC97/SC2 held its next meeting in 1963 October, at which time it was decided to add the lower case alphabet.

and at https://archive.org/details/ascii-bemer/page/n27/mode/2up?q=... :

> At the 1963 May meeting in Geneva, CCITT endorsed the principle of the 7-bit code for any new telegraph alphabet, and expressed general but preliminary agreement with the ISO work. It further requested the placement of the lower case alphabet in the unassigned area.

Bemer did not like interleaving lower- and upper-case. From https://archive.org/details/ascii-bemer/page/n5/mode/2up?q=l... :

> I had a great opportunity to start on the standards road when invited by Dr. Werner Buchholz to do the main design of the 120-character set [9,24] for the Stretch computer (the IBM 7030). I had help, but the mistakes are all mine (such as the interspersal of the upper and lower case alphabets). ...

> he didn't make the same mistake I made for STRETCH by interspersing both cases of the alphabet!