by adrian_b 23 days ago
The calculator chip 4004 never had any relevance for computers, and it did not have "bytes".

Intel 8008 did not have anything original in its architecture. It was just a monolithic PMOS re-implementation of the CPU of the embedded computer in the Datapoint 2200 serial terminal, a CPU that had originally been built with TTL integrated circuits. All the decisions about sizes, e.g. 8-bit data and 14-bit addresses, had been made by Datapoint in 1970, not by Intel. Datapoint had chosen 8-bit bytes in order to support the recently standardized ASCII 7-bit character set (and the 8-bit IBM EBCDIC character set, if necessary), i.e. the character sets used by the computers to which such a serial terminal could be connected.

At the time when the first microprocessors were designed, during the first half of the seventies, the most important architectural influence on new computer designs was the DEC PDP-11 family of minicomputers.

DEC PDP-11 used 8-bit bytes, which was a significant change from the previous DEC computers, most of which used word sizes that were a multiple of 6, like 12-bit, 18-bit or 36-bit.

DEC PDP-11 had transitioned to 8-bit bytes (in 1970) mainly due to the influence of IBM System/360. The standardization of the 7-bit ASCII code for characters, which could no longer fit inside 6-bit bytes, contributed to this decision, but the standardization of ASCII was itself possible only because many computer vendors had already transitioned or decided to transition to 8-bit bytes, so they could store the new ASCII characters in their bytes.

After 1967, when ASCII was standardized in a form close to the present one and was then adopted into international standards by ISO and CCITT, all new computer instruction-set architectures were designed with 8-bit bytes.


In this document [1] dated 1967-68, on page 8, IBM mention 8-bit character sets only: their EBCDIC and the "8-bit extension of the 7-bit code" proposed by ISO.

Because eight rather than six bits are used to represent a character, up to 256 possible characters could be represented in the Extended Binary Coded Decimal Interchange Code (EBCDIC) shown in Figure 7. Except for certain teleprocessing equipment, the code that makes use of characters is either EBCDIC or an eight-bit extension of a seven-bit code proposed by the International Standards Organization.

[1] http://bitsavers.informatik.uni-stuttgart.de/pdf/ibm/360/GC2...

IBM System/360 normally used the IBM EBCDIC 8-bit character set, which was designed simultaneously with the decision to use 8-bit bytes, before the launch of System/360. All the IBM computers older than System/360 had used 6-bit character sets.

The 7-bit ASCII code, with a few small differences from the current version, had been standardized in 1967 in the USA as USAS X3.4-1967 and internationally as ISO 646.

No computer has ever been designed with 7-bit bytes, so all computers with 8-bit bytes, starting with IBM System/360, store the ASCII characters using an "8-bit extension of the 7-bit code".

Your document refers to this. The fact that ASCII has only 7 bits mattered only when the characters were sent over communication lines, when there was no need to transmit more than 7 data bits.
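To make the storage vs. transmission distinction concrete, here is a minimal Python sketch. The function names `store_in_byte` and `data_bits_for_line` are made up for illustration, and the sketch ignores parity and framing, which real serial links of the era would add. It only shows that a 7-bit code stored in an 8-bit byte leaves the top bit zero, so 7 data bits carry everything on the line.

```python
def store_in_byte(ch: str) -> int:
    """Store one 7-bit ASCII character in an 8-bit byte (high bit left at 0)."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    return code


def data_bits_for_line(byte: int) -> str:
    """Return the 7 data bits that would need to go out on a serial line."""
    return format(byte & 0x7F, "07b")


text = "Az~"
stored = bytes(store_in_byte(c) for c in text)

# In memory every character occupies a full 8-bit byte with bit 7 clear.
for ch, b in zip(text, stored):
    print(ch, format(b, "08b"), data_bits_for_line(b))
```

Dropping the eighth bit loses nothing here, which is why the 7-bit width mattered on the wire but not in memory.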

While in IBM systems EBCDIC was the primary character set and ASCII was used only for interchange with computing equipment made by other companies, in all the computers made by others the usage was reversed: ASCII extended to 8 bits was the primary character set, but EBCDIC was also supported for data interchange with IBM computers.

In general, I agree with your conclusions. However, I found it interesting that this document made no mention of ASCII or other 7-bit character sets. Especially since the first version of the standard (X3.4-1963, no lowercase) was already several years old at that point.

https://www.sensitiveresearch.com/Archive/CharCodeHist/X3.4-...

That document referred very clearly to ASCII (the 1967 variant), by "seven-bit code proposed by the International Standards Organization". There was no other character set that could be referred to by these words. Probably at the time when the document was written it was not yet known that the number of the standard would be ISO 646.

The 1963 ASCII version was very different from the 1967 and later versions; it did not even have lowercase letters.

ASCII-1963 must be considered as a different character set from the later ASCII versions. It had a very limited adoption as a method for storing text in computers, because in the beginning only IBM System/360 had 8-bit bytes, while most other computers still had 6-bit bytes, and System/360 used a much more complete character set, EBCDIC, for storing text.

Thus ASCII-1963 was used in the beginning only for communication on serial lines, e.g. in terminals like Teletype Model 33, where it had the advantage of having more control characters than 6-bit character sets, even if it had only about the same set of printable characters. For storage, an ASCII-1963 string would have been converted to some 6-bit character set, because there was no need to store control characters and the number of printable characters was less than 64.

In most contexts references to "ASCII" should be understood as referring only to the 1967 and later versions, which were complete 7-bit character sets and which were adopted as both US and international standards.

But does it really matter what the details were? The most important thing is that the standard published in 1963 was 7-bit. I mentioned that the 1963 version did not include lowercase letters. The (unpublished) 1965 version, mentioned on the first scan page, did.

As for the name, the acronym ASCII comes from the 1963 version (American Standard Code for Information Interchange). Later in 1966, ASA became USASI, and the official name was changed to USASCII, with ASCII as an acceptable alternative abbreviation. Later still, in 1969, USASI changed its name once again to ANSI, and an attempt was made to rename it ANSCII, but this did not catch on, and ASCII returned as the official name.

As for this 8-bit extension (not seven-bit code proposed by the ISO), perhaps they were referring to ECMA-35, the first version of which was published in December 1971? Or perhaps other proposals mentioned in the brief history. Of course, it seems that ASCII - regardless of the version - served as the basis for these extensions.

https://ecma-international.org/wp-content/uploads/ECMA-35_1s...

My point was that before the 1967 version, ASCII had no influence whatsoever on the design of computer architectures, because it was useful only for transmission on serial communication lines and it remained compatible with the use of 6-bit character sets for storing character strings in the computer memory.

Only after the number of printable characters had been greatly increased in 1967, making conversion to 6-bit character sets impossible, and the new version had been adopted not only in the USA but also internationally, by both ISO and CCITT, did it become a necessity to have a byte size equal to or greater than 7 bits, in order to store ASCII strings efficiently in computers.
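The arithmetic behind this can be checked in a few lines of Python. This is only a sketch: it counts the printable positions of present-day ASCII (0x20 through 0x7E) as a stand-in for the 1967 repertoire, and `bits_needed` is a helper name invented here.

```python
import string


def bits_needed(n_codes: int) -> int:
    """Smallest number of bits that can give n_codes distinct values."""
    return max(1, (n_codes - 1).bit_length())


# Printable positions in present-day ASCII: 0x20 (space) through 0x7E (~).
printable = [chr(c) for c in range(0x20, 0x7F)]

# Upper case, lower case and digits alone nearly fill a 6-bit code.
letters_and_digits = string.ascii_uppercase + string.ascii_lowercase + string.digits

print(len(printable), bits_needed(len(printable)))
print(len(letters_and_digits), 2 ** 6 - len(letters_and_digits))
```

With lowercase included, letters and digits alone take 62 of the 64 codes a 6-bit set offers, leaving almost no room for space and punctuation, while the full printable set needs 7 bits.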

From that moment on, the 8-bit byte size became a hard requirement for any new computer ISA, e.g. for DEC PDP-11, which was designed mostly during 1969 and launched in 1970.