Hacker News
by dalke 25 days ago
There are two books you should take a look at.

"Methods of Information Handling", Charles Bourne - https://archive.org/details/methodsofinforma0000unse/page/n5... (not available to read)

"Punched Cards: Their Applications to Science and Industry", https://archive.org/details/punchedcardsthei0000robe (can be checked out)

The first of these is a much better book.

They show a number of ways to encode values, such as a 5-hole triangle code, where the "O" in the following indicates a hole:

    O   O   O   O   O
      9   5   2   0
        8   4   1
          7   3
            6

To encode the value "8", notch out the two holes (indicated with "U") whose diagonals cross at 8 (indicated with "\" and "/"), like this:

    U   O   U   O   O
     \9   5/  2   0
       \8/  4   1
          7   3
            6

This lets you encode any of the 10 digits (0-9) with 5 holes and two needles -- remember, inserting the needles takes time. (And there were all sorts of devices made to minimize that time.)
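
Here is a rough Python sketch of that triangle code, in case it helps to see it spelled out. The hole numbering (0 for the leftmost hole through 4 for the rightmost) and the names `TRIANGLE_ROWS`, `notch_pattern`, and `drops` are my own, made up for illustration; only the digit layout comes from the diagram above.

```python
# Rows of the printed triangle, top row first, copied from the diagram.
TRIANGLE_ROWS = ["9520", "841", "73", "6"]
NUM_HOLES = 5

def build_code():
    """Map each digit to the pair of holes whose diagonals cross at it.

    Holes are numbered 0 (leftmost) to 4 (rightmost); that numbering is
    an assumption made for this sketch.
    """
    code = {}
    for depth, row in enumerate(TRIANGLE_ROWS, start=1):
        for left, digit in enumerate(row):
            # A digit `depth` rows down sits between hole `left`
            # and hole `left + depth`.
            code[int(digit)] = (left, left + depth)
    return code

TRIANGLE_CODE = build_code()

def notch_pattern(digit):
    """Render the five holes, with 'U' for a notch and 'O' for an intact hole."""
    notched = TRIANGLE_CODE[digit]
    return "   ".join("U" if i in notched else "O" for i in range(NUM_HOLES))

def drops(card_digit, needled_holes):
    """A card falls off the needles only if every needled hole is notched."""
    return set(needled_holes) <= set(TRIANGLE_CODE[card_digit])

print(notch_pattern(8))  # U   O   U   O   O
# Needling the two holes for 8 drops only the cards notched as 8.
print([d for d in range(10) if drops(d, TRIANGLE_CODE[8])])  # [8]
```

Five holes taken two at a time give exactly 10 distinct pairs, which is why the triangle has room for the digits 0-9 and no more.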

There were also extensions of these to handle names, used to search small (< ~10,000) literature collections, and more.

The most mathematically sophisticated is likely Zatocoding, a superimposed coding method related to Bloom filters.

The Bourne book goes into these variations in detail.
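
To give a feel for the superimposed coding idea, here is a minimal Python sketch. It is not Mooers's actual procedure: the field width, the number of notches per descriptor, and the use of a hash to pick notch positions are all assumptions chosen for illustration, and the names `term_pattern`, `card_pattern`, and `needle_search` are hypothetical.

```python
import hashlib

FIELD_WIDTH = 40     # notch positions in the coding field (illustrative value)
MARKS_PER_TERM = 4   # notches assigned to each descriptor (illustrative value)

def term_pattern(term):
    """Derive a fixed, pseudo-random set of notch positions for a descriptor.

    A hash stands in for a pre-assigned random pattern; that substitution
    is an assumption of this sketch.
    """
    positions = []
    counter = 0
    while len(positions) < MARKS_PER_TERM:
        digest = hashlib.sha256(f"{term}:{counter}".encode()).digest()
        pos = digest[0] % FIELD_WIDTH
        if pos not in positions:
            positions.append(pos)
        counter += 1
    return frozenset(positions)

def card_pattern(terms):
    """Superimpose (union) the patterns of every descriptor on one card."""
    notches = set()
    for term in terms:
        notches |= term_pattern(term)
    return frozenset(notches)

def needle_search(cards, term):
    """Return the cards that drop when the term's positions are needled.

    Every card carrying the term drops; a card without it may also drop
    if other descriptors happen to cover the same positions.
    """
    wanted = term_pattern(term)
    return [name for name, notches in cards.items() if wanted <= notches]

cards = {
    "card-1": card_pattern(["benzene", "ketone"]),
    "card-2": card_pattern(["alcohol"]),
}
print(needle_search(cards, "benzene"))
```

The connection to Bloom filters is the same trade-off: a search never misses a card that has the descriptor, but it can return extra cards, and that rate grows as more descriptors are superimposed on one field.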

Here's a video of someone scanning in edge-notched cards for bird identification. https://youtu.be/MBwP3YOxw3I

About 10 years ago I really got into the topic and made some cards of my own, using a cutting machine to make each card, precut, from an SVG.

1 comments

This way of encoding numbers is great.

I knew the 7-4-2-1-S way to get 1-9 from 5 holes, and it is more intuitive to use, but once you write out the pattern for yours it is also easy to see.

You should take a look at the Bourne book, if you can get hold of a copy. It's an amazing book which draws on a generation or more of experience in working with mechanical systems, and the first generation of experience in using digital methods. Here's what it says about the code you mentioned, compared to the triangle code.

"From a space standpoint, one of the most efficient and most commonly used schemes is the 7-4-2-1 code, which only requires four holes to represent any number between 0 and 14 (Fig. 5-6f). Normally the four holes are only used to represent the digits 0 to 9, with zero represented by an unnotched code field. This introduces some ambiguity since a field with no punches in it may represent either missing data or a zero digit. The 1 hole will be punched every time that the numbers 1, 3, 5, or 8 are punched. If there was a need to search the file to select the cards which were punched with a 1, then a single needle pass in the 1 hole would also drop out all the cards which were punched with a 3, 5, or 8. Similarly, a single needle pass in the 4 hole would also select the cards notched with 5’s and 6’s. For this reason, this particular code does not lend itself to rapid selective searches. However, it is extremely useful for applications that require sorting cards into sequence."

"Several other coding schemes that also permit relatively rapid serial sorting are available. These schemes, at the cost of using more code positions than the 7-4-2-1 code, incorporate a triangular display on the card in order to simplify the punching and recognition by the user (Fig. 5-6k, l). Two holes are punched for every digit. The positions to be punched are the ones whose guidelines intersect at the desired digit on the printed display. Serial sorting is performed in the same manner as for the 7-4-2-1 code except that one more hole must be needled. One efficient triangular display scheme uses four code positions with a double-row card (Fig. 5-6m)."

Here's a copy of the section on encoding methods - http://dalkescientific.com/opto_Bourne.pdf .
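
The extra cards that drop out with the 7-4-2-1 code, as described in the first quoted paragraph, are easy to check with a few lines of Python. This is only a sketch: I'm assuming each digit is notched by taking the largest weights first, which reproduces the cases the quote lists, and the names `holes_for` and `dropped_by_single_needle` are made up for illustration.

```python
WEIGHTS = (7, 4, 2, 1)  # the hole values of the 7-4-2-1 code

def holes_for(digit):
    """Pick the holes to notch for a digit, largest weight first.

    The greedy rule is an assumption of this sketch; it gives 7 as the
    single 7 hole rather than 4+2+1, and zero as an unnotched field.
    """
    remaining = digit
    holes = []
    for weight in WEIGHTS:
        if weight <= remaining:
            holes.append(weight)
            remaining -= weight
    return tuple(holes)

def dropped_by_single_needle(hole):
    """Digits whose cards fall when one needle is passed through `hole`."""
    return [d for d in range(10) if hole in holes_for(d)]

print(dropped_by_single_needle(1))  # [1, 3, 5, 8]
print(dropped_by_single_needle(4))  # [4, 5, 6]
```

So a single needle in the 1 hole cannot isolate the 1s, which matches what the book says about this code being poor for selective searches.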

I got really into edge-notched cards about 10 years ago. I made my own cards for handling chemical data. You can see them at 43:41 of my talk at https://youtu.be/y6dUkCxlrd8 . I also visited the Calvin Mooers archive at UMN because of his work in Zatocoding, Chemical Zatocoding, connection tables, and substructure search (Zatopleg). He mostly published as white papers from his own company, which makes it hard to find copies anywhere else.