by discardorama 4728 days ago
(Using a throwaway account)

I worked on the OCR systems. Fun fact: at one time, the USPS was the world's biggest user of Linux in a production setting. Their OCR boxes ran on Linux (until they were replaced with SGI O2 boxes at a massive cost... but I digress).

Here's the path the mail takes: it is picked up by carriers from the mailboxes. Then dump trucks bring it to the P&DCs (Processing and Distribution Centers). There are about 1,000 P&DCs in the country, I think. There, mail is dumped onto a massive conveyor belt, where the first machine (AFCS, or Automated Facer Canceller System) makes sure that the mail is facing the right way and is upright. Various heuristics are used for this. Here the mail is stacked nicely into flat boxes, vertically.

Postal workers then feed these boxes to the MLOCR (Multi-Line OCR) machines. These machines scan pieces at the rate of 13/second. After being scanned, the letter goes on a long loop before coming back to the beginning: this loop, about 3 seconds long (not sure about this), is the latency: the reading machine has this much time to decode the address. Also at this time, a fluorescent barcode is sprayed on the back of the piece, giving it a unique ID. If the OCR machine can read the address, the piece is sent to a bin indexed by the first 2 digits (or so) of the ZIP code (assuming it's not local).
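A quick back-of-the-envelope sketch of what those two numbers imply: at 13 pieces/second and a loop of roughly 3 seconds (a figure given above as uncertain), a few dozen pieces are in flight while the reader is still decoding. The function name is made up for illustration.

```python
def pieces_in_flight(rate_per_second, loop_seconds):
    """Pieces travelling the loop while their addresses are being decoded.

    Simple throughput x latency estimate; both inputs are the rough
    figures quoted in the comment, not official specifications.
    """
    return rate_per_second * loop_seconds


# 13 pieces/second and a ~3 second loop, as quoted above.
print(pieces_in_flight(13, 3))  # 39
```

So the reader has to keep roughly 39 decodes pending at any moment, assuming the 3-second figure is about right.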

If the OCR can't read the mail, it is sent to a separate pile. Then a program called RCR (Remote Computer Reader) kicks in: a person sitting in some remote area gets the image, enters enough information to decode the address, and the results are collected (tagged by the ID of the fluorescent barcode). After a few hours, this separate pile is run through the sorting machine again: this time, the fluorescent barcode ID is used to match the results from the human, and a real barcode is sprayed on the front and the piece is sorted as before.
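The two-pass flow described above can be sketched roughly as follows. This is only an illustration of the idea (sort what the OCR can read, park the rest, then re-run the parked pieces using the ID-tagged human results); the function names, the data shapes, and the use of a ZIP string as the "address" are all assumptions, not how the real system is built.

```python
from collections import defaultdict


def bin_for(zip_code):
    # Bin key: the first two digits of the ZIP code ("or so", per the description).
    return zip_code[:2]


def first_pass(pieces, ocr):
    """Sort readable pieces; set aside the ones the OCR cannot read.

    pieces: dict mapping the fluorescent-barcode ID to a scanned image.
    ocr:    hypothetical callable returning a ZIP string, or None on failure.
    """
    bins = defaultdict(list)
    rejects = []
    for piece_id, image in pieces.items():
        zip_code = ocr(image)
        if zip_code is None:
            rejects.append(piece_id)  # goes to the separate pile
        else:
            bins[bin_for(zip_code)].append(piece_id)
    return bins, rejects


def second_pass(rejects, human_results, bins):
    """Re-run the separate pile, matching human-keyed results by ID."""
    unresolved = []
    for piece_id in rejects:
        zip_code = human_results.get(piece_id)
        if zip_code is None:
            unresolved.append(piece_id)  # no human result for this ID yet
        else:
            bins[bin_for(zip_code)].append(piece_id)
    return unresolved
```

The point of the ID sprayed on the back is visible in `second_pass`: the piece itself carries no decoded address, so the ID is the only key that ties it to what the remote operator typed.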

Now, there are variations in the above, but this is the gist of it.

Fun facts: the USPS aims to handle a piece at most 7 times. And when a piece gets jammed in the machine and is torn, it gets put in a "body bag" with an apologetic note.

1 comments

Great info, thanks.

How reliable is the mail delivery? Do you know how much mail is lost? One percent, more, less? (I believe one kind of failure is called UAA - undeliverable as addressed.)

I'd love to learn more, but don't know where to start.

Some of us election integrity activists are deeply concerned with the transition to vote by mail (all postal ballots, no more poll sites). One practical complaint is our assumption that 1% of all mail is lost. In a big county like mine, that's 12,000 ballots.

My FOIA requests were rebuffed. Apparently the data gathering is done by third parties, so is considered proprietary. (A nice dodge, illustrating how privatization reduces government transparency and accountability.)

The best information I found came from court cases, where the USPS's customers (e.g. bulk mailers) dispute the UAA and don't want to pay extra.

In general, I think mail delivery is very reliable. But given the volume, there will be outliers. Even if we assumed 99.9999% reliability (a hypothetical number), given that they sort 300MM pieces per day, 300 pieces per day will be affected.
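To make that arithmetic explicit, here is a small sketch. The loss rates are the hypothetical ones from this thread (99.9999% reliability, and the assumed 1% loss), and the 1.2 million ballot count is only what the "12,000 ballots at 1%" figure implies, not a measured number.

```python
def expected_affected(pieces, loss_rate):
    """Expected number of affected pieces for a given loss rate (0..1)."""
    return pieces * loss_rate


# 300MM pieces/day at a hypothetical 99.9999% reliability (loss rate 1e-6).
print(round(expected_affected(300_000_000, 1e-6)))  # 300

# The assumed 1% loss applied to the ~1.2 million ballots implied above.
print(round(expected_affected(1_200_000, 0.01)))  # 12000
```

The gap between those two assumed rates is four orders of magnitude, which is why the actual loss rate matters so much here.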

If you have the money, you should try an experiment: mail a large number of ballot-like pieces from different mailboxes all over the county (say, 10,000 letters) and see how many reach the destination. Sure, it'll cost $5K, but you may have a better answer.
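If you ran that experiment, the count of missing letters would give you an estimate with some uncertainty around it. One common way to express that is a Wilson score interval for a proportion; the sketch below uses it as my own choice of method, and the example counts are invented purely for illustration.

```python
import math


def wilson_interval(lost, sent, z=1.96):
    """Wilson score interval for the loss rate; z=1.96 is roughly 95%.

    lost: letters that never arrived, sent: letters mailed.
    Returns (low, high) as fractions between 0 and 1.
    """
    p = lost / sent
    denom = 1 + z * z / sent
    center = (p + z * z / (2 * sent)) / denom
    half = z * math.sqrt(p * (1 - p) / sent + z * z / (4 * sent * sent)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical outcome: 100 of 10,000 test letters never arrive.
low, high = wilson_interval(100, 10_000)
print(f"loss rate between {low:.2%} and {high:.2%}")
```

With 10,000 letters the interval is tight enough to tell a 1% loss rate apart from a 0.1% one, but it would not resolve something as small as one in a million.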

The Royal Mail in the UK quotes ~99.74% reliability (for delivery, not on-time delivery), FWIW.

I'm curious how much mail also gets lost due to being delivered to the wrong mailbox. I average at least one piece of mail per month in my mailbox that is not addressed to me.