Hacker News
by mananaysiempre 329 days ago
Somewhat incidentally, is there an actual description of how a low-tech QR code reader would work? I looked for this a few years ago and all the solutions I could find came in two flavours: (1) use ZXing (“Zebra Crossing”, a now-unmaintained library[1] for every 1D and 2D barcode under the sun); (2) use OpenCV. Nowhere could I find any discussion of how one would actually deal with the image-processing part by hand. And yet QR codes are 1994 tech, so they should hardly require fancy computer-vision stuff to process.

[1] https://github.com/zxing/zxing

You can roughly divide barcode reading into a "frontend" and a "backend". The backend is the best understood (but not necessarily trivial) part: you take a binary image, with each pixel corresponding to one little square in the QR code, and decode its payload. It doesn't need computer vision. The frontend is the part that takes the raw image containing the barcode, tries to find the barcode, and converts what it finds into a nice, clean binary image for the backend. This is a computer vision problem and you can get arbitrarily fancy, up to and including the latest trends in ML vision models. However, this isn't needed in most cases; after all, barcodes are designed to be easy for machines to read. With a large, sufficiently well focused and well exposed image of a barcode you can get away with simple classical computer vision algorithms like histogram-based binarization and some heuristics to identify the spatial extent of the barcode (for example, most barcode symbologies mandate a "quiet zone" (blank space) around the barcode, and have start and stop markers; QR codes have those prominent concentric squares in the corners).
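
For what it's worth, the histogram-based binarization mentioned above fits in a few lines. Below is a minimal pure-Python sketch of Otsu's method, which is one common histogram-based choice and not necessarily what any particular library does. The names `otsu_threshold` and `binarize`, the 8-bit grayscale input, and the 1 = black convention are my own assumptions for illustration.

```python
def otsu_threshold(pixels):
    """Pick the grey level (0-255) that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        # Pixels with value <= level form the "dark" class.
        weight_dark += hist[level]
        sum_dark += level * hist[level]
        if weight_dark == 0:
            continue
        weight_light = total - weight_dark
        if weight_light == 0:
            break
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        between = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if between > best_variance:
            best_variance, best_threshold = between, level
    return best_threshold


def binarize(gray_rows):
    """Turn rows of 0-255 grey values into rows of 1 (black) / 0 (white)."""
    flat = [p for row in gray_rows for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in gray_rows]
```

A single global threshold like this only works when the lighting is fairly even; with shadows or glare you would probably want to compute the threshold per tile instead.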

As for implementation, zxing-cpp [1] is still maintained, and pretty good as far as open source options go. At this point I'm not sure how closely related it is to the original zxing, as it has undergone substantial development. It has Python bindings, which may be easier to use.

On mobile, Google ML Kit and Apple Vision also have barcode reading APIs; they are not open source but are otherwise "free" as in beer.

[1] https://github.com/zxing-cpp/zxing-cpp

I think many places roll their own. QR codes have a finder pattern of alternating black and white with relative spacing of 1-1-3-1-1 along any line that goes through its center. This has a low false positive rate, so this step is the most important one for performance. Since orientation doesn’t change the pattern, horizontal scans are sufficient. Once the center, scale, and rotation have been found, the rest is fairly straightforward. There isn’t really a good reason why this setup couldn’t be rotation invariant; the processing power required is pretty low.
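
A rough sketch of that horizontal scan, assuming the row has already been thresholded to 1 = black, 0 = white. The names `run_lengths` and `finder_candidates` and the 50% tolerance are made up for illustration and are not taken from any library.

```python
def run_lengths(row):
    """Collapse a row of 0/1 values into [value, start, length] runs."""
    runs = []
    for x, value in enumerate(row):
        if runs and runs[-1][0] == value:
            runs[-1][2] += 1
        else:
            runs.append([value, x, 1])
    return runs


def finder_candidates(row, tolerance=0.5):
    """Return (center_x, module_size) for each 1:1:3:1:1 window in the row.

    A window is five consecutive runs (black, white, black, white, black)
    spanning 7 modules in total.
    """
    expected = (1, 1, 3, 1, 1)
    runs = run_lengths(row)
    hits = []
    for i in range(len(runs) - 4):
        window = runs[i:i + 5]
        if window[0][0] != 1:  # the pattern must start on black
            continue
        lengths = [length for _, _, length in window]
        module = sum(lengths) / 7.0
        if all(abs(n - e * module) <= tolerance * module
               for n, e in zip(lengths, expected)):
            start = window[0][1]
            hits.append((start + sum(lengths) / 2.0, module))
    return hits
```

A real reader would typically run this on every row (or every few rows) and then confirm each hit with a vertical scan through the candidate center before trusting it.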

You could have a look at the ISO spec on QR codes (ISO/IEC 18004) to get the authoritative source on what kind of processing is required.

Alternatively (and this is what I would recommend), grab a library that is dedicated to QR code reading (I’ve used Quirc, for example) and just read the code.

Typically you threshold an image to get a binary representation (1: black, 0: white). Then you detect the finder patterns by looking for 1-1-3-1-1 runs of black/white. Once you have a bunch of finder patterns localized, you form triplets and decode the binary matrix that they span.
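
The "form triplets" step might look something like this. It assumes the three centers form a roughly right-angled isosceles triangle, with the point opposite the longest side being the top-left finder pattern, and it uses image coordinates (y grows downward). `is_plausible_triplet` and `order_triplet` are hypothetical names, and the 25% tolerance is a guess.

```python
def dist2(a, b):
    """Squared Euclidean distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def is_plausible_triplet(p, q, r, tolerance=0.25):
    """True if the points look like two equal legs meeting at a right angle."""
    short_a, short_b, long_side = sorted([dist2(p, q), dist2(p, r), dist2(q, r)])
    if short_a == 0:
        return False
    legs_match = abs(short_b - short_a) <= tolerance * short_b
    # Pythagoras on squared lengths: hypotenuse^2 == leg^2 + leg^2.
    right_angle = abs(long_side - (short_a + short_b)) <= tolerance * long_side
    return legs_match and right_angle


def order_triplet(p, q, r):
    """Return (top_left, top_right, bottom_left) for three finder centers."""
    pts = [p, q, r]
    # Each entry pairs a side's squared length with the index of the
    # point opposite that side; the corner is opposite the longest side.
    sides = [(dist2(pts[1], pts[2]), 0),
             (dist2(pts[0], pts[2]), 1),
             (dist2(pts[0], pts[1]), 2)]
    _, corner_index = max(sides)
    corner = pts[corner_index]
    a, b = [pt for i, pt in enumerate(pts) if i != corner_index]
    # With y pointing down, corner->top_right crossed with
    # corner->bottom_left is positive; swap if we got them reversed.
    cross = ((a[0] - corner[0]) * (b[1] - corner[1])
             - (a[1] - corner[1]) * (b[0] - corner[0]))
    if cross < 0:
        a, b = b, a
    return corner, a, b
```

With the three corners labelled, the rotation and scale of the code follow directly from the vectors between them.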

Using https://duckduckgo.com/?q=qrcode%20decode%20by%20hand gives me a number of results like

http://blog.qartis.com/decoding-small-qr-codes-by-hand/

https://beck-thompson.github.io/QRcodewebsite/

Reading QR codes without a computer! https://qr.blinry.org/

And then there are a number of OSS implementations for Android (check fdroid.org), Debian/Ubuntu, etc. Maybe study the code?

That’s not what I’m asking about: once you’ve found the QR code in the bag of pixels you got from your camera and converted it to a boolean array of module colours, then yes, all you have left is a bit of error-correction math and some amusingly archaic Japanese character encoding schemes. That is definitely some work, but ultimately just some work. (For that matter, the Wikipedia article on QR codes contains enough detail to do this.)

What has thus far remained a mystery to me is going from a bag of noisy pixels with a blurry photo of a tattoo on a hairy arm surrounded by random desk clutter to an array of booleans. I meant “by hand” as in “without libraries”, not “using a human”, as in the latter case the human’s visual cortex does the interesting part! And the open-source Android apps that I’ve looked at just wrap ZXing, which is huge (so a sibling commenter’s suggestion of looking at a different, QR-code-specific library is helpful).

You can examine the code of zxing-cpp (which is fairly nice IMO) for a simple, "classical computer vision" approach to this. It's not the most robust implementation but it is pretty functional.

But in general, you can divide the problem more or less like this (not necessarily in this order):

1. Find the rough spatial region of the barcode. Crop that out and only focus on this.
2. Correct ("rectify") for any rotation or perspective skew of the barcode, turning it into a frontoparallel version of the barcode.
3. Binarize the image from RGB or grayscale into pure black and white.
4. Normalize the size so that each pixel is the smallest spatial unit of the barcode.
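
Steps 2 and 4 can be sketched together under a simplifying assumption: if the code is rotated or scaled but not perspective-distorted, the three finder centers define an affine map from module coordinates to pixels. Finder patterns are 7 modules wide, so their centers sit 3.5 modules in from the edges. The name `sample_modules` is hypothetical, and `size` (modules per side, 21 for a version 1 code) is assumed to be known already; a real reader would have to estimate it and would typically need a full perspective transform for skewed photos.

```python
def sample_modules(image, top_left, top_right, bottom_left, size):
    """Sample one pixel per module from a binarized image.

    image: rows of 0/1 pixels; the three points are (x, y) finder centers.
    Returns a size x size grid of 0/1 module values.
    """
    span = size - 7  # modules between two finder centers along one edge
    # Pixel displacement for moving one module right / one module down.
    step_col = ((top_right[0] - top_left[0]) / span,
                (top_right[1] - top_left[1]) / span)
    step_row = ((bottom_left[0] - top_left[0]) / span,
                (bottom_left[1] - top_left[1]) / span)
    height, width = len(image), len(image[0])

    grid = []
    for row in range(size):
        line = []
        for col in range(size):
            # Offset of this module's center from the top-left finder
            # center, in modules: (col + 0.5) - 3.5 and (row + 0.5) - 3.5.
            du, dv = col - 3.0, row - 3.0
            x = top_left[0] + du * step_col[0] + dv * step_row[0]
            y = top_left[1] + du * step_col[1] + dv * step_row[1]
            # Clamp so slightly-off estimates cannot index outside the image.
            px = min(max(int(x), 0), width - 1)
            py = min(max(int(y), 0), height - 1)
            line.append(image[py][px])
        grid.append(line)
    return grid
```

Sampling a single pixel per module is the crudest option; averaging a small neighbourhood around each sample point would likely be more tolerant of noise.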

> What has thus far remained a mystery to me is going from a bag of noisy pixels [...] to array of booleans.

Ah, OK. You can use software like ImageMagick to partition images into levels of gray or just black and white. I have some examples somewhere that I played with some time ago, but they're not accessible online right now, sorry. If the contrast of the original image is high enough, only the QR code would remain to be parsed.

Here's an example to start with: ImageMagick's -threshold or -adaptive-threshold (depending on the image's lighting) are what you want to look at, e.g. try something like

    magick input.jpg \
      -colorspace Gray \
      -adaptive-threshold 8x8+10% \
      candidate.png

And if you wanna go even lower, you can read the raw image pixel by pixel, normalise each colour to black or white, and then scan the resulting matrix for the QR pattern.

And then, because some QR codes use inverted colours, invert the image and scan again.
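
Both of those steps are small enough to sketch without any library. This assumes the image is already in memory as rows of (r, g, b) tuples with 0-255 channels; the fixed threshold of 128, the integer luma weights, and the names `to_binary`, `invert`, and `scan_both_polarities` are my own choices for illustration, and `find_code` stands in for whatever detector you end up writing.

```python
def to_binary(rgb_rows, threshold=128):
    """Map rows of (r, g, b) pixels to rows of 1 (black) / 0 (white)."""
    result = []
    for row in rgb_rows:
        line = []
        for r, g, b in row:
            # Integer approximation of perceived brightness (0-255).
            luma = (299 * r + 587 * g + 114 * b) // 1000
            line.append(1 if luma < threshold else 0)
        result.append(line)
    return result


def invert(matrix):
    """Swap black and white in a 0/1 matrix."""
    return [[1 - value for value in row] for row in matrix]


def scan_both_polarities(matrix, find_code):
    """Try the detector on the matrix, then on its inverse.

    find_code is a hypothetical callable that returns a result,
    or None when it finds nothing.
    """
    result = find_code(matrix)
    if result is None:
        result = find_code(invert(matrix))
    return result
```

A fixed threshold is the weakest link here; it is fine for high-contrast scans but will struggle with uneven lighting.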

There's also zxing-cpp, which is a fork of zxing and is actively maintained: https://github.com/zxing-cpp/zxing-cpp