by PabloSichert 2304 days ago
Hm, that's an interesting choice. I also wonder why they didn't use an encoding whose alphabet has 2^n characters (that way you can map each character directly to n bits when decoding).

Coincidentally, I've just finished some work on a project[1] that is in the same space (identifiers for images).

For the reasons you pointed out, I found Douglas Crockford's base32[2] encoding to be a good fit.

[1] https://github.com/pablosichert/ciid

[2] https://www.crockford.com/base32.html
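
A minimal sketch of that one-character-to-n-bits mapping, using Crockford's 32-symbol alphabet (so n = 5). The name `decode_bits` is only illustrative, and the spec's extras (reading I/L as 1 and O as 0, ignoring hyphens, check symbols) are left out:

```python
# Crockford's base32 alphabet: 32 = 2^5 symbols (no I, L, O, U),
# so every character stands for exactly 5 bits.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def decode_bits(text):
    """Return the bit string for `text`, 5 bits per character."""
    return "".join(format(ALPHABET.index(ch), "05b") for ch in text.upper())
```

For example, `decode_bits("10")` gives `"0000100000"`: no division or carrying between characters is needed, which is the convenience a power-of-two alphabet buys.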


That choice is explained in the README: https://github.com/woltapp/blurhash

(83 is about as many safe characters as you can reasonably find, and it allows some nice ways of packing values together.)
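
To make the "packing" point concrete, here is a rough sketch. It assumes, as I read the algorithm description, that each colour channel of an AC component is quantised to 19 levels; 19^3 = 6859 then fits into two base-83 digits because 83^2 = 6889. The function names are made up for illustration and digits are kept as plain integers rather than mapped to the actual character set:

```python
BASE = 83
LEVELS = 19  # assumption: quantisation levels per colour channel


def pack(r, g, b):
    """Pack three values in range(LEVELS) into two base-83 digits (hi, lo)."""
    value = (r * LEVELS + g) * LEVELS + b
    assert 0 <= value < BASE ** 2  # 19^3 = 6859 <= 83^2 = 6889
    return divmod(value, BASE)


def unpack(hi, lo):
    """Inverse of pack: recover (r, g, b) from two base-83 digits."""
    value = hi * BASE + lo
    rg, b = divmod(value, LEVELS)
    r, g = divmod(rg, LEVELS)
    return r, g, b
```

With these definitions `unpack(*pack(r, g, b))` returns the original triple for every valid input, and only 30 of the 6889 two-digit combinations go unused.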

Could you explain what "AC components" refers to? I couldn't figure that out just by reading your README.

I wonder how the efficiency compares to just encoding on the bit level.
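
One way to put numbers on that is to count how many characters each alphabet needs to distinguish a given number of values. A small sketch; `chars_needed` is a hypothetical helper, and the 19^3 figure assumes three channels quantised to 19 levels each:

```python
def chars_needed(n_values, base):
    """Smallest number of base-`base` digits that can distinguish n_values values."""
    count, capacity = 0, 1
    while capacity < n_values:
        capacity *= base
        count += 1
    return count


N_VALUES = 19 ** 3  # assumption: three channels, 19 levels each

base83_chars = chars_needed(N_VALUES, 83)
base64_chars = chars_needed(N_VALUES, 64)
base32_chars = chars_needed(N_VALUES, 32)
```

Under that assumption base 83 needs two characters per value, while 64- and 32-character alphabets each need three, since 6859 values take 13 bits and two characters carry only 12 or 10 bits respectively.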

It is a term often used for DCT-transformed data. DCT, in this case, breaks the image down into basically an average colour of the whole image, referred to as the DC component, and a bunch of waves that make up the detail of the image, referred to as AC components.

https://github.com/woltapp/blurhash/blob/master/Algorithm.md
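
A tiny one-dimensional illustration of the DC/AC split. This is a plain DCT-II scaled by 1/N so that coefficient 0 comes out as the mean; it is a sketch of the idea, not BlurHash's actual code:

```python
import math


def dct(signal):
    """DCT-II of `signal`, scaled by 1/N so coefficient 0 equals the mean."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * k * (i + 0.5) / n) for i, x in enumerate(signal)) / n
        for k in range(n)
    ]


coeffs = dct([2.0, 4.0, 6.0, 8.0])
dc = coeffs[0]   # the average of the samples
ac = coeffs[1:]  # the "waves" that describe how the samples deviate from it
```

Here `dc` is 5.0, the mean of the four samples. For a constant signal all the AC coefficients come out as (numerically) zero, because there is no detail left to describe.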

> Could you explain what "AC components" refers to?

All but the first component of the Fourier transform. (The first component is the average of the data.) The term comes from electrical engineering, but the Fourier transform has plenty of applications outside that field too.