| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by derf_ 1269 days ago

> Is there a smart way to store the basis vectors?

Consider, instead, compressing small blocks of the image instead of a whole image plane at once. Let's say 8x8.

Over such a small region, pixels are usually very highly correlated. You could model them statistically as an order-1 autoregressive process with a high correlation coefficient (e.g., x[i + 1] = x[i]*rho + noise, with rho >= 0.95 usually, separately in both the horizontal and vertical directions).

Then, computing the eigendecomposition of the 8x8 cross-correlation matrix C_{ij} = rho^|i - j| produces a set of common eigenvectors that you can use for all blocks (recall that the left and right singular vectors U and V of a matrix M are just the eigenvectors of M.M^T and M^T.M, respectively). So now instead of several very large vectors, you only need a single 8x8 matrix of basis vectors (because it's square and orthonormal, the inverse is just the transpose).

But wait. Let's take the limit as rho -> 1. It just so happens that the resulting eigenvectors converge to the Type 2 Discrete Cosine Transform [0]. So, in fact, you don't need to transmit the basis at all.

You can store the output of the transform as fixed point numbers with some arbitrary base (the choice controls the amount of compression: anything too small just gets rounded to zero). Maybe use run-length encoding as your generic compression, because you expect to have a lot of zeroes (visit them in a zig-zag order, to get the big values first and maximize the length of the runs of zeroes), with some Huffman coding of the actual symbols. Add a simple predictor for the (0,0) coefficient of each block (it winds up being the average of the pixel values, so it should be pretty similar from block to block).

Congratulations, you just invented JPEG.

[0] http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1672...