|
|
|
|
|
by ant6n
1223 days ago
|
|
Mmh, compression ratio doesnt seem super high. I wonder how the values are stored, perhaps floats? Some thoughts: Maybe using yuv would be better than rgb? Maybe values could be represented as 0..1, and the values itself could be stored as 0.16 fixed point numbers. Use some generic compression on top. Is there a smart way to store the basis vectors? Something about angles somehow (analogue to quaternions)? Also, once U is known, isnt the inverse implied? Also, if U is approximated and fixed, perhaps the singular values can be adjusted again to minimize errors. |
|
Consider, instead, compressing small blocks of the image instead of a whole image plane at once. Let's say 8x8.
Over such a small region, pixels are usually very highly correlated. You could model them statistically as an order-1 autoregressive process with a high correlation coefficient (e.g., x[i + 1] = x[i]*rho + noise, with rho >= 0.95 usually, separately in both the horizontal and vertical directions).
Then, computing the eigendecomposition of the 8x8 cross-correlation matrix C_{ij} = rho^|i - j| produces a set of common eigenvectors that you can use for all blocks (recall that the left and right singular vectors U and V of a matrix M are just the eigenvectors of M.M^T and M^T.M, respectively). So now instead of several very large vectors, you only need a single 8x8 matrix of basis vectors (because it's square and orthonormal, the inverse is just the transpose).
But wait. Let's take the limit as rho -> 1. It just so happens that the resulting eigenvectors converge to the Type 2 Discrete Cosine Transform [0]. So, in fact, you don't need to transmit the basis at all.
You can store the output of the transform as fixed point numbers with some arbitrary base (the choice controls the amount of compression: anything too small just gets rounded to zero). Maybe use run-length encoding as your generic compression, because you expect to have a lot of zeroes (visit them in a zig-zag order, to get the big values first and maximize the length of the runs of zeroes), with some Huffman coding of the actual symbols. Add a simple predictor for the (0,0) coefficient of each block (it winds up being the average of the pixel values, so it should be pretty similar from block to block).
Congratulations, you just invented JPEG.
[0] http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1672...