by londons_explore 3388 days ago

This seems to be optimizing for a "perceptual loss function" defined in https://github.com/google/butteraugli/blob/master/butteraugl...

Looking at the code to that, it looks like 1500 lines of this:

    double MaskDcB(double delta) {
      PROFILER_FUNC;
      static const double extmul = 0.349376011816;
      static const double extoff = -0.894711072781;
      static const double offset = 0.901647926679;
      static const double scaler = 0.380086095024;
      static const double mul = 18.0373825149;
      static const std::array<double, 512> lut =
          MakeMask(extmul, extoff, mul, offset, scaler);
      return InterpolateClampNegative(lut.data(), lut.size(), delta);
    }

The code has hundreds of high-precision constants. Some even seem to be set to nonsensical values (like kGamma to 0.38). Where did all of them come from? The real science here seems to be the method by which those constants were chosen, and I see no details on how it was done.

4 comments

londons_explore 3388 days ago

Upon more investigation, these numbers are certainly machine generated. Here is an example:

A constant lookup table is used for determining the importance of a change vs distance. Separate tables are used for vertical and horizontal distances (I guess eyes might be slightly more sensitive to vertical edges than horizontal ones?).

Those tables are wildly different in magnitude:

    static const double off = 1.4103373714040413;  // First value of Y lookup table
    static const double off = 11.38708334481672;   // First value of X lookup table

However, later on, when those tables are used, another scale factor is used (simplified code):

    static const double xmul = 0.758304045695;
    static const double ymul = 2.28148649801;

The two constant scale factors directly multiply together, so there is no need for both. No human would manually calculate to 10 decimal places a number which had no effect. Hence, my theory is these numbers have been auto-generated by some kind of hill climbing type algorithm.
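
The redundancy argument can be made concrete with a minimal sketch. The names `Weighted` and `SameOutput` are hypothetical and this is not butteraugli's code; it only assumes, as the comment above does, that a table value scaled by one constant is later multiplied by a second constant.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model, not butteraugli's code: a table entry scaled by `off`
// is later multiplied by a second constant `mul`.
double Weighted(double off, double mul, double shape) {
  return (off * shape) * mul;
}

// Rescaling `off` by k and `mul` by 1/k leaves the output unchanged (up to
// rounding), so a fit can only ever determine the product off * mul.
bool SameOutput(double off, double mul, double k, double shape) {
  assert(k != 0.0);
  const double a = Weighted(off, mul, shape);
  const double b = Weighted(off * k, mul / k, shape);
  return std::fabs(a - b) <= 1e-12 * std::fabs(a);
}
```

Under that assumption, an optimizer is free to drift along the direction that keeps the product fixed, which would explain two long, individually meaningless constants.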

cjhanks 3388 days ago

Yeah, this looks like an optimizer wrote the program. I presume the code was tested against natural images... so it might not be appropriate for all image types.

divbit 3388 days ago

See fig. 2.1 here:

http://disp.ee.ntu.edu.tw/meeting/%E7%B6%AD%E6%AF%85/An%20In...

and also read here:

https://en.wikipedia.org/wiki/YUV

That is my quick guess on how to roughly derive the constants (because it is new, probably there are some fancy modifications tho :) )

onurcel 3388 days ago

This kind of constant appears naturally when you approximate some computation, a famous example being Gaussian quadrature (look at the x_i values depending on the "precision" you want: https://en.m.wikipedia.org/wiki/Gaussian_quadrature )

I don't know if this code is related to that, but just pointing out that seemingly nonsensical constants may appear more often than one would think.
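
As an illustration of the Gaussian quadrature point, here is a small sketch of the three-point Gauss-Legendre rule. The function name `GaussLegendre3` is made up for this example, and nothing here is taken from butteraugli. The node and weight constants look arbitrary, but they are just sqrt(3/5), 5/9 and 8/9 written out in decimal.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Three-point Gauss-Legendre rule on [a, b].
// Nodes on [-1, 1] are 0 and +/-sqrt(3/5); weights are 8/9 and 5/9.
double GaussLegendre3(const std::function<double(double)>& f,
                      double a, double b) {
  assert(a <= b);
  static const double kNode = 0.7745966692414834;         // sqrt(3/5)
  static const double kWeightOuter = 0.5555555555555556;  // 5/9
  static const double kWeightInner = 0.8888888888888888;  // 8/9
  const double mid = 0.5 * (a + b);    // centre of the interval
  const double half = 0.5 * (b - a);   // half-width, rescales the nodes
  return half * (kWeightOuter * f(mid - half * kNode) +
                 kWeightInner * f(mid) +
                 kWeightOuter * f(mid + half * kNode));
}
```

This rule integrates polynomials up to degree 5 exactly (up to rounding), so calling it with `x*x*x*x` on [-1, 1] reproduces the analytic value 2/5.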

nothis 3388 days ago

So... machine learning? (Sorry for buzz-wording)

JyrkiAlakuijala 3388 days ago

It is old school: 100000+ cpu hours of Nelder-Mead method (+common tricks) to match butteraugli to a set of 4000 human rated image pairs created with an earlier version of Guetzli and specially-built image distortion algorithms.
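
For readers unfamiliar with the Nelder-Mead method, below is a minimal textbook-style sketch. It is not the authors' code: the "common tricks" mentioned above are not described in this thread, the names `NelderMead`, `Lerp`, `Point` and `Objective` are invented for the example, and this simplified variant uses only an inside contraction. In a setting like the one described, the objective would presumably measure disagreement between the metric and the human ratings; that part is an assumption and is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Point = std::vector<double>;
using Objective = std::function<double(const Point&)>;

// Returns a + t * (b - a), computed per coordinate.
Point Lerp(const Point& a, const Point& b, double t) {
  Point r(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) r[i] = a[i] + t * (b[i] - a[i]);
  return r;
}

// Simplified Nelder-Mead minimizer (reflection 1, expansion 2,
// inside contraction 0.5, shrink 0.5). Runs a fixed number of iterations.
Point NelderMead(const Objective& f, const Point& start, double step,
                 int iterations) {
  assert(!start.empty());
  const std::size_t n = start.size();
  // Initial simplex: the start point plus one offset point per axis.
  std::vector<Point> simplex(n + 1, start);
  for (std::size_t i = 0; i < n; ++i) simplex[i + 1][i] += step;
  auto better = [&f](const Point& a, const Point& b) { return f(a) < f(b); };

  for (int it = 0; it < iterations; ++it) {
    std::sort(simplex.begin(), simplex.end(), better);  // best vertex first
    // Centroid of every vertex except the worst.
    Point centroid(n, 0.0);
    for (std::size_t v = 0; v < n; ++v)
      for (std::size_t i = 0; i < n; ++i) centroid[i] += simplex[v][i] / n;

    Point& worst = simplex[n];
    const Point reflected = Lerp(centroid, worst, -1.0);
    const double fr = f(reflected);
    if (fr < f(simplex[0])) {
      // New best: try going twice as far in the same direction.
      const Point expanded = Lerp(centroid, worst, -2.0);
      worst = f(expanded) < fr ? expanded : reflected;
    } else if (fr < f(simplex[n - 1])) {
      worst = reflected;
    } else {
      // Contract toward the centroid; shrink the simplex if that fails.
      const Point contracted = Lerp(centroid, worst, 0.5);
      if (f(contracted) < f(worst)) {
        worst = contracted;
      } else {
        for (std::size_t v = 1; v <= n; ++v)
          simplex[v] = Lerp(simplex[0], simplex[v], 0.5);
      }
    }
  }
  std::sort(simplex.begin(), simplex.end(), better);
  return simplex[0];
}
```

The method needs only function values, no gradients, which makes it a natural fit for tuning constants inside an opaque piece of code. Its cost grows quickly with the number of parameters, which is consistent with the large CPU budget quoted above.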

londons_explore 3388 days ago

How did you protect against overfitting? How about local maxima? Some of your constants look surprising to say the least.

Most notably:

* The gamma value of 0.38 (when most studies suggest 1.5 - 2.5 for human eye gamma)

* The significant difference in the vertical and horizontal constants (when as far as I know human eyes are equally sensitive to most distortions independent of angle).

JyrkiAlakuijala 3388 days ago

There was a large variety of regularization and optimization techniques used. An embarrassingly large amount of manual work went into both.

In this use the gamma is the inverse of something close to 2.6. Butteraugli needs both gamma correction and inverse gamma correction.

The FFT co-efficients only look weird, but they should actually lead to a symmetric result if our math is correct. In a future version we move away from the FFT, so I don't encourage anyone to actually debug that too much.
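
The gamma remark above is easy to check numerically: 1 / 2.6 is about 0.3846, close to the 0.38 that looked surprising. The sketch below assumes a plain power-law gamma of exactly 2.6 and uses made-up names (`kAssumedGamma`, `ApplyGamma`, `ApplyInverseGamma`); the exact value and form used in butteraugli are not given in this thread, and which direction counts as "forward" is a choice made for this example.

```cpp
#include <cassert>
#include <cmath>

// Assumed value, taken from the "something close to 2.6" mentioned above.
static const double kAssumedGamma = 2.6;

// One direction raises to gamma.
double ApplyGamma(double v) {
  assert(v >= 0.0);
  return std::pow(v, kAssumedGamma);
}

// The other direction raises to 1 / gamma, an exponent of about 0.3846.
double ApplyInverseGamma(double v) {
  assert(v >= 0.0);
  return std::pow(v, 1.0 / kAssumedGamma);
}
```

Applying one function after the other returns the original value up to rounding, which is why a codebase that needs both directions ends up with both a large and a small exponent.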

i336_ 3387 days ago

> In a future version we move away from the FFT

Do you have any idea when that will land in the open? Say, before 2018? Or maybe a little sooner?

i336_ 3386 days ago

Very confused as to why this was downvoted?

sqeaky 3387 days ago

Nothing to be embarrassed about, you got results. Results are what matters in the end.

This is even more the case with software. Whether this cost 20 or 20 million man-hours is irrelevant in the long run, because eventually it will be used enough to offset the costs if it is as good as everyone says. Of course short-term costs hurt now, but it sounds like you still had computers do much of the heavy lifting.

myle 3388 days ago

To those who didn't notice, that's one of the authors, who also played a large part in the previous related work.

i336_ 3387 days ago

Very nice.

One question: as the top-level comment in this thread noted, this algorithm may be specialized to certain kinds of images.

Can you release info about the kind of images/datasets this approach will pathologically fail with? That would be really really awesome.

rdtsc 3388 days ago

I've seen this done in the signal processing domain: someone goes to Matlab, creates a filter or other transformation there, then presses a button and it spits out a bunch of C code with constants looking like that. So they probably did the same thing.
