Hacker News new | ask | show | jobs
by jessetemp 17 days ago
The author is confusing bins with bin edges. In their first plot, the standard approach looks strange because 0-7 should be the bin edges, not the center points as shown in the plot.

You can see this confusion again in the histogram example. There are only 255 bins, not 256. If you fix that mistake and remove the 0.5 offset, then the histogram is distributed correctly at both ends.

4 comments

2*8 = 256. You can represent 256 distinct values, bins, with an 8 bit number. If you stick a 0 in that first one, it takes a bin. If you fill the rest with by-one increasing integers, then the max value will be 255, thus the 2*bits - 1, which is the max value you can store.
No, the author understands the problem way deeper than you do.

You haven't grasped the fact that the choice isn't obvious, and has subtle trade-offs.

If you don't believe the author, check the other posts he references.

Judging by your other comment in this thread, you might agree with my rational [1] more than you realize

[1] https://news.ycombinator.com/item?id=48365800

How do you fit 256 distinct values into 255 bins?
By counting the edges
I see what you’re saying - index 0 holds values from 0-1, index 2 from 1-2 etc, but then you have index 255 holding values between 255 and 256. So you’re sort of arguing that the 0-255 8-bit quantization is actually representing ‘real’ values of 0-256?…

Edit: somehow missed alterom’s reply - they explain it much better than my question above does.

Not quite. I'm saying there are 256 discrete numbers (0-255) and 255 intervals between those numbers. Most of the real values will fall into the intervals and get mapped to 0-255 somehow, maybe by nearest neighbor, but I'm not trying to define how they get mapped. The point is that 255 is the largest number that can be represented with 8 bits, so you should normalize by 255.

I wrote a longer replay to alterom but it looks buried for some reason.

https://news.ycombinator.com/item?id=48365800

> index 0 holds values from 0-1, index 2 from 1-2 etc,

Well, now you are double counting the end values of the ranges. In your example 1 is included in both 0-1 and 1-2.

Sorry, you seem to be confused.

>There are only 255 bins, not 256

There are 256 bins because there are 256 values.

The questions are:

1. What are the boundaries of these bins?

2. Which sample represents a particular bin?

With 1-bit color, we have sample values {0, 1}. What bins do they represent?

Here's one choice:

     [0, 1), [1, 2)
Two equally sized bins, spanning the interval [0, 2] of length 2, each defined by its sample at lower bound.

Alternatively, we could consider these bins:

    [-0.5, 0.5), [0.5, 1.5)
These are also equally sized bins, spanning the interval [-0.5, 1.5] of length 2, defined by samples at the center.

We could also define bins like this:

    [0, 0.5), [0.5, 1]
Two equally sized bins spanning the interval [0, 1] of length 1, where we sample the first bin at the lower bound, and the last bin at the upper bound.

This, in a nutshell is what the author is trying to explain.

Let's look at this again, with 2 bits.

With 2-bit color, we have sample values {0, 1, 2, 3}.

Which bins do they come from?

The three options above yield:

    [0, 1), [1, 2), [2, 3), [3, 4)

    [-.5, 0.5), [0.5, 1.5), [1.5, 2.5), [2.5, 3.5)

    [0, 0.5), [0.5, 1.5), [1.5, 2.5), [2.5, 3]
The first two span an interval of length 4, the third spans an interval of length 3.

In the third case, the tail bins are short (have size ½), and the rest have size 1.

The last bin must be a closed interval in the third case, so that it includes the value we picked to represent it.

None of these choices is inherently invalid or better than the others; and none stems from "confusing bins with edges".

The third option does have the distinction that the first and last bins are smaller than the rest. But it's not necessarily a drawback. Especially when we're talking about color, hardware interpretation, and human perception.

When you remap these bins into the [0, 1] interval, you're "dividing by 4" in the first two cases, and by 3 in the third case.

The maps are:

     x → x/4
     x → (x + ½)/4
     x → x/3
The inverse maps (that yield a sample in {0, 1, 2, 3} given a floating point value in interval [0, 1]) are:

    x → trunc(4x)
    x → round(4x - ½) = trunc(4x)
    x → trunc(3x + ½)
In the first two options, the domain is [0, 1). It might be necessary to apply clipping because the exact value 1.0 falls outside the range of the forward transform.

The 2nd option is the most symmetric, of course, but the 3rd one is the most straightforward (and cheapest) to implement, so that's the default.

The choice amounts to making the highest and lowest bins slightly smaller to make the rest sightly larger.

That's to say, if you generate uniform noise between 0 and 1, you'll get the following samples from your function with equal probability:

    0 or 3
    1
    2
As the author points out, this hardly matters when you are talking about having 256 bins.

That, and with color specifically, the "good" histograms aren't uniform anyway (and any photographer wants to avoid getting much at either extreme).

TL;DR: The author is not confusing anything — but their diagram and explanation are, indeed, a bit confusing.

Thank you for the thoughtful reply. Maybe bins is the wrong word to use, so I'll try with intervals. Starting with 1 bit data, there are two numbers and one interval. I think where bins makes it confusing is that inside the interval there are two big rounding errors mapping everything to either 0 or 1 and many people seem to be considering those the bins.

Taking a step back, remember we're ultimately mapping these discrete numbers to some real world continuous variable like the saturation of red, frequency, mass on a scale, whatever. And our digital device can only represent a finite amount of numbers. For 2 bit data, we can represent 0-3, and for 3 bit data we can represent 0-7.

The important part is that 0 represents the minimum and 1,3, and 7 all represent the same maximum real value, and everything that can be measured by the device will fall within those ranges. So comparing 1, 2 and 3 bit data on a linear number line looks like this:

  0                    1
  0      1      2      3
  0  1  2  3  4  5  6  7
You could assume that everything gets assigned to whatever number is nearest in the number scale or come up with another scheme, but that is ultimately defined by the ADC and likely nonlinear. All we know is that those are the numbers we have available to represent the real values we're measuring.

The question is about how to normalize the data. 1 bit data is already normalized. If you normalize 2 bit data by 3 you get [0, 1/3, 2/3, 1]. LGTM. If you normalize it by 4, you get [0, 1/4, 2/4, 3/4] and you're effectively throwing away some of the range of the ADC. You can try to get it back by offsetting by 0.5 then normalizing but now you get [1/8, 3/8, 5/8, 7/8]. And you could stretch that with some clever formula to fill from 0 to 1, but if you do it right then it's the equivalent to normalizing by 3, so why not normalize by 3?

So the answer is, if you have N bit data, you normalize by 2^N-1.

>Maybe bins is the wrong word to use, so I'll try with intervals

Same thing, in both how you use it and how the author does.

>Taking a step back, remember we're ultimately mapping these discrete numbers to some real world continuous variable

I think this is where you have a misconception.

There are two maps.

The important one goes the other way: FROM a continuous variable TO a finite set.

It's not 1-to-1: it maps entire ranges of numbers (intervals, bins, whatever) to discrete values (samples, integers, whatever).

The bins are preimages of that map.

The discussion in the article comes from two ways of defining that map: FROM continuous signal TO discrete variable.

The map that goes the other way, from the integers into floats, has to be CONSISTENT with it.

The article presents this backwards, putting the cart before the horse. This causes confusion.

>All we know is that those are the numbers we have available to represent the real values we're measuring.

Each of those numbers doesn't represent any one value; it represents a range.

Think about it this way: if we have a continuous signal that we're discretizing into a finite number of bits, we're invariably smashing ranges into single values (what you call "rounding error").

When we're reading this data — say, we read number 5 — we don't know which continuous variable value it came from.

To display it on a screen, we make a choice; we pick some number from the interval it came from, and call it a day.

>The important part is that 0 represents the minimum and 1,3, and 7 all represent the same maximum real value

The important part is that this is a choice you make about what those point samples represent.

It's a convenient choice. Which is why we all use it.

Some people prefer a different choice, that's all.

> If you normalize it by 4, you get [0, 1/4, 2/4, 3/4]

That's one way to do it, and not the way the article uses (re-read my previous comment, it has both).

Still, I'm with you here.

>and you're effectively throwing away some of the range of the ADC.

The map you're describing (FROM discrete INTO continuous) is approximating the DAC.

So, yes, with this scheme you're never getting 0.0 and 1.0.

Think of it this way. Say, you convert an image to a 1-bit representation, and render it on a screen in grayscale.

One choice is to render 0 as 0.0 and 1 as 1.0 (black and white).

Another is to render 0 as 0.25 and 1 as .75 (dark grey and light grey).

That's the "alternative" (divide by 2^n) approach. The formula here is x→ (x + 0.5)/2^n.

Neither is inherently wrong or better than the other; especially when you ask which rendering is closer to the original image.

Plus: one man's "you're not using the entire range of DAC" is another's "you leave a tiny bit of headroom".

In any case, you're not losing data in either [ discrete → continuous → discrete ] chain because you get the discrete values back perfectly.

What you divide by in the first step is dictated by what you do in the second.

>If you normalize 2 bit data by 3 you get [0, 1/3, 2/3, 1].

Let's see what this says about how we should go in the other direction to be consistent with this scheme.

Which continuous values get sent to 0 and 3? Which get sent to 1 and 2?

You wrote : {0, 1, 2, 3} → [0, 1/3, 2/3, 1]

So you can see that going in the other direction (discretizing):

    0 ← [0   ... 1/6) 
    1 ← [1/6 ... 5/6)
    2 ← [3/6 ... 1/6) 
    3 ← [5/6 ... 1  ]
Some people don't like that 0 and 3 get smaller ranges than the rest.

>So the answer is, if you have N bit data, you normalize by 2^N-1.

The answer is: it doesn't matter in practice, so use what's simpler in your context.

That's going to be dividing by 2^N - 1 for pretty much everyone.

I get it now. Had to play around with the code a bit to see it. Very interesting and unintuitive problem. Thanks for the thorough replies
* Correction: fixing typo

    0 ← [0   ... 1/6)     
    1 ← [1/6 ... 3/6)
    2 ← [3/6 ... 5/6) 
    3 ← [5/6 ... 1  ]