| >Maybe bins is the wrong word to use, so I'll try with intervals

Same thing, both in how you use it and in how the author does.

>Taking a step back, remember we're ultimately mapping these discrete numbers to some real world continuous variable

I think this is where you have a misconception. There are two maps. The important one goes the other way: FROM a continuous variable TO a finite set. It's not 1-to-1: it maps entire ranges of numbers (intervals, bins, whatever) to discrete values (samples, integers, whatever). The bins are the preimages of that map.

The discussion in the article comes from two ways of defining that map: FROM continuous signal TO discrete variable. The map that goes the other way, from the integers into floats, has to be CONSISTENT with it. The article presents this backwards, putting the cart before the horse. This causes confusion.

>All we know is that those are the numbers we have available to represent the real values we're measuring.

Each of those numbers doesn't represent any one value; it represents a range. Think about it this way: if we have a continuous signal that we're discretizing into a finite number of bits, we're invariably smashing ranges into single values (what you call "rounding error"). When we're reading this data (say, we read the number 5), we don't know which value of the continuous variable it came from. To display it on a screen, we make a choice: we pick some number from the interval it came from, and call it a day.

>The important part is that 0 represents the minimum and 1,3, and 7 all represent the same maximum real value

The important part is that this is a choice you make about what those point samples represent. It's a convenient choice, which is why we all use it. Some people prefer a different choice, that's all.

> If you normalize it by 4, you get [0, 1/4, 2/4, 3/4]

That's one way to do it, and not the one the article uses (re-read my previous comment, it has both). Still, I'm with you here.
>and you're effectively throwing away some of the range of the ADC.

The map you're describing (FROM discrete INTO continuous) is approximating the DAC. So, yes, with this scheme you're never getting 0.0 and 1.0.

Think of it this way. Say you convert an image to a 1-bit representation and render it on a screen in grayscale. One choice is to render 0 as 0.0 and 1 as 1.0 (black and white). Another is to render 0 as 0.25 and 1 as 0.75 (dark grey and light grey). That's the "alternative" (divide by 2^n) approach. The formula here is x → (x + 0.5)/2^n.

Neither is inherently wrong or better than the other, especially when you ask which rendering is closer to the original image. Plus: one man's "you're not using the entire range of the DAC" is another's "you leave a tiny bit of headroom".

In any case, you're not losing data in either [ discrete → continuous → discrete ] chain, because you get the discrete values back perfectly. What you divide by in the first step is dictated by what you do in the second.

>If you normalize 2 bit data by 3 you get [0, 1/3, 2/3, 1].

Let's see what this says about how we should go in the other direction to be consistent with this scheme. Which continuous values get sent to 0 and 3? Which get sent to 1 and 2? You wrote:

{0, 1, 2, 3} → [0, 1/3, 2/3, 1]

So you can see that going in the other direction (discretizing):

0 ← [0 ... 1/6)
1 ← [1/6 ... 3/6)
2 ← [3/6 ... 5/6)
3 ← [5/6 ... 1 ]
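If you want to check which continuous values land on each code under the divide-by-(2^N - 1) scheme, here's a rough Python sketch. The function name `bin_edges` is mine, and I'm assuming the quantizer is round-to-nearest on x * (2^N - 1) with ties going up, which is one reasonable reading and not necessarily what the article does.

```python
from fractions import Fraction

def bin_edges(n_bits):
    """Preimage of each code k, assuming code = round-to-nearest of x * (2^n - 1)
    for x in [0, 1]. Exact fractions are used so the edges print cleanly."""
    top = 2**n_bits - 1
    edges = []
    for k in range(top + 1):
        # The bin around k/top extends half a step each way, clipped to [0, 1].
        lo = max(Fraction(0), Fraction(2 * k - 1, 2 * top))
        hi = min(Fraction(1), Fraction(2 * k + 1, 2 * top))
        edges.append((k, lo, hi))
    return edges

for k, lo, hi in bin_edges(2):
    print(f"{k} <- [{lo} ... {hi})  width {hi - lo}")
```

For 2 bits, the first and last codes get bins of width 1/6 and the two middle ones get 1/3, so the end bins are half as wide as the rest.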
Some people don't like that 0 and 3 get smaller ranges than the rest.

>So the answer is, if you have N bit data, you normalize by 2^N-1.

The answer is: it doesn't matter in practice, so use what's simpler in your context. That's going to be dividing by 2^N - 1 for pretty much everyone. |
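To make the "no data lost either way" point concrete, here's a small sketch of both schemes as matched quantize/dequantize pairs. All four function names are made up for illustration, and the quantizers are my assumption of what "consistent" means here: round-to-nearest for the 2^N - 1 scheme, floor into equal-width bins for the 2^N scheme.

```python
import math

def quantize_endpoints(x, n_bits):
    """[0, 1] -> code, paired with dividing by 2^n - 1 (round to nearest)."""
    top = 2**n_bits - 1
    return min(top, math.floor(x * top + 0.5))

def dequantize_endpoints(k, n_bits):
    """code -> [0, 1]; code 0 maps to 0.0 and the top code to 1.0."""
    return k / (2**n_bits - 1)

def quantize_centers(x, n_bits):
    """[0, 1] -> code, paired with (k + 0.5) / 2^n (equal-width bins)."""
    levels = 2**n_bits
    return min(levels - 1, math.floor(x * levels))

def dequantize_centers(k, n_bits):
    """code -> (0, 1); each code maps to the midpoint of its bin."""
    return (k + 0.5) / 2**n_bits

n = 2
codes = list(range(2**n))
# discrete -> continuous -> discrete gives the codes back under either scheme
print([quantize_endpoints(dequantize_endpoints(k, n), n) for k in codes] == codes)
print([quantize_centers(dequantize_centers(k, n), n) for k in codes] == codes)
# the 1-bit grayscale example: black/white vs dark grey/light grey
print([dequantize_endpoints(k, 1) for k in range(2)])
print([dequantize_centers(k, 1) for k in range(2)])
```

The only thing that breaks the round trip is mixing the pairs, e.g. dequantizing with one scheme and re-quantizing with the other at a different bit depth.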