Hacker News new | ask | show | jobs
by bogomipz 2242 days ago
>"He doesn't mention why biased notation is used (i.e. why the exponent is stored as 127+E): it's used so that if you sort positive numbers as if they were integers, they'll still end up in the right order."

Could you elaborate on this? Maybe an example? This sounds interesting but I'm failing to grasp it.

5 comments

https://wiki.c2.com/?IeeeSevenFiftyFour:

“IEEE 754 […] Has the interesting and useful property that two's complement comparisons of the underlying bit pattern of any two IEEE 754 numbers will have the same result as comparing the numbers that are represented”

That means that, if you interpret the bits of a float/double as an int32/int64, increase that integer by one, and then interpret the bits of the result as a float/double, you get the smallest float/double that’s larger than what you started with (with exceptions for NaN, infinities, +/- zero, and, possibly, some categories I forget)

That can be useful if you want to iterate over all floats between 0.0 and 1.0, for example, but may not be that efficient on some modern hardware, where moving data from the “float side” of the CPU to the “integer side” is expensive)

This has more to do with the mantissa than the biased exponent.
No, it has to do with bothequally.

Changing a 0 bit to a 1 in an integer increases its value, and by the rule I gave, should also increase the value of the floating point number with the same bit pattern. It doesn’t matter whether that flipped bit is in the mantissa or the exponent.

That requires the use of the biased number in the exponent. In particular, the “all zeroes” bit pattern for the exponent must be the representation for the lowest possible exponent, and the “all ones” one that for the highest. It cannot be the ‘normal’ two’s complement representation of -1.

This is a different claim than the one above, which was about incrementing the least significant bit.
But it trivially follows from it. If operation foo (increasing x by one) makes something larger, repeated operation of foo (increasing x by 2^n) makes it larger, too.
The first thing to notice is that a (not subnormal) floating point number can be compared as such: you look at the exponent bits, and pick the one that has the greater exponent. If they are equal, then look at the mantissa, which is always between 1 and 2, and pick whichever one is larger. Since the exponent comes before the mantissa bits in the bit representation of a floating point number, it has "higher significance" when interpreted in fixed point. If we used two's complement to represent the exponent negative exponents would "look larger" than positive ones, despite them being smaller. But if we used a biased representation, where we shift the range so that there can only be a positive number in that field, and the smaller negative exponents can actually be small and the positive exponents are be larger than those so we can compare them directly.
Thanks these explanations were all really helpful. Cheers.
The exponent is 8 bits. That translates to 0 through 255. This means your "window" begins from [1,2], [2,4], etc.

That sucks. It means I cannot represent any number between 0 and 1.

Ideally, we want the exponent to be somewhat symmetric, which in this case means going from -127 to 127. To translate 0 to 255 to that interval, you subtract 127. You are simply shifting your exponent so that the allowed values for your exponent is symmetric about 0.

Now your window starts with [2^-127, 2^-126] and so on.

(I may have an off by one error here).

Sure, I had to wrestle with this a bit myself: to make it easier, imagine a 16-bit floating point format (P&H call this the "Nvidia format", but I can't find that documented anywhere but there): 1-bit sign, 5-bit (biased) exponent, 10-bit mantissa. One thing that TFA leaves out about floating point mantissas is that there's an implicit leading 1, so a 10-bit mantissa of 1111100000 would be interpreted as (binary) 1.1111100000 or 1 + 2^-1 + 2^-2 + 2^-3 + 2^-4 + 2^-5 = 1.96875. So, take the mantissa, convert it to a fraction, add 1, and then raise it to the power of the (biased) exponent.

So now, take the 16-bit pattern (0000 0011 1110 0000); you get a 0 sign bit, an exponent of 0, and a mantissa of 1.96875. So, with a bias of -15, that's 1.96875 * 2 ^ 0-15 = 0.00006008148193359375.

As you creep up to the "next" exponent, you see that the boundaries are respected. The last 2^-15 number is 0000 0011 1111 1111 (0x3ff) and the first 2^-14 number is 0000 0100 0000 0000 (0x400); now the exponent has changed from 0 to 1, but the floating point converts to 1.999023 * 2 ^ -15 = 0.0000610053539276123046875 and 1.000000 * 2 ^ -14 = 0.00006103515625: a bit-by-bit comparison has 0x3ff < 0x400. (This would be true regardless of bias).

Now imagine that IEEE 754 stored exponents in two's-complement format instead; the exponent 01111 would be interpreted as +31, but the "next" exponent, bit-wise, would be 10000 = -32. This means that you'd end up with 0011111111111111 = 1.999023 * 2 ^ 31 = 4292869204.475904, but the next binary number, 0100000000000000 would be 1.0 * 2 -32 = 0.00000000023283.

> imagine a 16-bit floating point format (P&H call this the "Nvidia format", but I can't find that documented anywhere but there): 1-bit sign, 5-bit (biased) exponent, 10-bit mantissa

According to Wikipedia (https://en.wikipedia.org/wiki/Half-precision_floating-point_...), this is the IEEE 754 standard binary16 format.

Sort of - I skipped over subnormal numbers.
Well, really the reason for the bias is so that you can get a negative exponent. If E = 10, then the exponent will be 2^-117. You might still need a negative exponent with a positive sign. The offset is used rather than two's complement maybe because of that sorting thing but also because of the math itself, when doing operations on two FP numbers.