
> Most aren't in those regions.

If you're only looking at places where posit is worse than float, and ignoring the places where it's better, then you should use a posit with a longer exponent, where none of the values lose more than a single bit of precision. (There will still be low-precision values, but they will be numbers that are completely unrepresentable in the equivalent float.)

> Really? When it shows up in spreadsheets in various forms I expect people will think otherwise. The Intel floating point fiasco and Excel bugs with vastly less error certainly caused major problems.

Surely 2257 vs. 2258 is roughly as bad? But float16 has the exact same problem with those numbers. It's not "crazy bad" that the threshold is in a slightly different spot. 16-bit numbers are not appropriate for spreadsheets, no matter the format.
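To make the float16 side concrete, here is a minimal sketch using Python's `struct` module, whose `'e'` format is IEEE 754 binary16. The helper name `to_float16` is made up for illustration. It shows that above 2048 float16 only holds even integers, so 2257 cannot be stored exactly while 2258 can.

```python
import struct

def to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# binary16 has an 11-bit significand, so every integer is exact only up to 2048.
# Between 2048 and 4096 neighbouring values are 2 apart.
for n in (2047, 2048, 2256, 2257, 2258):
    stored = to_float16(float(n))
    print(n, "->", stored, "(exact)" if stored == n else "(rounded)")
```

Whichever neighbour 2257 lands on, the point is only that the format has no slot for it.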

> Both overflow - float16 to inf (denotes overflow, correct), posit16 to NaR (not a real, incorrect - the result is a real number). float16 also then says 0 < inf, correct. Posit16 says NaR < 0, which is not correct.

posit16 most certainly does not overflow on 400x400. It doesn't overflow on 4000x4000 either, or 4 million x 4 million.

And if you treat "infinity" literally it's not correct at all. If you're going to say "denotes overflow, correct" then it's only fair to say the same thing about NaR. Pretend it's "not a result" maybe? It's just a name.
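On the range point, a rough sketch of the arithmetic, assuming the usual posit definition where the largest finite value is `useed**(n-2)` with `useed = 2**(2**es)`. The function name `posit_maxpos` is mine, and I'm showing both `es=1` and `es=2` because I'm not assuming which posit16 variant is meant.

```python
FLOAT16_MAX = 65504.0  # largest finite IEEE 754 binary16 value

def posit_maxpos(n: int, es: int) -> int:
    """Largest finite n-bit posit, useed**(n-2) (hypothetical helper)."""
    return 2 ** ((n - 2) * 2 ** es)

products = {
    "400 x 400": 400 * 400,
    "4000 x 4000": 4000 * 4000,
    "4 million x 4 million": 4_000_000 * 4_000_000,
}

for label, p in products.items():
    print(
        label,
        "| exceeds float16 max:", p > FLOAT16_MAX,
        "| within posit16 es=1:", p <= posit_maxpos(16, 1),
        "| within posit16 es=2:", p <= posit_maxpos(16, 2),
    )
```

Where a product does exceed maxpos, posit arithmetic as specified rounds to maxpos rather than to NaR, so the result is imprecise but still an ordinary ordered number.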

> All of the IEEE formats - multiplying or dividing by two never loses precision until overflow or underflow since it's merely incrementing/decrementing the exponent.

Let me rephrase. What 8-bit float can represent 1.03125 in the first place?
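A quick check of why that's a stretch, as a sketch. The 8-bit layout here (1 sign bit, then exponent, then fraction) is an assumption for illustration, not any standard format.

```python
from fractions import Fraction

value = Fraction(1.03125)  # exact, since 1.03125 is a binary fraction
assert value == Fraction(33, 32)

# The denominator is 2**5, so the significand 1.00001 (binary)
# needs 5 fraction bits after the implicit leading 1.
fraction_bits = value.denominator.bit_length() - 1

# Assumed layout: 8 bits total, 1 of them for the sign.
total_bits, sign_bits = 8, 1
exponent_bits = total_bits - sign_bits - fraction_bits

print("fraction bits needed:", fraction_bits)
print("exponent bits left over:", exponent_bits)
```

So an 8-bit float that holds 1.03125 has very little room left for exponent, and only gets more by giving up the sign bit.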

> Also why did you ignore the 10*2 = 16 posit case? And these happen for all size posits for reasonable ranges, but none of the IEEE formats. The values I gave were for examples. Run your own checks and you'll find them all over the map for any size posit.

There is no standard 8-bit IEEE format as far as I know.

I didn't ignore that case. I pointed out how the standard posit8 fails it. But if we're using some kind of weird custom float with no sign bit, I think it's valid to use a better-balanced posit. Posit8_1 is the best competitor to a custom float with 5 mantissa bits. If you also remove the sign bit, then it can do 10 * 2 = 20.
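For the standard posit8 failure, here is a small decoder sketch. It assumes "standard posit8" means 8 bits with `es=0` (the original proposal's layout); `decode_posit` is a hypothetical helper written for this comment, not a library function. It enumerates the positive values and finds the neighbours of 20.

```python
def decode_posit(bits: int, n: int = 8, es: int = 0) -> float:
    """Decode an n-bit posit bit pattern to a float (illustrative only)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")  # NaR
    sign = -1.0 if bits >> (n - 1) else 1.0
    if sign < 0:
        bits = (-bits) & mask  # negative posits are stored as two's complement
    body = bits & ((1 << (n - 1)) - 1)  # everything after the sign bit

    # Regime: a run of identical bits, ended by the opposite bit or by running out.
    pos = n - 2
    first = (body >> pos) & 1
    run = 0
    while pos >= 0 and ((body >> pos) & 1) == first:
        run += 1
        pos -= 1
    k = run - 1 if first else -run
    pos -= 1  # skip the terminating bit, if there was one

    # Exponent: up to es bits; bits that fall off the end count as 0.
    exp = 0
    for _ in range(es):
        exp <<= 1
        if pos >= 0:
            exp |= (body >> pos) & 1
            pos -= 1

    # Fraction: whatever bits remain, with an implicit leading 1.
    frac_bits = pos + 1 if pos >= 0 else 0
    frac = body & ((1 << frac_bits) - 1) if frac_bits else 0
    mantissa = 1.0 + (frac / (1 << frac_bits) if frac_bits else 0.0)
    return sign * mantissa * 2.0 ** (k * (1 << es) + exp)

# Positive posit8 (es=0) values are the bit patterns 0x01..0x7F.
values = sorted(decode_posit(b) for b in range(1, 128))
below = max(v for v in values if v <= 20)
above = min(v for v in values if v >= 20)
print("10 representable:", 10.0 in values)
print("20 representable:", 20.0 in values)
print("neighbours of 20:", below, above)
```

Under that assumption 20 sits exactly halfway between its two neighbours, so the product has to round to one of them.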

> It's a fundamental property of IEEE 754 that the difference (or sum) of two representable values is itself representable in the same IEEE 754 format

Except when you hit infinity.

> Examples where the error in addition is not representable are a= Posit16(0x0001)=2^-114, a+a should be 2^-113, not representable as posit16, a+a as posit 16 gives 2^-112, with error from correct of 2^-113, not representable.

1. Such aggressive rounding only happens in the last couple of values next to 0. If you were using float16, you wouldn't be able to represent that value at all; it would just be 0.

2. Does that give you x - y = 0 for different x and y, the thing I asked about?
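A sketch of the float16 side of point 1, again using Python's `struct` `'e'` format as binary16. The exponent -114 is taken as-is from the quoted example, and `to_float16` is a made-up helper name.

```python
import struct

def to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

smallest_subnormal = 2.0 ** -24  # smallest positive binary16 value
a = 2.0 ** -114                  # the value from the quoted example
a_plus_a = 2.0 ** -113

print("2**-24 survives:", to_float16(smallest_subnormal) == smallest_subnormal)
print("2**-114 stored as:", to_float16(a))
print("2**-113 stored as:", to_float16(a_plus_a))
```

Both tiny values flush to 0 in float16, so the two distinct inputs become indistinguishable there.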

> Why do you think that is?

I have no idea how hard it is to implement, to be honest. But adding a few more bits is easy in comparison. And it's not that different from normal floating point, so who wants to bother?

bfloat is just truncating differently; I'm not surprised it was very quickly implemented.