Hacker News
by seg_fault 606 days ago
Actually, you can specify the numeric limits of the mantissa and the exponent as template arguments [0]. So you could do:

      Float<uint16_t, // type of the mantissa (must be wide enough to hold 4095)
            uint8_t,  // type of the exponent
            0,        // lowest possible value of the mantissa
            4095,     // highest possible value of the mantissa
            0,        // lowest possible value of the exponent
            7>        // highest possible value of the exponent
The Float then simulates an unsigned 12-bit mantissa and a 3-bit exponent. Sure, it still takes up the full size of the underlying types. But you could create a union with bitfields to pack it into the 15 bits that are actually needed.
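To make the bitfield idea a bit more concrete, here is a rough sketch of packing a 12-bit mantissa and a 3-bit exponent into one 16-bit unit. PackedSmallFloat and pack are hypothetical names, not part of the library, and I use a plain struct with bitfields rather than a union to stay clear of type punning. Bitfield layout is implementation-defined, so the actual size depends on your compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed layout (not part of the library):
// 12 bits of mantissa and 3 bits of exponent share one 16-bit unit.
struct PackedSmallFloat {
    std::uint16_t mantissa : 12;  // holds 0..4095
    std::uint16_t exponent : 3;   // holds 0..7
};

// Builds a packed value after checking that both parts fit their bitfields.
inline PackedSmallFloat pack(unsigned mantissa, unsigned exponent) {
    assert(mantissa <= 4095u);
    assert(exponent <= 7u);
    PackedSmallFloat p{};
    p.mantissa = static_cast<std::uint16_t>(mantissa);
    p.exponent = static_cast<std::uint16_t>(exponent);
    return p;
}
```

You would then copy the two fields into and out of the Float whenever you actually compute with it, and only keep the packed form for storage.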

[0] https://github.com/clemensmanert/fas/blob/58f9effbe6c13ab334...

1 comments

Can you go in the other direction? Higher exponent and mantissa than regular float/double?
Sure.

    Float<int64_t, int64_t>
This gives you a signed 64-bit mantissa and a signed 64-bit exponent. Since numeric limits are available for int64_t, Float knows the max and the min values, so you don't have to pass them yourself.
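If you are wondering how a template can pick up its limits on its own, here is a minimal sketch. SketchFloat and its member names are made up for illustration and are not the actual implementation. The sketch just assumes the defaults come from std::numeric_limits.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the real class: the four limit parameters
// default to whatever std::numeric_limits reports for the chosen types.
template <typename M, typename E,
          M MantissaMin = std::numeric_limits<M>::min(),
          M MantissaMax = std::numeric_limits<M>::max(),
          E ExponentMin = std::numeric_limits<E>::min(),
          E ExponentMax = std::numeric_limits<E>::max()>
struct SketchFloat {
    static constexpr M mantissa_min = MantissaMin;
    static constexpr M mantissa_max = MantissaMax;
    static constexpr E exponent_min = ExponentMin;
    static constexpr E exponent_max = ExponentMax;

    M mantissa;
    E exponent;
};

// No limits given: the full int64_t range is used for both parts.
using Wide = SketchFloat<std::int64_t, std::int64_t>;

// Explicit limits: a 12-bit mantissa and a 3-bit exponent.
using Small = SketchFloat<std::uint16_t, std::uint8_t, 0, 4095, 0, 7>;

static_assert(Wide::mantissa_max == std::numeric_limits<std::int64_t>::max(),
              "defaults come from numeric_limits");
static_assert(Small::mantissa_max == 4095, "explicit limit is kept");
```

A custom big integer type would plug into a design like this once std::numeric_limits is specialized for it, which the standard allows for user-defined types.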

You could get even bigger ranges for Float by implementing your own big integer type.