If you are willing to spend some precomputation time to get a good approximation, you can use the Remez algorithm [1]. It computes the minimax polynomial, that is, the polynomial of a given degree that minimizes the maximum error over an interval. It is implemented in the Sollya library, which can target machine precision [2,3]. Sollya has notably been used to implement the Core Math library [4], which provides correctly rounded versions of the math functions found in libc.
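To give an idea of what the Remez exchange iteration does, here is a minimal sketch in Python with NumPy. It is not Sollya's implementation or API: the function name `remez_sketch`, the dense-grid search for extrema and the default parameters are all assumptions made for illustration, and everything runs in plain double precision. It assumes `f` is smooth on `[a, b]` and accepts NumPy arrays.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def remez_sketch(f, a, b, degree, iterations=10, grid_size=4001):
    """Approximate the minimax polynomial of f on [a, b] (illustrative only).

    Returns (coeffs, levelled_error, max_error_on_grid), with coeffs in
    increasing order of power.
    """
    n = degree
    k = np.arange(n + 2)
    # Initial reference: n + 2 Chebyshev extrema mapped to [a, b], ascending.
    x = 0.5 * (a + b) - 0.5 * (b - a) * np.cos(np.pi * k / (n + 1))
    grid = np.linspace(a, b, grid_size)

    for _ in range(iterations):
        # Solve p(x_i) + (-1)^i * E = f(x_i) for the n + 1 coefficients and E.
        A = np.hstack([np.vander(x, n + 1, increasing=True),
                       ((-1.0) ** k)[:, None]])
        sol = np.linalg.solve(A, f(x))
        coeffs, E = sol[:-1], sol[-1]

        # Error of the current polynomial on the dense grid.
        err = f(grid) - P.polyval(grid, coeffs)

        # Cut the grid wherever the error changes sign, then take the
        # point of largest |error| in each piece as the new reference.
        neg = np.signbit(err)
        cuts = np.nonzero(neg[1:] != neg[:-1])[0] + 1
        pieces = np.split(np.arange(grid_size), cuts)
        new_x = np.array([grid[p[np.argmax(np.abs(err[p]))]] for p in pieces])

        # This sketch only handles the regular case of exactly n + 2 extrema.
        if len(new_x) != n + 2:
            break
        x = new_x

    return coeffs, abs(E), np.max(np.abs(err))


if __name__ == "__main__":
    coeffs, levelled, max_err = remez_sketch(np.exp, -1.0, 1.0, 5)
    print("coefficients:", coeffs)
    print("levelled error:", levelled, "max error on grid:", max_err)
```

When the iteration has converged, the levelled error and the maximum error on the grid nearly coincide, which is the equioscillation property that characterizes the minimax polynomial. A production tool additionally has to deal with rounding the coefficients to the target floating-point format and with bounding the error rigorously, which this sketch does not attempt.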