| 1/sqrt(x) is complicated. Imagine instead of computing 1/sqrt(x), imagine instead that you wanted to compute exp_2(-.5 log_2(x)). Also imagine you have an ultra fast way to compute exp_2 and log_2. If you have an ultra fast way to compute exp_2 and log_2, then exp_2(-.5 log_2(x)) is gonna be fast to compute. It turns out you do have an ultra fast way to compute log_2: you bitcast a float to an integer, and then twiddle some bits. The first 8 bits (after the sign bit, which is obviously zero because we're assuming our input is positive) or whatever are the exponent, and the trailing 23 bits are a linear interpolation between 2^n and 2^(n+1) or whatever. exp_2 is the same but in reverse. You can simply convert the integer to floating point, multiply by -.5, then convert back to integer. But multiplying -.5 by x can be applied to a floating point operating directly on its bits, but it's more complicated. You'll need to do some arithmetic, and some magic numbers. So you're bitcasting to an integer, twiddling some bits, twiddling some bits, twiddling some bits, twiddling some bits, and bitcasting to a float. It turns out that all the bit twiddling simplifies if you do all the legwork, but that's beyond the scope of this post. So there you go. You've computed exp_2(-.5 log_2 x). You're done. Now you need to figure out how to apply that knowledge to the inverse square root. It just so happens that 1/sqrt(x) and exp(-.5 log x) are the same function. exp(-.5 log x) = exp(log(x^-.5)) = x^-.5 = 1/sqrt(x). Any function where the hard parts are computing log_2 or exp_2 can be accelerated this way. For instance, x^y is just exp_2(y log_2 x). Note that in fast inverse square root, you're not doing Newton's method on the integer part, you're doing it on the floating point part. Newton's method doesn't need to be done at all, it just makes the final result more accurate. Here's a blog here that gets into the nitty gritty of how and why it works, and a formula to compute the magic numbers: https://h14s.p5r.org/2012/09/0x5f3759df.html |