by lntue 1110 days ago

So in the implementation of `cos_table_*_LERP`, you technically did a 2-step range reduction:

1. Reduce x = x mod 2pi

2. Reduce to index = floor(x / 10^-n), and i - index = 10^n * (x mod 10^-n)

With limited input range and required precision as in the tests, you can combine these 2 range reduction steps:

1. Choose the reduced range as a power of 2 instead of a power of 10 for a cheaper modulus operation, let's say `2^-N = 2^-7`.

2. Avoid the division in `modd(x, CONST_2PI)` by multiplying by `2^N / pi`.

3. Avoid the round trip `double -> int -> double` by using the `floor` function / instruction.

Here is the updated version of `cos_table_*_LERP`, which should have higher throughput and lower latency:

```
double cos_table_128_LERP(double x) {
    x = fabs(x);
    double prod = x * TWO_TO_SEVEN_OVER_PI;
```