by lntue
1110 days ago
So in the implementation of `cos_table_*_LERP`, you technically did a two-step range reduction:

1. Reduce x = x mod 2pi
2. Reduce index = 10^n * (x / 10^-n), and i - index = 10^n * (x mod 10^-n)

With a limited input range and the precision required by the tests, you can combine these two range reduction steps:

1. Choose the reduced range as a power of 2 instead of a power of 10 for a cheaper modulus operation, let's say `2^-N = 2^-7`.
2. Avoid the division in `modd(x, CONST_2PI)` by multiplying by `2^N / pi`.
3. Avoid the round trip `double -> int -> double` by using the `floor` function / instruction.

Here is the updated version of `cos_table_*_LERP`, which should have higher throughput and lower latency:

```
double cos_table_128_LERP(double x) {
    x = fabs(x);  /* cos is even, so only |x| matters */
    double prod = x * TWO_TO_SEVEN_OVER_PI;  /* x * 2^7 / pi */
}
```