by lifthrasiir
1703 days ago
Ugh, you were correct. I did copy and paste your code into my testing framework and it instantly crashed at the time, but it seems I had used the wrong offset for the output. The resulting code was slightly faster (by 2-4%) than my AVX2 code.