by winternewt
779 days ago
Even so, if you wrap it in a function and make sure to document what's happening, I would argue that a version without asm() is preferable. It gives the compiler more leeway for optimization and is easier to read for someone who isn't well-versed in GCC's weird asm syntax. See the generated code for these two alternative implementations: https://godbolt.org/z/3vno6G46j
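For readers who don't want to open the link, here is a minimal sketch of what an asm-free wrapper could look like. It is not the code behind the godbolt link: the function name `mul_u64_wide` and its signature are made up for illustration, and it assumes a compiler that provides the `unsigned __int128` extension (GCC and Clang on 64-bit targets).

```c
#include <stdint.h>

/*
 * Hypothetical helper (name and signature are assumptions, not from the
 * linked code): full 64x64 -> 128-bit unsigned multiply without inline asm.
 * Returns the low 64 bits of the product and stores the high 64 bits in *hi.
 * Relies on the unsigned __int128 extension, so it is not portable ISO C.
 */
static inline uint64_t mul_u64_wide(uint64_t a, uint64_t b, uint64_t *hi)
{
    /* Widen one operand first so the multiply is done in 128 bits. */
    unsigned __int128 product = (unsigned __int128)a * b;

    *hi = (uint64_t)(product >> 64);  /* upper half */
    return (uint64_t)product;         /* lower half */
}
```

Because the multiply is expressed in plain C, the compiler remains free to pick the instruction, inline the call, and fold constants, which is the leeway the asm() version gives up.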
With that fixed, it's the same: https://godbolt.org/z/M571P371K
But after staring at this for a little while, I realized simply reversing the order of the fields in the dword struct will make the result from the multiply instruction already match the way 128-bit structs are returned in registers under the x86_64 calling convention! This is quite a bit better: https://godbolt.org/z/MbfG63vej
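To illustrate the field-order point, here is a rough sketch rather than the actual toy-rsa code. The struct and function names are hypothetical, and it assumes the System V x86_64 ABI plus the GCC/Clang `unsigned __int128` extension. Under that ABI a 16-byte struct of two integers comes back with its first eight bytes in RAX and the next eight in RDX, and the one-operand MUL instruction leaves the low half of its result in RAX and the high half in RDX, so putting the low half first lets the two line up.

```c
#include <stdint.h>

/*
 * Hypothetical layout (field names are assumptions): low half first.
 * On System V x86_64 this struct is returned as lo in RAX, hi in RDX,
 * which is also where MUL places the two halves of its result.
 */
struct dword {
    uint64_t lo;
    uint64_t hi;
};

/* Not static, so a standalone copy is emitted and can be inspected. */
struct dword mul_wide(uint64_t a, uint64_t b)
{
    /* Widen one operand so the multiply is performed in 128 bits. */
    unsigned __int128 product = (unsigned __int128)a * b;
    struct dword r = {
        .lo = (uint64_t)product,
        .hi = (uint64_t)(product >> 64),
    };
    return r;
}
```

With the fields in the opposite order, the compiler would need extra moves to swap the halves into the registers the calling convention expects. Whether it emits exactly a bare multiply here depends on the compiler and flags, so check the generated code rather than taking this sketch on faith.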
Also worth noting the asm makes nicer code on arm64: https://godbolt.org/z/7eas6s9vK
https://github.com/jcalvinowens/toy-rsa/commit/59ef9ea905dbd... https://github.com/jcalvinowens/toy-rsa/commit/ceaa8a4dd0834...