by eigenspace
1830 days ago
> Regarding your performance work, you've nerd sniped me into looking for analytical tricks to speed it up ;) We'll see...

Looking forward to it! These microbenchmarks are very fun to explore.

> Side note: any idea why the python and fortran are exp(i k... while the julia is exp((100 +i) i...)? Is it something I overlooked?

Oh, that's a partially applied edit, I guess. When the author posted on the Julia forum, people pointed out that since the exponent was purely imaginary, it could be sped up even more with cis(...) instead of exp(im * ...). The author then said that exp((100 + im) * ...) was more representative of his actual workflow, and I guess he changed the Julia version in his blog post but not the Python or Fortran versions.
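For anyone wondering why cis(...) helps in the pure-imaginary case: exp(i x) is just cos(x) + i sin(x), so the general complex exponential can be skipped. Here is a rough sketch of that identity in JAX (the `cis` helper below is hand-written for illustration, not a JAX built-in, and the sample points are arbitrary):

```python
import jax.numpy as jnp

def cis(x):
    # Hypothetical helper mirroring Julia's cis: cos(x) + i*sin(x) for real x.
    return jnp.cos(x) + 1j * jnp.sin(x)

# Arbitrary real sample points, chosen only for this illustration.
x = jnp.linspace(0.0, 2.0 * jnp.pi, 8)

via_exp = jnp.exp(1j * x)  # general complex exponential
via_cis = cis(x)           # shortcut that is valid only for a purely imaginary exponent

# The two agree up to floating-point rounding.
agree = bool(jnp.allclose(via_exp, via_cis, atol=1e-5))
```

Once the exponent picks up a real part, as in the exp((100 + im) * ...) version, this shortcut no longer applies directly, which is presumably why the change matters for the benchmark.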
https://colab.research.google.com/drive/1ABrZJlm8pwB6_Sd6ayO...
On my macbook, using XLA's jit in python gave about a 12-15x speedup on CPU over OP's solution, which was pretty cool, but I'm too lazy to figure out how to install and benchmark Julia on my machine. Applying a 12-15x speedup would at least beat the Julia MT solution in OP, and you've got to admit `exp(CONST * sqrt(A**2 + A.T**2))` is a pretty clean way to do it.
Then I ran on whatever GPU colab decided to give me (a P100), and for just adding a decorator, it's a 1000x-1900x speedup (better as n goes up). Hence my current honeymoon period with jax. I love the speed vs readability tradeoff.
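For reference, the "just add a decorator" version looks roughly like the sketch below. This is not copied from the notebook: the wavenumber `K = 100.0`, the grid over [0, 2π], and the name `eval_exp` are my assumptions about OP's kernel, so treat it as an illustration only.

```python
import jax
import jax.numpy as jnp

K = 100.0  # assumed constant; stands in for CONST = i * K in the one-liner above

def eval_exp(n):
    # Column vector of n sample points (assumed grid, not taken from the notebook).
    a = jnp.linspace(0.0, 2.0 * jnp.pi, n).reshape(n, 1)
    # a has shape (n, 1) and a.T has shape (1, n), so the sum broadcasts to (n, n).
    return jnp.exp(1j * K * jnp.sqrt(a**2 + a.T**2))

# n determines the array shape, so it has to be a static argument under jit.
eval_exp_jit = jax.jit(eval_exp, static_argnums=0)

out = eval_exp_jit(64)
out.block_until_ready()  # wait for the async computation before timing or reading
```

When timing this, call it once first so compilation is not counted, and keep the `block_until_ready()` call, since JAX dispatches work asynchronously.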