by stabbles 2764 days ago
Very roughly, without knowing the details: the Julia implementation can probably be tuned to get close to 3 GFLOPS. Nothing in the language stops you from getting above this 100 MFLOPS, whereas in Python 4 MFLOPS might be the best you can get.

Care to share your code and see if it can be improved upon?

1 comment

I think the main performance bottleneck is that I'm adding to a submatrix, which seems to be a big performance hit in basically all high-level languages.
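
For anyone unsure what "adding to a submatrix" means here, a minimal pure-Python sketch follows. The actual code was never posted in this thread, so the function name `add_to_submatrix`, the list-of-lists layout, and the offsets are all assumptions made for illustration, not the original implementation.

```python
def add_to_submatrix(A, B, row0, col0):
    """Add block B into A in place, with B's top-left corner at (row0, col0).

    Hypothetical helper: A and B are plain lists of lists of floats.
    """
    for i, b_row in enumerate(B):
        a_row = A[row0 + i]  # reference to the existing row, not a copy
        for j, b_val in enumerate(b_row):
            a_row[col0 + j] += b_val
    return A


A = [[0.0] * 4 for _ in range(4)]
B = [[1.0, 2.0], [3.0, 4.0]]

add_to_submatrix(A, B, 1, 1)
add_to_submatrix(A, B, 1, 1)  # repeated updates accumulate into the same block
```

The sketch updates each row through a reference instead of slicing the block out, modifying the copy, and writing it back. Whether a hidden copy like that is what costs time in the original code is only a guess.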