Hacker News
by borramakot 1792 days ago
Given that the author mentions multiple cores being available, I'd guess you could use any method, including MPI, to distribute the computation. But whether you used 1 core or 10k cores, it would still be nice to get a 20x speedup on each core from this arithmetic/fixed-size optimization. Since that's the focus of the article, communication technologies feel pretty unrelated.