by Alexander-Barth
107 days ago
Are you using this code for Julia? https://github.com/JuliaParallel/rodinia/tree/master/julia_m... It was last touched 9 years ago, but maybe you have ported it to current standards. I don't think we had multithreading at that time, only multiprocessing. Is your Julia implementation available somewhere? (Sorry if it is in your paper and I missed it.)
I vaguely remember that working with threads used to lead to some additional allocations (compared to the serial code). Maybe this is also biting us here?
|
As far as I know, the code was ported to use @floops, with minor optimisations in addition to that.
I think it's quite possible that it's an allocation issue. That's something we're looking into, although I don't have any specific results for Julia yet.
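
For anyone who wants to check the allocation hypothesis themselves, here is a minimal sketch of how one might compare a serial loop against a threaded one. It is not the benchmark code: `scale_serial!` and `scale_threaded!` are hypothetical toy kernels, and it uses `Threads.@threads` from Base rather than `@floops` so that it runs without extra packages. The exact byte counts will depend on the Julia version and thread count.

```julia
using Base.Threads

# Hypothetical toy kernel: serial in-place loop.
function scale_serial!(out, x)
    for i in eachindex(x)
        out[i] = 2 * x[i]
    end
    return out
end

# Same kernel, with iterations split across the available threads.
function scale_threaded!(out, x)
    @threads for i in eachindex(x)
        out[i] = 2 * x[i]
    end
    return out
end

x = rand(10^6)
out = similar(x)

# Warm up first so compilation is not counted as allocation.
scale_serial!(out, x)
scale_threaded!(out, x)

# @allocated reports the bytes allocated while evaluating the call.
println("threads:  ", nthreads())
println("serial:   ", @allocated(scale_serial!(out, x)), " bytes")
println("threaded: ", @allocated(scale_threaded!(out, x)), " bytes")
```

Run it with, for example, `julia -t 4` to use four threads. If the threaded version reports noticeably more bytes on a real kernel, that would be consistent with the allocation theory, though it would not prove allocations are the bottleneck.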