HN Mirror



	by amkkma 2096 days ago
	The GPU gap only appears when the code is written in the high-level index or loop style. There is little to no gap when it uses array abstractions (broadcast, map, etc.) or is written at a level similar to CUDA C (though with nicer Julia abstractions and syntax): https://juliagpu.org/cuda/ The Julia Lab at MIT is working on making the higher-level codegen faster.
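To make the two fast styles concrete, here is a minimal sketch, assuming CUDA.jl is installed and a CUDA-capable GPU is available. The function names (`saxpy_broadcast!`, `saxpy_kernel!`, `saxpy_cuda!`) and the SAXPY workload are illustrative choices, not taken from the thread. The first version uses broadcasting over GPU arrays; the second is an explicit kernel at roughly the level of CUDA C.

```julia
using CUDA  # assumption: CUDA.jl is installed and a CUDA-capable GPU is present

# Style 1: array abstractions. The dotted expression is a fused broadcast,
# which CUDA.jl runs on the GPU when the arguments are CuArrays.
function saxpy_broadcast!(y, a, x)
    y .= a .* x .+ y
    return y
end

# Style 2: an explicit kernel, similar in level to CUDA C.
# Each thread handles one element; indices are 1-based in Julia.
function saxpy_kernel!(y, a, x)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] = a * x[i] + y[i]
    end
    return nothing  # GPU kernels must not return a value
end

# Hypothetical launch helper: picks enough blocks to cover the array.
function saxpy_cuda!(y, a, x; threads = 256)
    blocks = cld(length(y), threads)
    @cuda threads=threads blocks=blocks saxpy_kernel!(y, a, x)
    return y
end

n  = 1_000
x  = CUDA.fill(1.0f0, n)
y1 = CUDA.fill(2.0f0, n)
y2 = CUDA.fill(2.0f0, n)

saxpy_broadcast!(y1, 3.0f0, x)
saxpy_cuda!(y2, 3.0f0, x)

# Both styles should produce the same result on the host.
@assert Array(y1) == Array(y2)
```

Neither version loops over individual GPU array elements from the host, which is the pattern that tends to be slow.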

1 comment

MiroF 2096 days ago

I guess that makes sense to me. You can just automatically convert the C in BLAS to Julia, and if they're both being converted to LLVM IR by Clang anyway, then I guess it'll be about as fast!


amkkma 2095 days ago

That's not at all what Julia is doing. It's much more sophisticated: it has very low-level intrinsic primitives that compose, it optimizes the IR to make it fast, and then it compiles it to CUDA. These all map to Julia constructs.
