HN Mirror



	by totalperspectiv 416 days ago
	In the coarse-graining code, you use an @parameter for loop. Doesn’t unrolling that lead to a pretty large code size? Or is that less of an issue on GPU? Great write-up! I learned a lot!

1 comment

simon_vtr 416 days ago

It doesn’t. The batch size is just 8. This is a very good trick and is often needed to achieve peak performance in memory-bound kernels. You can check out the equivalent code in CUDA as well :)
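For readers who have not seen the pattern, here is a rough CUDA sketch of what a coarsened kernel with a compile-time batch size can look like. It is not the code from the write-up: the kernel name scale_coarsened, the helper scale_on_gpu, the elementwise scaling operation, and the block size of 256 are all assumptions made up for illustration. Only the batch size of 8 comes from the comment above. Because BATCH is a template parameter, the loops have a known trip count and unroll into eight copies of a short body, which is why code size stays small.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread processes BATCH elements instead of one.
template <int BATCH>
__global__ void scale_coarsened(const float* in, float* out, float factor, int n) {
    // Each block covers blockDim.x * BATCH consecutive elements.
    int base = blockIdx.x * blockDim.x * BATCH + threadIdx.x;
    float vals[BATCH];

    // BATCH is known at compile time, so this loop can be fully unrolled.
    // The loads are independent of each other.
    #pragma unroll
    for (int i = 0; i < BATCH; ++i) {
        int idx = base + i * blockDim.x;  // neighbouring threads read neighbouring addresses
        vals[i] = (idx < n) ? in[idx] : 0.0f;
    }

    // Second unrolled loop: compute and store the results.
    #pragma unroll
    for (int i = 0; i < BATCH; ++i) {
        int idx = base + i * blockDim.x;
        if (idx < n) {
            out[idx] = vals[i] * factor;
        }
    }
}

// Hypothetical host helper. Error checking is omitted to keep the sketch short.
std::vector<float> scale_on_gpu(const std::vector<float>& host_in, float factor) {
    constexpr int BATCH = 8;      // batch size mentioned in the thread
    constexpr int THREADS = 256;  // assumed block size
    int n = static_cast<int>(host_in.size());
    std::vector<float> host_out(n);
    if (n == 0) return host_out;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, host_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Round up so the last, partially filled block is still launched.
    int per_block = THREADS * BATCH;
    int blocks = (n + per_block - 1) / per_block;
    scale_coarsened<BATCH><<<blocks, THREADS>>>(d_in, d_out, factor, n);

    // This copy runs on the same stream, so it waits for the kernel to finish.
    cudaMemcpy(host_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return host_out;
}
```

Calling scale_on_gpu(input, 2.0f) should return the input with every element doubled. Whether 8 is the best batch size for a given kernel and GPU is something to measure rather than assume.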
