


	by boywitharupee 636 days ago
	So these are hand-optimized primitives for a specific model of NVIDIA GPUs? Do you still have to make launch/scheduling decisions to maximize occupancy? How does this approach scale to other target devices with specialized instruction sets and different architectures?