| HN Mirror



	by tbenst 2014 days ago
	SIMD and, e.g., Nvidia's warps are not the same thing. Idk about Apple’s GPU, but for example there is no GPU alternative to the SQRTPD instruction (square root of packed double-precision values). Also, when there is branch divergence across threads, CPUs still do a much better job than GPUs. I'm curious how unified memory may change the flops-per-memory-access ratio at which it makes sense to shift a job from the CPU (better at low ratios) to the GPU (better at high ratios).
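
	To make the ratio point concrete, here is a rough back-of-the-envelope sketch. It is a toy model, not a measurement: the function name `break_even_intensity` and all throughput numbers are made up for illustration, and it assumes the only extra cost of using the GPU is copying the data over a link with a fixed bandwidth.

```python
import math


def break_even_intensity(cpu_flops_per_s, gpu_flops_per_s, transfer_bytes_per_s):
    """Flops per byte above which offloading to the GPU wins (toy model).

    Assumed model (hypothetical, for illustration only):
      CPU time = flops / cpu_flops_per_s
      GPU time = flops / gpu_flops_per_s + bytes / transfer_bytes_per_s
    The GPU wins when flops / bytes exceeds the value returned here.
    """
    # Time saved per flop by running on the GPU instead of the CPU.
    saved_per_flop = 1.0 / cpu_flops_per_s - 1.0 / gpu_flops_per_s
    if saved_per_flop <= 0:
        # The GPU is not faster at compute, so offloading never pays off.
        return math.inf
    return 1.0 / (transfer_bytes_per_s * saved_per_flop)


# Hypothetical throughputs: 100 GFLOP/s CPU, 1 TFLOP/s GPU.
CPU = 1e11
GPU = 1e12

# Discrete GPU with an assumed 10 GB/s copy link.
discrete = break_even_intensity(CPU, GPU, 1e10)

# Unified memory idealized as "no copy at all" (infinite transfer bandwidth).
unified = break_even_intensity(CPU, GPU, math.inf)

print(f"discrete: offload above {discrete:.2f} flops/byte")
print(f"unified:  offload above {unified:.2f} flops/byte")
```

	In this toy model, removing the copy pushes the break-even ratio toward zero, so lower-ratio jobs become worth moving to the GPU. It ignores branch divergence, launch overhead, and memory-bandwidth limits on either side, all of which would push the threshold back up.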