| HN Mirror



	by xenonite 4664 days ago
	On the contrary. Consider that memory is the bottleneck when performing the blur. As I understand it, you would create five instances of the image (50% for the center, and 12.5% each for left, right, top, and bottom). That would make the bottleneck even worse. Additionally, the advantage of computing pixel by pixel is that the shader can run massively in parallel.
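
To make the pixel-by-pixel variant concrete, here is a minimal CPU-side sketch in Python. It assumes the percentages above mean 50% for the center pixel and 12.5% for each of the four direct neighbors, and it assumes edges are clamped; neither detail is spelled out in the thread. The names `blur_pixel` and `blur` are made up for illustration. On a GPU, the body of `blur_pixel` would roughly correspond to the shader, run once per output pixel.

```python
def blur_pixel(img, x, y):
    """5-tap weighted blur for one output pixel.

    Assumed weights: 0.5 for the center, 0.125 for each direct neighbor.
    """
    h, w = len(img), len(img[0])

    def at(px, py):
        # Clamp coordinates to the image border (an assumption, not from the thread).
        px = min(max(px, 0), w - 1)
        py = min(max(py, 0), h - 1)
        return img[py][px]

    return (0.5 * at(x, y)
            + 0.125 * (at(x - 1, y) + at(x + 1, y)
                       + at(x, y - 1) + at(x, y + 1)))


def blur(img):
    # Every output pixel is independent of the others, which is what
    # lets a GPU compute them all in parallel.
    return [[blur_pixel(img, x, y) for x in range(len(img[0]))]
            for y in range(len(img))]
```

Each output pixel reads five input pixels and writes once, so the image is traversed in a single pass rather than five.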

2 comments

jerf 4662 days ago

I think I wasn't clear. I mean: send five textured polygons to the 3D hardware, each with a different darkening and offset of the same texture (which, if nothing else, can be done via lighting, though there are probably easier ways), and let it do the blending en masse. Instead of using shaders, which blind the hardware to what you're doing, this may let it render the polygons much faster on an optimized path. Or it may not. But it's worth a try.
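
As a rough CPU-side model of this idea, the Python sketch below draws the same image five times with an offset and a brightness factor, and sums the results into one buffer, the way additive blending would. The `LAYERS` table, the function name `blend_layers`, the exact weights (0.5 for the unshifted copy, 0.125 for each shifted one), and the edge clamping are all assumptions for illustration, not anything the hardware or a particular API prescribes.

```python
# Hypothetical layer list: (dx, dy, brightness) for each of the five polygons.
LAYERS = [
    (0, 0, 0.5),     # unshifted copy at half brightness
    (-1, 0, 0.125),  # shifted copies at one-eighth brightness each
    (1, 0, 0.125),
    (0, -1, 0.125),
    (0, 1, 0.125),
]


def blend_layers(img):
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]  # stands in for the framebuffer
    for dx, dy, gain in LAYERS:          # one full-image pass per polygon
        for y in range(h):
            for x in range(w):
                # Clamp the sample position to the border (an assumption).
                sx = min(max(x + dx, 0), w - 1)
                sy = min(max(y + dy, 0), h - 1)
                out[y][x] += gain * img[sy][sx]  # additive blend
    return out
```

The arithmetic per pixel is the same weighted sum a shader would compute; what differs is the order of work: five passes that each touch the whole output buffer, instead of one pass per pixel. Whether dedicated blending hardware makes up for that extra traffic is exactly the open question.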


sliverstorm 4663 days ago

I would think it depends on your cache size. Piecewise will be better if the image can't all fit in the cache, but if the image is small enough, everything fits in the cache anyway.

Or do you mean that memory is the bottleneck in the sense of shipping the image to the GPU's memory space?


xenonite 4663 days ago

Yes, it really depends on your cache size.

The problem is that you can't know the cache size in advance. Profiling helps, but it only yields a local optimization for the GPUs you profiled.

If you want your code to run well on any GPU, the pixel-by-pixel approach usually works best. The GPU scheduler can then run as many neighboring threads as possible on its subprocessors. Note that each subprocessor has its own local cache, which is very fast.
