| HN Mirror



	by SturgeonsLaw 3812 days ago
	> there is no workable solution

	Forgive my ignorance (not a graphics programmer), but why can't the drivers simply clear the buffer before handing it off to another application?

2 comments

sj4nz 3811 days ago

Pretty sure it has to do with benchmarks and the cut-throat competitive environment GPU manufacturers exist in, where you cut all corners to proclaim "We're the fastest!"

Not zeroing a buffer cuts a big constant out of overhead. If you know which of the benchmarks will fail if you don't zero the buffer, you code in an "exception" so the benchmark doesn't fail and other applications act wonky. This isn't the "first time" nVidia has been caught doing this, see:

http://www.geek.com/games/is-nvidia-cheating-on-benchmarks-5...

goodplay 3811 days ago

Apparently, AMD also partakes in benchmark-specific "optimizations" [1]. Transparency is why many of us push for open source drivers.

[1] http://www.cdrinfo.com/Sections/News/Details.aspx?NewsId=288...

pandaman 3811 days ago

Both AMD and NVidia drivers have special code paths for different applications. I don't think it's anything sinister since these are mostly fixes for the game bugs and the rest are resolutions for API ambiguities.

To give an example, consider the difference between memcpy() and memmove(). On most systems memcpy() is the same as memmove() in the sense that it works even when the source and destination overlap. Then you decide to optimize memcpy(), and to prevent bugs like this https://bugzilla.redhat.com/show_bug.cgi?id=638477 you need to set a flag USE_MEMMOVE_INSTEAD_MEMCPY for every app that you know calls memcpy() on overlapping regions. You could call this "cheating", or you could be a reasonable person and say something like this https://bugzilla.redhat.com/show_bug.cgi?id=638477#c129 instead.
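To make the overlap problem concrete, here is a minimal sketch in Python. It only simulates the two copy strategies on a bytearray; it is not how glibc or any real memcpy() is implemented, and the function names are made up for illustration.

```python
def forward_copy(buf: bytearray, dst: int, src: int, n: int) -> None:
    """Copy n bytes front to back, with no overlap handling."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def overlap_safe_copy(buf: bytearray, dst: int, src: int, n: int) -> None:
    """memmove()-style copy: pick the direction that never reads a clobbered byte."""
    if dst <= src:
        indices = range(n)              # front to back
    else:
        indices = range(n - 1, -1, -1)  # back to front
    for i in indices:
        buf[dst + i] = buf[src + i]


a = bytearray(b"abcdef")
forward_copy(a, dst=2, src=0, n=4)       # regions [0, 4) and [2, 6) overlap
b = bytearray(b"abcdef")
overlap_safe_copy(b, dst=2, src=0, n=4)
print(a.decode(), b.decode())            # prints: ababab ababcd
```

The forward copy overwrites source bytes before it reads them, so an app that relied on the overlap-tolerant behaviour breaks as soon as the copy strategy changes.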

As for the original question: I am not an expert on the Windows driver model, but I have written some GPU drivers and can tell you that a) memory release is asynchronous, i.e. you cannot reuse the memory until the GPU finishes using it, and b) clearing graphics memory from the CPU over PCIe is slow, and drivers, in general, do not program the GPU on their own. Taking these into account, it seems the driver is not well positioned to do this, and it is a task for the OS instead.

oselhn 3811 days ago

"I don't think it's anything sinister since these are mostly fixes for the game bugs and the rest are resolutions for API ambiguities." I think it is a big problem. It is the same as forcing Intel to change their CPUs to work around bugs in your application.

Fr0styMatt88 3811 days ago

This analogy is pretty spot on, actually. It's a result of a long process of software evolution that went awry and this is a big reason why we need new APIs like Mantle, Vulkan and DirectX 12.

See this fascinating post:

http://www.gamedev.net/topic/666419-what-are-your-opinions-o...

pandaman 3811 days ago

You mean something like this https://en.wikipedia.org/wiki/A20_line ?

qb45 3811 days ago

One could argue that address overflow above 1MB was not a bug, but a feature of the early real-mode CPUs and hence (ab)using it wasn't really a bug either.

Probably even Intel didn't anticipate protected mode with its 24-bit address bus when designing the 8086. 1MB was enough for everyone at the time.
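For anyone who hasn't seen the wraparound in question, a small sketch of the real-mode address arithmetic (segment * 16 + offset). The function name and the `a20_enabled` parameter are made up for illustration; the masking models address line 20 being held low.

```python
A20_BIT = 1 << 20


def physical_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Real-mode address: 16-bit segment shifted left by 4, plus 16-bit offset."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if a20_enabled:
        return linear           # can reach just past 1MB
    return linear & ~A20_BIT    # address line 20 held low: wraps to low memory


print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=False)))  # 0x0
print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=True)))   # 0x100000
```

Software that depended on FFFF:0010 landing at address 0 is what the A20 gate kept working on later CPUs.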

chris_wot 3811 days ago

This probably won't happen, but it seems that games programmers are a large cause of problems for driver writers. Having to work around bugs in games is bad for everyone.

Game studios, IMO, should be made to fix their bugs themselves. They all have patching mechanisms these days, so it's not like it's impossible, or even infeasible.

doikor 3811 days ago

Reducing the number of problems that have to be fixed behind the API is currently being worked on with DX12 and Vulkan. The point is to remove a huge chunk of the abstraction provided by DX/OpenGL, thus forcing the dev to write more sensible code.

Currently the engine developer in graphics programming writes something and in reality has no way of knowing what actually happens on the hardware (the API is just too high level to be able to really know much). From there it is the hardware provider's job to take out their own debugging tools and make sure the correct things happen by having a custom code path in the driver.

washadjeffmad 3811 days ago

It's a bit of the opposite, actually. There was a great article posted here (titled "Why I'm excited for Vulkan") where they explain how the proprietary "tricks" GPU vendors use account for much of the necessity for game-specific driver updates and optimizations. Game patches are to game bugs what driver updates (or "game profiles") are to what?

Lower level APIs like DX12 and Vulkan remove the competitive advantage that vendor-dependent performance creates, so well-coded games can perform consistently with lower overhead across ranges of hardware without having to rely on vendors to patch in the shortcuts through their drivers.

Currently, it's like filming a movie with IMAX specifications, then finding out that at different cinema chains it played with quality aberrations because their projectors didn't truly follow IMAX spec. The chains can fix it, but you're already getting blamed for the movie's issues. However, for a little money, on your next film they offer to work closely with you to ensure it shows the way you intended in their theaters. And no, they can't just tell you how to fix it-- their projection technology is a trade secret.

LoSboccacc 3811 days ago

Eh, given the constraints as you spell them out, it seems a clear at release time, driven by a shader, could work.

pandaman 3811 days ago

This is probably because my explanation is very brief. I don't see how a shader (a program running on the GPU) can detect that the OS has killed a process and initiate a clear.

LoSboccacc 3811 days ago

A shader is a program executed by the GPU that can manipulate memory. The driver can create a fake surface out of the freed memory and run the shader on it (which would avoid the need to zero the memory from the CPU through PCIe).
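A toy simulation of that idea, combined with the constraint that release is asynchronous: a freed block stays on a pending list until a fence says the GPU is done with it, then it is cleared and only afterwards made available again. This is a sketch, not any real driver's design. The class, its methods, and the fence numbering are all hypothetical, and the clear runs on the CPU here purely as a stand-in for a GPU-side clear.

```python
class DeferredClearPool:
    """Toy memory pool: blocks are zeroed after the GPU is done, before reuse."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.blocks = [bytearray(block_size) for _ in range(num_blocks)]
        self.free = list(range(num_blocks))
        self.pending = []          # (block index, fence value) awaiting the GPU
        self.completed_fence = 0   # highest fence value the "GPU" has reached

    def allocate(self) -> int:
        """Hand out a block index; raises IndexError if nothing is free."""
        return self.free.pop()

    def release(self, index: int, fence: int) -> None:
        """The owner is done, but GPU work up to `fence` may still use the block."""
        self.pending.append((index, fence))

    def signal(self, fence: int) -> None:
        """Simulated GPU progress: clear and recycle blocks whose fence has passed."""
        self.completed_fence = max(self.completed_fence, fence)
        still_pending = []
        for index, block_fence in self.pending:
            if block_fence <= self.completed_fence:
                block = self.blocks[index]
                block[:] = bytes(len(block))   # stand-in for a GPU-side clear
                self.free.append(index)
            else:
                still_pending.append((index, block_fence))
        self.pending = still_pending


pool = DeferredClearPool(num_blocks=1, block_size=4)
first = pool.allocate()
pool.blocks[first][:] = b"key!"     # previous owner's data
pool.release(first, fence=5)
pool.signal(4)                      # GPU not finished yet: block stays pending
pool.signal(5)                      # GPU finished: block is cleared, then freed
second = pool.allocate()
print(bytes(pool.blocks[second]))   # prints: b'\x00\x00\x00\x00'
```

The point of the ordering is that the next owner can never observe the old contents, because the block is not on the free list until after the clear.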

interpol_p 3811 days ago

I don't get it, though. How can the reason be due to benchmarks / performance seeking?

The driver simply has to zero the buffer when the new OpenGL / graphics context is established. It's once per application establishing a context, not per-frame (the application is responsible for per-frame buffer clearing and the associated costs). At worst this would lengthen the amount of time a GPU-using application takes to start up and open new viewports, but that hardly seems like it would matter or even register on any benchmarks.

c0n5pir4cy 3811 days ago

The thing is, it's probably not once per application. I'd imagine using multiple frame buffers in an application is actually quite common, and which ones are in use could change quite often while an application is running, especially in complex applications like games. It's probably not enough of a hit to really justify not clearing the buffer, but it's enough to make it noticeable in the benchmark race.

CountSessine 3811 days ago

Across an entire computer system? For all applications? Even games? Sharing data across process boundaries is undesirable, but is something most computer users would accept if the alternative was reduced performance.

Why not just fix this in the browser? The real issue here is that this data isn't just being shared across processes but potentially with websites through malicious WebGL.

Dylan16807 3811 days ago

If you can waste the time on allocating a buffer, you can waste the time on zeroing it. If you're in a hot loop you shouldn't be allocating giant chunks of memory.

CountSessine 3811 days ago

It would be interesting to know how the cost of allocating a new FBO would compare to the cost of zeroing it out. My guess is that the cost of getting into the kernel to do the allocation in the first place would dominate, but by how much would be something neat to measure.

chris_wot 3811 days ago

If it's costing a lot of time to clear a buffer, doesn't that tend to indicate that's something the video card manufacturer should design an enhancement or fix for?