by invokestatic 2613 days ago
I've written real-time game UIs before so I think I have some relevant experience here.

1. It is very possible to write a retained-mode GUI in a graphics API like DirectX or OpenGL. In fact, a retained GUI would typically beat immediate GUIs handily on performance in this context. In immediate mode, the GUI's vertex buffers need to be completely reconstructed from scratch every single frame, which is slow, CPU bound, and cannot be (easily) parallelized. It's like reconstructing the game world every frame -- that would be ludicrous for any non-trivial game.

2. I don't think there would be that much of a difference between the two UI models, since data updates can be dispatched from the event loop. Retained mode would be faster, too, because only the UI components that need updating would be redrawn. This is far faster than updating the entire UI every single frame.

3. As mentioned earlier, immediate mode GUIs are going to be a lot slower than a properly implemented retained-mode GUI. Immediate mode GUIs put most of the work on the CPU instead of offloading most of the work to the GPU like in the retained model.

I think developers that are using immediate mode GUIs are doing so because of their ease of use. I think retained mode is typically harder for a game developer to conceptualize because immediate mode is conceptually similar to a game loop. Also, I don't know of any free & open source retained mode GUIs for DirectX and OpenGL and the like.

Also, DirectX at least (and probably OpenGL) encourages a retained-like model for general rendering. The only way to get decent performance is to re-use vertex buffers between frames and only update them when something changes.
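
To make the "only update when something changes" idea concrete, here is a minimal sketch in Python rather than real DirectX or OpenGL code. The names (Widget, build_vertices, RetainedGui) are hypothetical and a plain list stands in for a GPU vertex buffer; it only illustrates the dirty-flag pattern, not any particular library.

```python
from dataclasses import dataclass


@dataclass
class Widget:
    # Hypothetical widget: just an axis-aligned rectangle.
    x: float
    y: float
    w: float
    h: float

    def quad(self):
        # Two triangles (6 vertices) covering the widget's rectangle.
        x0, y0, x1, y1 = self.x, self.y, self.x + self.w, self.y + self.h
        return [(x0, y0), (x1, y0), (x1, y1), (x0, y0), (x1, y1), (x0, y1)]


def build_vertices(widgets):
    # Full rebuild: roughly what an immediate-mode GUI does every frame.
    verts = []
    for wdg in widgets:
        verts.extend(wdg.quad())
    return verts


class RetainedGui:
    """Keeps the vertex list between frames and rebuilds only when dirty."""

    def __init__(self, widgets):
        self.widgets = widgets
        self.vertices = []
        self.dirty = True
        self.rebuilds = 0  # counts how often the expensive path runs

    def move(self, index, x, y):
        wdg = self.widgets[index]
        if (wdg.x, wdg.y) != (x, y):
            wdg.x, wdg.y = x, y
            self.dirty = True

    def frame(self):
        if self.dirty:
            self.vertices = build_vertices(self.widgets)
            self.rebuilds += 1
            self.dirty = False
        # A real renderer would bind the existing buffer and draw here.
        return self.vertices
```

With this sketch, calling frame() a hundred times on an unchanged UI rebuilds once; an immediate-mode loop would call build_vertices a hundred times.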


I've written real-time game UIs too. I think you underestimate just how ludicrously fast CPUs and GPUs are, and overestimate the complexity of your average GUI. What does your average screen's-worth of GUI consist of, after all? How many widgets are there? I double dare you to tell me that a modern computer or games console can't handle 500 widgets per frame. And I now triple dare you to tell me that your UI designer has put that many damn widgets on one stupid screen in the first place.

(Every UI I worked on actually did redraw everything every frame anyway. It's really not a big deal. Your average GPU can draw a monstrous amount of stuff, something quite ridiculous, and any sensible game UI that's actually usable will struggle to get anywhere near that limit.)

I'm sure many games can get away very well with an immediate mode GUI. I think the question is not can you, but rather should you. My last project used a custom immediate-mode GUI. At the absolute pinnacle of optimization, it was pushing 2,000+ FPS on my machine with something like 3-4k vertices, with heavy texture mapping and anti-aliasing. But the problem was that even with peak optimization, the CPU was spending 15-20% of its time every frame recreating the UI's vertex buffer. Now imagine if we had done a retained-mode GUI instead. That 15-20% overhead would be reduced to near 0% on a typical frame. For nearly any type of game, that kind of savings is really significant. Think of how many more vertices your artists can add, or cool gameplay elements you can add that you didn't have the CPU time available before, and how much better it will run on lower-end hardware.

Why settle for "good-enough" performance?

Performance is not the only consideration, even in most demanding games. IM GUIs simplify UI code a lot, by not having to rely on message systems that make the code harder to follow and less predictable.

As most things in life, it's a matter of trade-offs.

> The CPU was spending 15-20% of its time every frame recreating the UI's vertex buffer.

Not saying it is easy, but it's possible to optimize and cache vertex buffers by using something similar to React's VDOM.

Doesn't your cache just become a limited retained mode with a somewhat hacky, opaque API?
Basically yes. If an immediate mode API is much easier to use, and a retained mode underlying implementation has much better performance, then putting a React-style VDOM layer in-between could get the best of both worlds, depending on how well the middle layer is implemented.
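
A rough sketch of what such a middle layer might look like, in Python for brevity. Everything here (CachedImGui, button, _quad) is a made-up illustration, not the API of React, ImGui, or any real library: the caller still issues widgets every frame, but the vertex list is only rebuilt when the recorded commands differ from the previous frame.

```python
def _quad(x, y, w, h):
    # Two triangles (6 vertices) for one rectangle.
    x0, y0, x1, y1 = x, y, x + w, y + h
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y0), (x1, y1), (x0, y1)]


class CachedImGui:
    """Immediate-mode style API on top of a cached (retained) vertex list."""

    def __init__(self):
        self._commands = []         # commands recorded during this frame
        self._prev_commands = None  # commands from the last rebuilt frame
        self._vertices = []
        self.rebuilds = 0

    def begin_frame(self):
        self._commands = []

    def button(self, label, x, y, w, h):
        # Record the call instead of emitting vertices right away.
        self._commands.append(("button", label, x, y, w, h))

    def end_frame(self):
        # Diff against the previous frame; rebuild only on a change.
        if self._commands != self._prev_commands:
            self._vertices = [v for cmd in self._commands for v in _quad(*cmd[2:])]
            self.rebuilds += 1
            self._prev_commands = self._commands
        return self._vertices
```

Note that the per-frame comparison is still CPU work proportional to the number of widgets, so this trades a full rebuild for a diff; how much that saves depends on the middle layer, as said above.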
> Why settle for "good-enough" performance?

Because it's good enough? Perfection is the enemy of good.

I can't see 2000 FPS; 60 FPS is good enough for me. Similarly, the 20% vertex buffer hit might not matter.

The catch is that games follow the law of "MOAR!": more details, more effects, more postprocessing, more animation, more simulation... this means that the engine will almost always be pushed to max out the hardware long before the designers are happy with what they have.

Heck, I can think of pretty easy ways to improve sound and video quality of games in ways that are quite fundamental, but that would easily max out the best gaming rigs with either sound or graphics alone.

Bottom line: the need to optimize always comes sooner than you would expect.

Perhaps you're right. As you can probably tell from my prior comment, I don't really subscribe to the 'MOAR' philosophy.
I don't understand the problem. 2000Hz = 0.5ms/frame; 20% of this = 0.1ms. That sounds like a great result. Your target frame rate is presumably 60-100Hz, assuming it's a PC game, meaning your frame budget is 10-16ms. If your UI takes 0.1ms, you've got >99% of your budget left.

(Also: 3-4K vertices for UI was about what you could expect to budget for a PS2 or Xbox game! - max throughput for PS2 was something like 250,000 vertices/frame at 60Hz, and this is <2% of that. I struggle to believe this is any kind of an issue for anything modern.)

I should mention that the 2000 Hz figure is with only the UI running -- the full game runs at 150 FPS. So really the CPU time is 6.66ms*0.2 = 1.33 ms just for the UI!

Of course, that's on my beefy machine with an octo-core overclocked CPU and a couple of 1080 Tis. But what about the players who are trying to play on their mobile CPUs with integrated graphics? That margin could be the difference between 45 and 60 FPS.

This is wrong. Just because it uses 20% CPU when rendering at 2000Hz doesn't mean it will consume 20% CPU when running the game too. Running the game alongside the UI is what makes it drop to 150FPS from 2000FPS. So if the UI consumes 20% of the frame time at 2000FPS, that is indeed 0.1ms, and this number will not magically increase when you add the game. So when you're running at 150FPS your frame time is 6.66ms and you're using 0.1ms of it for the UI. That seems pretty good to me, to be honest.
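
For anyone who wants to check the arithmetic, here it is as a few lines of Python. It uses only the figures quoted in this thread (2000 FPS UI-only, 150 FPS full game, the upper 20% share) and assumes the UI's absolute cost stays the same once the game is running, which ignores things like cache pressure; the variable names are just for illustration.

```python
def frame_time_ms(fps):
    # Milliseconds available per frame at a given frame rate.
    return 1000.0 / fps


ui_only_fps = 2000   # UI running alone
full_game_fps = 150  # UI plus the full game
ui_share = 0.20      # upper end of the quoted 15-20%

# Absolute UI cost, measured where the 20% figure was actually observed.
ui_cost_ms = frame_time_ms(ui_only_fps) * ui_share

# Assumption: that absolute cost is unchanged when the game runs too.
game_frame_ms = frame_time_ms(full_game_fps)
ui_fraction_of_game_frame = ui_cost_ms / game_frame_ms

print(f"UI cost: {ui_cost_ms:.2f} ms")
print(f"Game frame: {game_frame_ms:.2f} ms")
print(f"UI share of game frame: {ui_fraction_of_game_frame:.1%}")
```

Under that assumption the UI is about 0.1 ms out of a roughly 6.67 ms frame, i.e. around 1.5% rather than 20%.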
I didn't say that it's not possible to solve these problems in a retained GUI, just that existing ones (Qt, Win32, WPF) have these problems.

And since writing a full retained GUI is not exactly trivial, people just wrote mini-GUIs using immediate mode.

When I talked about retained mode APIs, I was thinking about scene graphs, where you say "addMesh" or "addSphere" and then just call "renderFrame". I'm aware that most game engines implement their own scene graph anyway, but it's game specific, not some generic one provided by the OpenGL/DX API.

Sorry if I misinterpreted your post but I don't think that changes my response much.

You can absolutely use Win32 Forms and WPF (and probably Qt) with a game. They can be overlaid on top of a DirectX or OpenGL window (with a transparent background) -- I've done it before! I don't think it would have any of the downsides you mentioned, either, except that it wouldn't be GPU accelerated or actually rendered inside the graphics context, which is why nobody actually does this in practice.

But that ignores the dozens of GUI middleware libraries specifically designed for games. A cursory Google search reveals that most of these are going to be retained mode. There's a reason for that. Projects like ImGUI appeal mostly to indie devs who don't have the time or resources to write their own GUI library or license some third-party middleware. And it's probably going to be just fine for their use case. But it's definitely not a perfect solution and we definitely shouldn't throw away decades' worth of knowledge and experience like the article is implying.

Qt's QML uses OpenGL and can at least render into an FBO in your existing OpenGL context more or less out of the box. Integrating it with a game render loop deeper should be possible too, but more effort.
Qt is moving to a 3D-API-agnostic backend.
Sure, for Qt 6, so at some not-yet-known point in the next few years (I doubt they'll get that done next year). Whereas both the FBO method and a custom renderer are things you can do today.
With Scaleform being one of the most common ones.
Which GUI middleware libraries do you like?