Hacker News
by moshegramovsky 804 days ago
I write C++ for high-performance Windows desktop applications that are used on a wide variety of form factors. This means that I still optimize a lot of things, such as what happens when a user edits a property in an edit box. How can that edit be minimized? How do I make sure that commands operate in less than a second? How can we hide latency when a long execution time can't be avoided? 99% of the time, optimizations are about doing less, not doing something faster or with lower-level code. You'll never write faster code than code that doesn't run.

I think the GPU could do a lot more work in most applications than it does today. If a process needs to be super fast, when applicable, I write a compute shader. I've written ridiculous compute shaders that do ridiculous things. They are stupidly fast. One time I reduced something from a 15 minute execution time to running hundreds of times per second. And I didn't even do that good of a job with the shader code.

4 comments

Sidenote: MSVC had an optimization option for Windows 98, I think it was /OPT:WIN98 (equivalent to /FILEALIGN:4096). It sets the "file alignment" of sections in the portable executable (PE) file to 4096 bytes (0x1000), i.e. one memory page, which is noticeably more efficient because loading becomes a direct copy operation. It's sacrificing space (padding empty space with thousands of zeros) for time (one copy operation instead of many to shuffle data into memory pages).

(The file alignment still defaults to a 512-byte (0x200) sector size which means the inefficiency is there today even though you may not notice it in isolation, but the "sector"/buffer size has been at least 4096 bytes since 2011. [2])
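To make the space-for-time trade concrete, here is a minimal C++ sketch of the rounding applied to each section's raw data. The helper names and the 5000-byte section size are made up for illustration; only the 512-byte and 4096-byte alignments come from the comment above.

```cpp
#include <cstdint>

// Round `size` up to the next multiple of `alignment`.
// Assumes `alignment` is a power of two (512 and 4096 both are).
constexpr std::uint64_t align_up(std::uint64_t size, std::uint64_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Number of zero bytes appended to a section holding `raw_size` bytes of data.
constexpr std::uint64_t padding_bytes(std::uint64_t raw_size, std::uint64_t alignment) {
    return align_up(raw_size, alignment) - raw_size;
}
```

For a hypothetical 5000-byte section, align_up(5000, 512) is 5120 (120 bytes of padding), while align_up(5000, 4096) is 8192 (3192 bytes of padding). That difference is the disk space being traded for page-sized copies.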

> The /FILEALIGN option can be used to make disk utilization more efficient, or to make page loads from disk faster. [Assuming it matches the page size = 4096 bytes.] [1]

> All hard drive manufacturers committed to shipping new hard drive platforms for desktop and notebook products with the Advanced Format sector formatting [4096-byte or greater] by January 2011. [2]

[1] https://learn.microsoft.com/en-us/cpp/build/reference/fileal...

[2] https://en.wikipedia.org/wiki/Advanced_Format

People like you remind me that I'm still an amateur at all this :)
Tangential but funny story from some years ago: I did the same on a virtual reality app (Qt, Oculus SDK), so we're talking a multi-threaded renderer and tons of background activity. I even spawned a mini helper server to process tasks and did custom hacking (registry, window flags) to override Windows features to make the app snappy. I distinctly remember spending weeks on startup time to get the app to consistently drop the user into a session within 250-500 ms, even from a cold launch, which involved something like a mini page file to capture state, among other things. Then my boss at the time came and said the app was "too fast": users couldn't see the splash screen, so we added a random(1.f, 3.f) second sleep...
LOL, you just can't make all the people happy all of the time, right? I have done similar things with timers, for the same reasons.
> One time I reduced something from a 15 minute execution time to running hundreds of times per second

That's too good a story not to have just a little more detail. Are you willing to share more?

Sure. It was a fairly complicated image processing algorithm, but not necessarily something that you would want to go through a lot of trouble to implement on the GPU. At least not until you're desperate. And I should add, the results are pretty boring. It doesn't even generate anything interesting.

I read the paper that described the algorithm and implemented code on the CPU, thinking, quite stupidly, that it would be fast enough. Not fast, but fast enough. Nope. Performance was utterly horrible on my tiny 128x128 pixel test case. The hoped-for use cases, data sets of 4096x4096 or 10000x10000, were hopeless.

Performance was bad for a few key reasons: the original data was floating point, and it went through several complicated transformations before being quantized to RGBA. The transforms meant that the loops were like two lines total, with an ~800 line inner loop, plus quantization of course (which could not be done until you had the final results). In GLSL there are functions to do all the transformations, and most of them are hyper-optimized, or even have dedicated silicon in many cases. FMA, for example.
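For readers who haven't done the final quantization step before, here is a rough C++ sketch of turning one floating-point pixel into 8-bit RGBA. It assumes the transformed values end up normalized to [0, 1], which the comment above does not state, and the function names are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Map one float channel to 8 bits. Assumes the meaningful range is [0, 1];
// values outside it are clamped, and NaN is mapped to 0.
constexpr std::uint8_t quantize_channel(float v) {
    const float clamped = !(v > 0.0f) ? 0.0f : (v < 1.0f ? v : 1.0f);
    return static_cast<std::uint8_t>(clamped * 255.0f + 0.5f);  // round to nearest
}

// Quantize a whole RGBA pixel. This can only run once the final float
// result for the pixel is known.
constexpr std::array<std::uint8_t, 4> quantize_rgba(const std::array<float, 4>& px) {
    return {quantize_channel(px[0]), quantize_channel(px[1]),
            quantize_channel(px[2]), quantize_channel(px[3])};
}
```

Because each pixel is quantized independently of its neighbours, this step maps naturally onto a per-pixel shader invocation.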

So I wrote some infra to make it possible to use a compute shader to do it. And I use the term 'infra' quite loosely. I configured our application to link to OpenGL and then added support for compute shaders. After a few days of pure hell, I was able to upload a texture, modify the memory with a compute shader, and then download the result. The whole notion of configuring workgroups and local groups was like having my pants set on fire. Especially for someone who had never worked on a GPU before. But OpenGL, it's just a simple C API, right? What could go wrong? There's all these helpful enumerations so the functions will be easy to call. And pixel formats, I know what those are. Color formats? Oh this won't be hard.
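For anyone else hitting the workgroup wall: the sizing part mostly comes down to ceiling division. A minimal sketch, assuming a 16x16 local group size (my assumption, not necessarily what was used here) and hypothetical helper names:

```cpp
#include <cstdint>

// Number of workgroups to dispatch along each image axis.
struct DispatchSize {
    std::uint32_t x;
    std::uint32_t y;
};

// Ceiling division: how many groups of size `d` are needed to cover `n` items.
constexpr std::uint32_t ceil_div(std::uint32_t n, std::uint32_t d) {
    return (n + d - 1) / d;
}

// Workgroup counts that cover a width x height image when each workgroup
// handles a local_x x local_y tile of pixels.
constexpr DispatchSize groups_for_image(std::uint32_t width, std::uint32_t height,
                                        std::uint32_t local_x, std::uint32_t local_y) {
    return DispatchSize{ceil_div(width, local_x), ceil_div(height, local_y)};
}
```

With a 16x16 local size, a 4096x4096 image needs 256x256 workgroups, and those counts are what you would pass to glDispatchCompute(groups.x, groups.y, 1). When the image size is not a multiple of the local size (1000 pixels wide needs 63 groups), the shader also has to skip invocations that fall outside the image.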

But once everything was working, it only took a few more days to make the compute shader work. The hardest part was reconfiguring my brain to stop thinking about the algorithm in terms of traversing the image in a double nested for loop - which is what you would do on the CPU. Actually, the first time I wrote it, that's what I did, in the shader. Yes, I actually did that. And it wasn't fast at all. Oh man, it felt like I was fucked.
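The mental shift can be shown on the CPU. In this sketch the per-pixel body is a trivial stand-in (a gain of 2), not the algorithm from the paper, and all names are hypothetical. The point is that the function computing one pixel knows nothing about the loops around it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One "invocation": computes a single output pixel from its own coordinates.
// The body is a placeholder for the real per-pixel work.
inline float shade_pixel(const std::vector<float>& src, std::size_t width,
                         std::size_t x, std::size_t y) {
    return src[y * width + x] * 2.0f;
}

// CPU emulation of a dispatch. Every (x, y) is independent, so on the GPU
// these two loops disappear and each iteration becomes one shader invocation.
inline std::vector<float> dispatch(const std::vector<float>& src,
                                   std::size_t width, std::size_t height) {
    assert(src.size() == width * height);
    std::vector<float> dst(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            dst[y * width + x] = shade_pixel(src, width, x, y);
        }
    }
    return dst;
}
```

Putting the double loop inside the shader is the equivalent of calling dispatch() from every invocation, which is why that first version was slow.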

But in the end, it could process the 4096x4096 use case at 75 FPS, and even better, when I learned about array textures, I found that it could do even more work in parallel. That's how I got it from 15 minutes to hundreds of frames per second.

Do you happen to have any pointers or recommendations regarding C++ for desktop applications? Especially towards state-management and user-interaction?

I am primarily doing game development and HPC; I am decently familiar with C++, but desktop UI has been a pain point for me so far. Most GUI tools I write in C++ are using ImGui, or they are written in C#.

Desktop UI is painful. It doesn't help that Microsoft seems to have quite a few competing UI frameworks and technologies these days.

1. What is your goal? Do you need to run on Windows and Linux? Qt isn't bad, although I personally think the UI looks a little weird. It is definitely highly opinionated and parts of it are quite strange IMHO. There are probably lots of jobs writing with Qt, which might be a nice side bonus from learning the framework.

2. Do you need a totally custom UI? If so, I would stay with ImGui. You might find Windows UI development extremely frustrating, especially since you have to owner-draw a lot of stuff to get a really custom UI. That can be an extremely difficult and terrible experience, and I don't recommend it to anyone who isn't already an expert at it.

3. State management? You mean like the state of the UI? Is a button pressed? Could you be more specific?

4. User interaction? This is such a broad area. Could you be more specific? Like filtering mouse and keyboard messages? Windows has several APIs for this.

EDITED TO ADD: In my experience, which is significant, either use a GUI framework and operate within its capabilities, or draw everything yourself. In Windows, your life will become exceedingly difficult if you use a framework when you want to do a lot of custom components, or if you want a lot of custom look/feel. If it were me, I would draw everything myself. People don't need the consistency of the Windows UI anymore, provided you stick with common and well-known metaphors like text boxes and property editors, etc.

> I would draw everything myself.

I wouldn't be too fast to recommend this. I have quite a lot of experience with Qt[1], and I managed to get a good look and feel across different operating systems. Yes, you'll need to customize Qt Quick components yourself. But that's easy. Also, Qt is improving its support for native components; they now support native dialogs and file pickers in Qt Quick[2]. Another important thing is that you can always extend your app using open source libraries. For example, qwindowkit allows you to create native frameless windows.[3]

I highly recommend Qt. And related to this post, you can write some extremely responsive and fast applications with it.

[1] https://www.get-plume.com/

[2] https://doc.qt.io/qt-6/qtlabsplatform-index.html

[3] https://github.com/stdware/qwindowkit

You're right. Qt is a good choice if you are willing to work within the bounds of the framework. Qt is definitely not a good choice if you want to make a lot of customizations. If you want to make an app like Spotify, don't use Qt.
Spotify just uses Electron. If you want your app to look like an Electron app, use that. I think it's hideous.
Why exactly? With QML it’s incredibly easy to create custom, animated, and complex UIs.

Also, if I remember correctly there was a time when Spotify was written in Qt.

When you write your own UI, you get used to quickly + easily being able to create custom elements that do/behave exactly as you want and even iterate on those elements to get the best user experience.

Let's say I want to control pan/tilt/zoom/focus/aperture/etc. of a remote camera. If I ask, let's say, an expert in UI framework Z to do it, it will take them 10x longer to create a very painful experience using standard elements with poor input latency, so someone actually trying to set up a camera over- or undershoots everything, but it technically "ticks every box". The path to create a better experience just isn't really there and it is difficult to undo/change all the boilerplate/structure, so version 1 isn't improved for years because it took so long to create the first iteration.

Is there a way to get a Win95 look with Qt on Win10?
I'm building a complex greenfield app in WPF, so your "Desktop UI is painful" comment does not resonate with me at all. They will have to drag me back to web development kicking and screaming. I absolutely love building UIs in this framework. No blockers, no bullshit. So fluid and easy.

Not to mention, the exact same paradigm translates to the other Microsoft desktop/mobile/x-platform frameworks, so if you insist that WPF is "old" or out of date, everything you build can be ported/refactored quite easily to the newest framework(s).

I have built non-trivial desktop apps in every framework except Qt, and you would have to pry WPF from my cold dead hands.

I also use WPF. Like you, I love it. But it still can be painful if you're doing non-trivial things. Sometimes desktop UI is painful. Sometimes it requires a lot of work to deliver a perfect user experience.
The primary goal is building tools that other (typically less tech-savvy) people can use to create various types of content (often video game related) and to semi-automate repetitive tasks. An example here would be our texture selection / marking tool [1]. As a more advanced example, think of an editor found in most modern game engines, like Flax Engine [2].

Windows is the primary target for these tools, but I'd really like them to be also available on Linux to lessen our Windows dependency. I've used Qt in the past, before they introduced Qt Quick. I also heard about complicated licensing changes when they moved to Qt6, which made a lot of KDE devs worry. And stuff like not being able to download Qt without an account; or the framework coming with everything and the kitchen sink nowadays, where I am only interested in desktop UI; no networking, no JavaScript-like scripting language, etc.

I don't want to build a complete UI system from the ground up, but there are certain points where I'd like to be able to customize things, like adding new widgets and having some way to render 2D things without needing a graphics API surface -- think HTML canvas. I feel like ImGui does a pretty good job here, giving you drawing primitives.

For state management I am mostly concerned with the life-time, ownership, and connections between objects. Where other languages, like C#, don't really have to worry about this due to garbage collection, in C++ you typically want things to be more strictly organized. I'd prefer a UI framework to facilitate object life-time management in a streamlined manner. Like, if it opts to use shared_ptr for everything, that's fine, but it also needs to prevent me from accidentally building cycles and provide a way to dump the dependence graph so I can see directly why a certain object is retained (and by whom).

To clarify the difference between C# and C++ here, think about how much more complicated a safe implementation of the observer pattern is in C++, where object life-time is not managed automatically for you. Copy & move semantics only add to the complexity.
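To illustrate the kind of bookkeeping meant here, below is a minimal sketch of an observer connection that holds its observers through weak_ptr, so a destroyed observer is never called and the signal never keeps one alive. Signal, Label, and all member names are hypothetical and not taken from any framework. It is single-threaded, not safe against connecting during emit, and does nothing about the cycle detection or dependency-graph dump mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical signal type: observers are tracked through weak_ptr, so the
// signal never extends an observer's lifetime and never calls a dead one.
template <typename T>
class Signal {
public:
    // `fn` is invoked as fn(Owner&, const T&) for as long as `owner` is alive.
    template <typename Owner, typename F>
    void connect(const std::shared_ptr<Owner>& owner, F fn) {
        assert(owner && "connect() needs a live owner");
        std::weak_ptr<Owner> weak = owner;
        slots_.push_back([weak, fn = std::move(fn)](const T& value) {
            if (auto strong = weak.lock()) {
                fn(*strong, value);
                return true;   // observer still alive, keep the slot
            }
            return false;      // observer destroyed, drop the slot
        });
    }

    // Notify live observers and prune slots whose owner has expired.
    void emit(const T& value) {
        for (auto it = slots_.begin(); it != slots_.end();) {
            if ((*it)(value)) {
                ++it;
            } else {
                it = slots_.erase(it);
            }
        }
    }

    std::size_t slot_count() const { return slots_.size(); }

private:
    std::vector<std::function<bool(const T&)>> slots_;
};

// Hypothetical observer for the usage described below.
struct Label {
    int last_value = 0;
};
```

Usage would look like: create a std::shared_ptr<Label>, call changed.connect(label, [](Label& l, const int& v) { l.last_value = v; }) on a Signal<int>, and emit values. Once the last shared_ptr to the label is reset, the next emit silently removes the slot. Note that if the callback itself captured a shared_ptr to the owner, the signal would keep the owner alive again, which is exactly the kind of accidental retention the comment above wants tooling for.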

State management and user interaction are closely related here, as almost all user interaction results in state modification. Looking at HTML/JS frameworks, some leverage a 2-way data binding approach, where others bind data only 1-way and use events for the other way. In immediate mode GUIs I am updating the underlying state directly -- practically having the view and model tightly coupled. Here I'd like for a framework to be explicit about what is happening, without being too cumbersome to extend a UI with new functionality. E.g. I don't like signals that can be used across the whole code-base, where suddenly a function executes and you have no idea what originally triggered it. On the other hand, having to handle and forward every basic event from one component to its parent isn't an option either. If that makes any sense.

[1] https://github.com/ph3at/image_tool [2] https://flaxengine.com/features/editor/

> I don't want to build a complete UI system from the ground up, but there are certain points where I'd like to be able to customize things, like adding new widgets and having some way to render 2D things without needing a graphics API surface -- think HTML canvas. I feel like ImGui does a pretty good job here, giving you drawing primitives.

Yes, Qt may not be super friendly with this. However, it is perfectly possible. Qt lets you integrate an "external canvas" that you can render with your favourite graphics API (e.g. OpenGL) and integrate it in the Qt Quick scene (or widgets if you prefer). For example, I did this with my note-taking application, Scrivano [1], for handwriting, where the main canvas is a separate OpenGL view that renders content using Skia, while the rest of the UI is standard Qt Quick.

[1] https://scrivanolabs.github.io