Hacker News
by overgard 1754 days ago
I can't help but think about what happened with graphics APIs - the older ones (DX9, OpenGL) held your hand more, and the new ones are super bare metal and not written for the faint of heart. I think most would agree this is better though; for an application developer you can simply use a much higher level library built on top, and for the library/engine developer you have a lot more power and control. The old APIs were in an awkward middle ground of not being particularly easy to use, but also not being nearly low level enough. Nobody was happy.

I think modern OS stacks are in a similar boat, at least Windows and macOS. True low level control is kind of a pain, but the OS libraries are not particularly pleasant or high level enough either. So largely things like Electron or Qt are acting as the de facto OS anyway to varying degrees of success but undeniable levels of resource waste. I wonder if modern languages and tooling and such are advanced enough that the OS should become more primitive and minimal. In a way it feels like we're moving there anyway with web tech becoming the de facto user space and toolkit. That's fine, but god, if it wasn't all built on 50 layers of cruft our machines would be screaming fast.

Smalltalk, if you get a chance to use it, really does act like its own OS in so many ways. It's great as a programmer's OS really, although it doesn't really hit the right notes for average use. Still though, if someone had the urge to build it I could easily see a Smalltalk-based userland built on something like a Linux kernel (with no GNU user space) being a totally viable thing.

7 comments

>I think modern OS stacks are in a similar boat, at least Windows and macOS.

I come from a background as an EEE mainly writing embedded code before somehow stumbling into frontend code (in win32, Cocoa and Qt) at my old job.

From my experience, OS-Level APIs definitely have more in common with Electron or Qt than they do "bare metal". To make the top left pixel red on a bare metal system or RTOS, you simply write `framebuffer[0] = 0x00ff0000`.

For win32 or Cocoa, you need to create a window, alter its flags to ensure it's borderless, change its size and position to 1x1 and [0,0] (or [0, screen_height] on the Mac), strip away the drop shadow, respond to regular paint events and pray like hell that nothing else draws over the top of it.
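
To make the bare-metal half of that comparison concrete, here is a minimal sketch that models a linear framebuffer as a flat array of pixels. It is Python rather than C so it runs anywhere; the 640x480 size, the XRGB8888 layout and the `set_pixel` helper are assumptions for illustration, and on real hardware the array would be a memory-mapped region rather than something you allocate yourself.

```python
from array import array

WIDTH, HEIGHT = 640, 480      # assumed display size
RED = 0x00FF0000              # XRGB8888: 0x00RRGGBB

# Stand-in for the memory-mapped buffer: one unsigned int per pixel, row-major.
framebuffer = array("I", [0] * (WIDTH * HEIGHT))


def set_pixel(fb, x, y, color, width=WIDTH):
    """Write one pixel; (0, 0) is the top-left corner."""
    fb[y * width + x] = color


# Same effect as framebuffer[0] = 0x00ff0000 in the C version.
set_pixel(framebuffer, 0, 0, RED)
```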

>True low level control is kind of a pain

It depends on what you're trying to achieve. If you're trying to get something highly responsive that looks exactly like the UI spec, most libraries tend to make this job harder when the abstractions leak and you're forced to rely on bizarre hacks that make no sense and feel genuinely demoralising to implement.

Dear ImGui, which exposes far more of the low-level internals than any other API, is an absolute joy to build GUIs in and significantly easier than something which "holds your hand" at every step.

Making the top left pixel red just isn't something you should be doing in a GUI environment anyway, it doesn't make any sense. For something like that you should probably be dropping to a full-screen mode like most graphics heavy games anyway, in which case it's pretty easy on modern OSes.

The desktop GUI is just one subsystem in a modern OS, and you don't have to use it if you don't want to. If you do use it then there are tradeoffs, of course, but that's always true of anything you use.

> Making the top left pixel red just isn't something you should be doing in a GUI environment anyway, it doesn't make any sense.

Never cheated in Minesweeper, I presume?

One of the versions had the high scores saved in an ini file (3.1? 95?). My dad was amazed I solved the large one in 15s. Then he found it. Then I learned about the read-only attribute.
Dear ImGui does not support Arabic (right-to-left, glyph shaping) and will not support Arabic (https://github.com/ocornut/imgui/issues/1343). I don't think it supports accessibility either.

As much as I struggle with Qt's APIs (aging janky Widgets, half-baked QML on the desktop, the item models are an impenetrable byzantine enterprisey API that's simultaneously too open-ended, not flexible enough since it imposes its add/remove/move schema on your model, and too buggy), building a GUI from the ground up is not going to give you internationalization or accessibility, and Qt is.

>Dear ImGui does not support Arabic (right-to-left, glyph shaping) and will not support Arabic (https://github.com/ocornut/imgui/issues/1343).

Vanilla ImGui doesn't, but I both wrapped my head around and then rewrote the text renderer in all of 2 days as a junior dev.

This version could support arbitrarily many UTF-8 codepoints with kerning and all and had a negligible performance hit on the machines I tested on. I never upstreamed it (left the company before the product that was using it shipped, actually), but it wouldn't be too hard to reproduce.

That's what I love about exposed, lower-level frameworks like Dear ImGui.

We didn't aim for accessibility, but I'm sure that would have also been semi-trivial for a dedicated dev who understands what's required.

Maybe you want to help this person who is trying to integrate harfbuzz: https://github.com/ocornut/imgui/issues/4227

The other major thing missing in relation to that is input method support.

The problem with those types of APIs and accessibility is that implementing screen reader support basically means making it retained mode in some capacity, because accessibility APIs expect you to export a tree of objects and push state updates.
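
As a rough sketch of what "retained mode in some capacity" ends up meaning: the immediate-mode side still rebuilds its widget list every frame, and a shim keeps the previous frame around and diffs against it to produce the update events. The snapshot format and the `diff_frames` helper are made up for illustration and do not correspond to any real platform accessibility API; the snapshot is flat for brevity, where a real tree would also carry parent/child links.

```python
def diff_frames(previous, current):
    """Turn two per-frame widget snapshots into retained-style update events.

    Each snapshot maps a stable widget id to its (role, label) pair.
    """
    events = []
    for wid, node in current.items():
        if wid not in previous:
            events.append(("added", wid, node))
        elif previous[wid] != node:
            events.append(("changed", wid, node))
    for wid, node in previous.items():
        if wid not in current:
            events.append(("removed", wid, node))
    return events


frame1 = {"ok": ("button", "OK"), "name": ("textbox", "")}
frame2 = {"ok": ("button", "OK"), "name": ("textbox", "Ada"), "hint": ("label", "Saved")}
events = diff_frames(frame1, frame2)
```

Note that this only works if widget ids are stable across frames, which is itself a piece of retained state.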

>Maybe you want to help this person who is trying to integrate harfbuzz: https://github.com/ocornut/imgui/issues/4227

Looks like theirs is far more advanced than mine (mine would just generate a texture per string and cache it, the main problem it solved was being able to display non-ASCII filenames). I don't think there's much I could contribute to them at all.

>accessibility APIs expect you to export a tree of objects and push state updates

Is this an inherent limitation with the tech itself, or just an arbitrary API limitation?

This is the way these platform APIs are designed. You could theoretically make a screen reader that doesn't need that and just asks the app to recompute its state every time, but they don't currently work that way.
IMO, Dear ImGui is great for its original purpose (debugging GUIs, simple demos), but once you start adding complexity, custom layouts and custom widgets on top of it the state tracking gets to be just as bad as retained mode if not worse. Unless there was some best practice that I missed, one has to take great pains to avoid re-running the layout multiple times per frame, whereas a retained mode GUI would be able to handle that easily.
Maybe it's just a personal experience thing? I came across both ways of thinking as a junior dev, so to me retained mode was just a footgun-rich environment.

I could imagine if you had decades of experience using retained-mode, though, then you'd find that way of thinking more natural.

This affects all GUIs, if you have a lot of elements then you have to be careful not to propagate complex updates through the whole tree because it kills your performance. There are various tools to deal with this, one of them is making it retained mode, but I would say basically all of those tools come with more footguns in that the GUI can easily end up in an inconsistent state. That's what you trade for performance.
I've heard that argument before, yet for almost all apps I've used in real life the opposite is true.

Game UIs are (from what I understand) written almost exclusively ImGui-style, with every single widget being fed updated state data every single frame as part of the game loop.
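
For anyone who hasn't seen the style, a minimal sketch of what that looks like, in Python for brevity. The `Ui` class and its `button` method are invented for this example and are not Dear ImGui's API; the point is only that no widget object survives between frames and the label is rebuilt from application state on every call.

```python
class Ui:
    """Tiny immediate-mode context: nothing about a widget is kept across frames."""

    def __init__(self):
        self.draw_list = []
        self.mouse = (0, 0)
        self.clicked = False

    def begin_frame(self, mouse, clicked):
        self.draw_list.clear()
        self.mouse, self.clicked = mouse, clicked

    def button(self, label, x, y, w, h):
        """Emit the button for this frame and report whether it was clicked."""
        self.draw_list.append(("rect", x, y, w, h, label))
        mx, my = self.mouse
        hovered = x <= mx < x + w and y <= my < y + h
        return hovered and self.clicked


def frame(ui, state, mouse, clicked):
    """One pass of the game loop's UI step."""
    ui.begin_frame(mouse, clicked)
    if ui.button(f"Score: {state['score']}", 10, 10, 100, 20):
        state["score"] += 1
```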

Yet, for whatever reason, even crazy-complex UIs like SupCom's seem to have no issue rendering in realtime with an entire 3D game running in the background, while simply resizing the window I'm typing into right now (a single textedit on a static HTML page) creates visible lag and chews up full CPU on a 2015 MacBook.

That's not to say retained-mode GUIs can't be performant (embedded systems rely on retained-mode to have reasonable performance), but on a modern system it seems like the loss in performance caused by inefficiently synchronising state in retained-mode GUIs is several orders of magnitude higher than the losses caused by pushing a couple of hundred textures every frame in an ImGui context.

The thing is those are likely to be simple GUIs. Those are a great use case for immediate mode. But try it with something that has a lot of:

- Wrapping text labels

- Widgets that are similar to CSS flexbox where the size of elements depends on other elements

- List/tree widgets or text editor widgets with large datasets in them

And then you will notice performance tanks. Think of trying to build a text editor that way: if you opened a 5MB text file, you would have to rescan the whole 5MB and compute reflow for it every time.
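
A small sketch of the difference, using Python's `textwrap` as a stand-in for real text layout. The `WrappedLabel` class is hypothetical; it shows the retained-style trick of remembering the last layout and only re-wrapping when the text or the width changes, which is exactly the state a pure immediate-mode API does not keep for you.

```python
import textwrap


class WrappedLabel:
    """Retained-style label: re-wraps only when the text or width changes."""

    def __init__(self, text):
        self._text = text
        self._cached_width = None
        self._lines = []
        self.reflows = 0  # how many times the expensive wrap actually ran

    def set_text(self, text):
        self._text = text
        self._cached_width = None  # force a re-wrap on the next layout

    def layout(self, width):
        if width != self._cached_width:
            self._lines = textwrap.wrap(self._text, width)
            self._cached_width = width
            self.reflows += 1
        return self._lines


label = WrappedLabel("word " * 1000)
for _ in range(60):          # 60 frames at the same width: one reflow
    lines = label.layout(80)
narrow = label.layout(40)    # a resize triggers exactly one more
```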

Game UIs do not generally involve user interaction based on the position of a pointer within the 2D screen geometry. They generally have a very limited event/input system that only takes effect via an overlay. Objects do not come and go from the "model" as a matter of course. They certainly do a LOT of drawing, and they do it very fast, but the challenges of UI design (particularly toolkit design) don't really reside in "how do we draw this stuff really fast".

I do often reflect on how fast game UIs draw, and compare that to the headaches we have in the UI of a digital audio workstation. It generally seems almost inconceivable how we could be so slow in comparison. But then I think about all the ways the user has to interact with the actual UI (not the backend data model), and it starts to seem like a different sort of programming model entirely.

I play games very infrequently, but my anecdata is the opposite: games usually have laggy UI with an inconsistent reaction to input events, noticeable latency, frequent breakage and generally worse user experience (and of course mostly without the ability to resize anything, to compare with browser), even when compared to browsers and Electron apps.
And would you want your laptop taking as much power writing a word document as it takes to run a AAA game?
I love the immediate mode style of developing, but Dear ImGui specifically, at least without third party patches and extensions, has very awkward and limited layout (for example, I'm still waiting for this to get merged[1], which would make it easier), its styling support is relatively minimal (but it can be made to look pretty nice) and its accessibility support is non-existent. I also find the API slightly clunky, although it's not bad once you get used to it.

Overall, I really like Dear ImGui and I love how easy it is to expose data to a UI through it. It's much more pleasant to use than a traditional retained mode widgets API. But Dear ImGui isn't a good general purpose UI (nor was it designed to be one), although I could see the immediate mode API work well for one. I started making a complex editor for a toy engine in it and while it worked ok, the reasons above made me cut it back to just some basic in-engine stuff and the rest is exposed over an optional embedded web server module and a React app as the editor. It's only a toy, but you would use the in-engine ImGui editor to see the scene rendered in-engine, move objects around etc, but you would use the React app to edit object properties, add assets and behaviours and whatnot. It made it much easier to create rich editors for things like behavior trees, node graphs, etc. Although since it's just a toy, I didn't implement a lot of this stuff yet and haven't had time lately to get back to it.

[1] https://github.com/ocornut/imgui/pull/846

Isn't this obvious? I mean if you need that kind of control of the screen then you shouldn't be working inside a WM by definition, and the Toolkits are built for building interfaces in the context of WMs. Right? I definitely don't want your random application to be able to draw on arbitrary parts of the screen and then prevent anyone else from drawing over it. Your use-case might sound simple at first glance, but it's truly odd if you think about how I interact with applications, and how I expect to be able to manage them.
Of course. I'm not saying that using Win32 or Cocoa is wrong (at least for a Desktop OS where you don't want to hand over full control to an app), just that they're closer on the spectrum to Qt than bare metal.
Arguably win32 GUI or Cocoa are not "OS level" APIs; yes some aspects of them are invoked through syscall type interfaces (esp win32) but not consistently and the operating system itself is entirely usable without them. Especially Cocoa. You can boot a Mac into a BSD shell without ever running Finder and the GUI.

Writing into a framebuffer is arguably as much of an abstraction these days as many of these toolkits. In the modern world of GPUs and display controllers a framebuffer itself is already N levels away from the "physical" display hardware. We're not in VGA land anymore. Some aspects of the OS or the OS's interaction with the display driver may present something like a linear framebuffer to you, but underneath that is a world of GPU textures and buffers and display controller abstractions.

When I think of OS-level APIs I think of ioctl, mmap, file descriptors and sockets. I guess that's my Linux bias showing?

Well, that's kind of what I'm driving at with things like the win32 API being a bit too high level and still kind of sucking. An example would be the GDI (the way you draw simple graphics). Like, it's in the worst of both worlds in that it's both slow and annoying to use. Or you can use DirectX and draw things fast, but you need to know a lot more. I think my point is mainly that OSes might be better off just giving you things like DirectX (and mmap and ioctl, etc.) and leaving off things like the GDI.

I know right now Microsoft keeps making like 50000 new UI frameworks and part of me is like, just knock it the fuck off, give me some style guidelines and some super-efficient very low level no-frills APIs and let me just build apps off a library like Qt that abstracts it.

Unless you plan to write an OS personality yourself, there is no way to use Windows without Win32.
But most of the time you don't want to make individual pixels red. You want to blit bitmaps and fill rectangles, preferably without pegging the CPU. Here the simple memory-mapped framebuffer is less helpful, especially if you have specialized hardware to do such work.

Of course this does not apply if all you have is a screen connected via SPI to an MCU's GPIO pins. But you won't expect much graphical sophistication from such a device anyway.

Just start to implement the ~300-400 functions listed here and the underlying systems supporting them. How long could that take, a whole weekend?

https://github.com/torvalds/linux/blob/master/arch/x86/entry...

> To make the top left pixel red on a bare metal system or RTOS, you simply write `framebuffer[0] = 0x00ff0000`.

Really? In my brief experience it was more like sending a bunch of commands over SPI/I2C/UART to some LCD module. Which isn't thaaaat different from how an OS driver communicates with a GPU, only the commands are far, far more complex, so using an OS-level API avoids having to do it by hand.
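
For a sense of what that looks like, here is a sketch of the bytes for a single pixel on a panel controller that follows the MIPI DCS command set (set_column_address 0x2A, set_page_address 0x2B, write_memory_start 0x2C) with 16-bit RGB565 pixels. Which controller you have is an assumption, the `pixel_write_packets` helper is made up for the example, and the actual SPI transfer and data/command pin handling are left out.

```python
def pixel_write_packets(x, y, rgb565):
    """Build (command, data) pairs that select a 1x1 window and write one pixel."""

    def span(v):
        # Start and end coordinate, each as a big-endian 16-bit value.
        return bytes([v >> 8, v & 0xFF, v >> 8, v & 0xFF])

    return [
        (0x2A, span(x)),                              # set_column_address
        (0x2B, span(y)),                              # set_page_address
        (0x2C, bytes([rgb565 >> 8, rgb565 & 0xFF])),  # write_memory_start + pixel
    ]


# Top-left pixel, pure red in RGB565 (0xF800).
packets = pixel_write_packets(0, 0, 0xF800)
```

Three commands and ten data bytes for one pixel, versus a single store into a memory-mapped buffer.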

The OS is all about resource sharing and allowing a minimum common level of interoperability.

Smalltalk can provide a storage abstraction, but to interoperate with other things it needs to be able to save data in a way that can be loaded back into your Java program.

OSes can provide you with the desired abstraction level.

You could stop at the block layer, which many DBs can take advantage of, but that won't help your Java and Smalltalk program work together easily. A filesystem is a common abstraction which should be provided by someone.

Your example about the graphics stack is no different, the problem is that various components of the system are not "shared" in the same way all the time. The actual low-level abstraction of the GPU being provided to you by the kernel has more complexity today than what it had years ago in order to allow virtualization and compartmentalization of graphics memory. This wasn't needed some years ago, as it was assumed X11 was the only program ever going to access the framebuffer, and would allow any single program using GL to trash the entire GPU.

What has been simplified is the user-level facing code, so that you can more precisely control allocation of the resources. Akin to the page-level memory allocation support we always had. This was done to allow /some/ programs (3d engines) to push the hardware to the limits (I'm relatively sad the GL1.x is "deprecated" - I still consider it to be an excellent API for prototyping and what would be 90%+ of applications).

Consider audio instead. OSS always provided pretty much all the hardware features, but allowed no resource sharing. Every single evolution on that front has been an increase in complexity to allow for resource sharing. And it's interesting because the effort went in and out of the kernel a couple of times. Describing the evolution of OSS/ALSA in Linux/FreeBSD would be too long, however we now have very fat daemons sitting in front of the HW just for resource sharing.
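
To be concrete about what "resource sharing" means for audio: at its core, several applications' sample streams have to be combined into one before they reach the hardware. The sketch below shows only that arithmetic, assuming every stream already has the same rate and signed 16-bit format; `mix_s16` is a made-up helper and is not how any particular kernel layer or daemon implements it.

```python
def mix_s16(streams):
    """Sum equal-length lists of signed 16-bit samples, clamping to the valid range."""
    mixed = []
    for samples in zip(*streams):
        total = sum(samples)
        mixed.append(max(-32768, min(32767, total)))
    return mixed


app_a = [1000, -2000, 30000]
app_b = [500, -500, 10000]
out = mix_s16([app_a, app_b])  # the last sample overflows and is clamped
```

The complexity comes from everything around this loop: format and rate conversion, latency, and who gets to own the device.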

> Consider audio instead. OSS always provided pretty much all the hardware features, but allow no resource sharing. Every single evolution on that front has been an increase in complexity to allow for resource sharing.

This is not true. Early evolution of ALSA included lots of work on handling high channel count, memory-mapped devices (and their opposite, a few devices around in about 2000 that require active CPU involvement in data transfer). At some point, more evolution was required to deal with asynchronous interfaces like the USB ones that are now ubiquitous. The Intel HDA "specification" created a huge amount of work to model the topologically indistinct hardware mixers that it allowed. It remains the case that ALSA provides no facilities for resource sharing that are used by more than a handful of obstinate folk, so there has been essentially no evolution within ALSA itself toward that goal (see next para).

> And it's interesting because the effort went in and out of the kernel a couple of times.

Once OSS was dropped, the only resource sharing left on the kernel side in ALSA was the dmix layer. That continues to exist today, but has never worked reliably, which is part of the reason why PulseAudio came into existence.

> Describing the evolution of OSS/ALSA in linux/freebsd would be too long, however we now have very fat daemons sitting in front of the HW just for resource sharing.

I object to JACK being called "a very fat daemon". It's lean and mean and fairly clean (even the version I didn't write). Perhaps you mean PulseAudio, but that does much, much more than just "resource sharing".

As a side-note, Apple did move resource sharing out of the kernel. At some point that was all kernel-side, and at some point, coreaudiod showed up as a user-space daemon that was eventually clearly doing the same thing.

> Consider audio instead. OSS always provided pretty much all the hardware features, but allow no resource sharing. Every single evolution on that front has been an increase in complexity to allow for resource sharing.

None of the API changes were required for resource sharing though so that complexity could (and should) have been contained to the implementation and those applications that actually need new features.

Which is actually the case. In fact, audio is one of the best examples where everything can plug into almost everything. padsp (if I'm not mistaken - not at the prompt right now) does just that.
That's completely incorrect. PulseAudio provides no inter-application audio routing at all (neither did OSS and neither does ALSA).
padsp is a hack that intercepts the open() calls to /dev/dsp - if things were done better opening /dev/dsp would just work without any wrapper, even with statically linked applications.
Nah, Android, macOS and Windows are doing just fine, it is the classical UNIX desktop that cannot get their act together.

The existing issues aren't technical, rather political, like the WinRT crusade that ended up bombing and now we have plenty of GUI toolkits to choose from on Windows.

Swift, Java, Kotlin and the .NET languages are more than high level enough.

MS .NET burned me hard with WPF. After finally looking into it, it more or less was cancelled for greener pastures.

I don't think the newer Windows UI-Frameworks are loved that much and UWP is not really convincing for desktop applications. Just too much hassle for too little gain and the threat of vendor lock-in.

“I don't think the newer Windows UI-Frameworks are loved that much and UWP is not really convincing for desktop applications. Just too much hassle for too little gain and the threat of vendor lock-in.”

I used to be a desktop dev but after the mess with Winforms, WinRT, WPF, Silverlight, UWP and WinUI I would never bet again on an MS GUI framework. Most likely it will be abandoned soon and put into maintenance mode.

I still don’t understand why they constantly crank out new frameworks that do basically the same thing instead of evolving an existing one like WPF. The cost of developing, testing and documenting these frameworks must be enormous.

I briefly looked into UWP when it was being promoted with such fanfare, and the only thing I could see were the gross incompatibilities with WPF, basically requiring you to maintain two GUIs if you wanted to use the app store and support two versions of Windows, let alone Xamarin. Then they basically dumped the app store model they were going to use and started allowing anything on there. It's https://xkcd.com/927/ all over again, except in this case it's all just Microsoft!
I looked at UWP when it came out. The only thing I found was that it was different but I didn’t see any improvements over WPF. I still don’t understand why they wouldn’t make it compatible with WPF.
Years and years ago I was offered a job as an MS tech evangelist for the purpose of pushing Mozilla to use "new" MS technologies coming in what would eventually be Vista. I had a decent reputation in the Mozilla community, and I've always been a good communicator. The problem was some of the things they wanted me to push I knew would never get out the door.

There were three big issues although I can only remember two, which were WPF and WinFS. I said flat out WinFS isn't going to make the cut, the third feature wouldn't, and even if WPF did make it into the final build, Mozilla doesn't have the resources or desire to move Mozilla to WPF, especially since Mozilla is a cross-platform app. "Yes, we know, but utilizing WPF will make it easier to develop on Windows!" I pointed out that while that's good for MS, it provides no benefit to Mozilla's non-Windows users, and Mozilla would never do it. And WinFS isn't working at all, so why would they spend even a moment trying to figure out how the new FS would benefit an application?

I think the third leg was Palladium. "It'll make online banking so secure!" I remember commenting that with all the flak it was getting, it'll have no buy-in from anyone else, and flop.

I don't think compulsive "tell it how it is" people are good for evangelist roles. :D

It is still actively developed, even if it doesn't get the same WinUI love.

https://github.com/dotnet/wpf

But yeah, having ramped down the team while betting the farm on WinRT wasn't the best idea.

Still, just like VB 6 in Windows 10, WPF will be around for decades to come and it isn't like there are revolutionary UI concepts to implement.

Could you expand on this more? I don't know a lot about GUI programming, but I'd be interested in learning what the current failings are within the desktop space.
Specifically for Linux, there is no ubiquitous standard with decent performance and a predictable look such as win32 or Cocoa.

Instead, developers rely on toolkits such as Qt or GTK to ensure compatibility, which are usually either bloated or ugly.

That's true for "Linux", but "Linux" on its own is not a desktop OS. If you target KDE then Qt is the ubiquitous standard with decent performance and a predictable look, same goes for GNOME with GTK. Both of them I would say are less "bloated" than win32 or Cocoa at this point. And I don't have any comment on whether it's "ugly" or not :)
If you think that Qt is more bloated than Cocoa I don't know what to say. That objc runtime is so heavy and slow, good luck making it run on microcontrollers
Not sure what you’re talking about. ObjC runtime is fairly small. The basics of what you need for ObjC is just objc_msgSend, which is basically a fancy hash lookup written in assembly language. If you want ObjC on a microcontroller, you’d want to port this function to your target architecture. There are a few other components you “need” but objc_msgSend is the key one.
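
To illustrate the "fancy hash lookup" part, here is a conceptual model in Python: look the selector up in a per-class cache, fall back to walking the class chain on a miss, then call the implementation. This is only a sketch of the idea; the real objc_msgSend is hand-written assembly and differs in many details, and the `Klass` and `msg_send` names are invented for the example.

```python
class Klass:
    """Minimal class record: a method table, a superclass link and a lookup cache."""

    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}
        self.cache = {}  # selector -> implementation


def msg_send(receiver, selector, *args):
    cls = receiver["isa"]
    imp = cls.cache.get(selector)
    if imp is None:
        # Cache miss: walk up the class chain until some class implements it.
        k = cls
        while k is not None and selector not in k.methods:
            k = k.superclass
        if k is None:
            raise AttributeError(f"unrecognized selector {selector}")
        imp = k.methods[selector]
        cls.cache[selector] = imp
    return imp(receiver, *args)


Base = Klass("Base", methods={"describe": lambda self: "object"})
Rect = Klass("Rect", Base, {"area": lambda self: self["w"] * self["h"]})
obj = {"isa": Rect, "w": 3, "h": 4}
area = msg_send(obj, "area")
description = msg_send(obj, "describe")  # found on Base, then cached on Rect
```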

You’d probably also want some form of malloc(), but that’s completely optional. There’s nothing in Objective C that says you have to allocate memory dynamically, or that you have to do it with malloc.

ObjC runtime has grown somewhat to include more features, but you don’t need all those runtime features if you want to run your code on a microcontroller. Just like you don’t need glibc if you want to run C. There is more than one runtime for Objective C you can choose, just like there is more than one runtime for C.

It's probably exactly that runtime...desk devs have forgotten that embedded is tiny, megs not gigs of memory, so not one but 2-3 orders of magnitude smaller.

You can have all the 'simple' calls you like but if you need to malloc half a gig of ram just to get started...that is heavy and bloated.

Also you say 'architecture' but unless you are talking a battery hungry cellphone you are probably talking 16 or 32 bit proc not 64, which means potentially massive increase in the size of code to be generated since you lose certain instruction sets. I don't know realistically what Obj C uses in the osx cpu architecture but I'm almost certain it's not going to be as simple as just retargeting your compiler...

Not saying it can't be done but take even the new ARM laptops from Apple - that's a significant hardware investment on top of a bunch of software tricks, not just a casual retargeting that can be portably moved to other low power systems.

I don't understand the technical reasons behind it, but for whatever reason doing something like resizing a Window for any non-trivial GUI seems to cause the repaints to drop down to 15-20Hz in Qt and spike the CPU usage massively. The same issue doesn't occur with Win32 or Cocoa.
You probably don't have graphics drivers installed...

IIRC the fancier toolkits (Qt, GTK) do widget acceleration with either mandatory blitting and/or 3D (OpenGL/DirectX) (other frameworks like WPF and Cocoa are notorious for this too).

There are, or used to be, frameworks that were not GPU-dependent.

I've never heard of nor seen the massive drops you describe, so either you are running 20 year old hardware or your graphics aren't on, or you have some other hardware issue...

That's interesting, I have the exact opposite experience, I used to have a MBP dual booted with ArchLinux, and resizing windows was incredibly much smoother on Linux with either Dolphin, Nautilus or Thunar, than with Finder on the Mac which looked like it only drew 1/5th of the frames compared to Linux.
I do understand the technical reasons behind it, and it has nothing to do with bloat. It mostly has to do with mapping a cross-platform (i.e. generic) window abstraction onto a specific OS-provided drawing/event/windowing API.
For "microcontroller class", what's the performance level you're looking at, something akin to a 68030 or so?
> the classical UNIX desktop

CDE? That hasn't existed for decades.

If you actually look at the commits, it's mostly fixing the build and tweaking it to work on modern platforms. My personal opinion after trying it, I would not suggest use of Motif or CDE for anything besides nostalgia, it has some serious usability issues. But if you enjoy it, more power to you.
Who cares about CDE, the classical UNIX desktop, as in the way that BSDs and GNU/Linux keep pushing for the old days with their fragmented stacks.

macOS, although a UNIX, follows the same ideology as NeXTSTEP, where UNIX compatibility was a means to bring software onto the platform, and that was about it; GUI software was meant to take full advantage of the Objective-C frameworks.

My point is that "classical UNIX desktop" doesn't exist.

(Hell, even UNIX itself doesn't exist anymore. Linux and FreeBSD are their own things now.)

It sure does exist, a large majority of Linux and FreeBSD users pretend they are still living in the 80's with vt100 and an improved twm.

As proven by https://news.ycombinator.com/item?id=28437173 making it to the first page today.

Imagine the BLING if you combined that with this:

[X] http://jdebp.info/Softwares/nosh/user-vt-screenshots.html

> with vt100 and an improved twm

That's in no way a "classic Unix desktop". This is hipster nostalgia, like those indie "retro" pixel art videogames.

How do you find time for work? I see your comments shitting on everything that's not Windows pretty much everywhere.
>True low level control is kind of a pain, but the OS libraries are not particularly pleasant or high level enough either

Wait. In that case, library developers or OSS projects can fill the gap. What am I missing here? Qt comes to mind. A high level library for GUI programming. Also, the modern .NET stack.

I always thought the popularity of Electron is because GUI programming is insanely easy on it, and if you have apps that are heavily data dependent, you go with it because the alternatives (Native apps, WPF or Winform apps) do not provide any value addition.

Electron is popular because it makes cross-platform desktop apps easier, or because developers have existing JavaScript code that they want to share between a desktop app and a web app.

I would say that if you just wanted to spit out a Windows application, and you already knew the technologies, WPF or Winforms would probably be better (despite the fact that WinUI is the new hot stuff).

> So largely things like Electron or Qt are acting as the defacto OS anyway to varying degrees of success

That's the natural (and arguably optimal) state of things: lower levels and higher levels.

High-level developers use high-level things, low-level developers use low-level things.

Every once in a while someone gets the bright idea to combine all the layers into one, but finds out that in the real world that just isn't practical.

But we get developers pulling apart and adding layers over and over, driven by a love for "simplicity." Moving forward, or in circles? Hard to say, but probably forward.

You might find the Exokernel approach interesting.

https://en.wikipedia.org/wiki/Exokernel

very interesting take, thanks for sharing