Hacker News new | ask | show | jobs
by diath 89 days ago
Wayland was designed from the point of view of theoretical purists. It's basically "how would a display server work in an ideal world", unfortunately, that design turns out to also be impractical and straight up developer/user hostile.
5 comments

I would at least like to understand the idea of 'pureness' this API tries to aspire to.

It's definitely not Unix-like, since file handles and writes and epoll, and mmap for IPC are nowhere to be found. Instead you have 'objects' with these lifecycle methods that create/release resources (probably committing the design sin of having these for things which should be pure data, like descriptors).

What's with these XML headers? It's UNIX standard stuff, to have a C API for your code, that declares an API for a library, and then a makefile can just consume it. There's a standard way of supplying, finding and consuming them. Even binding generators are probably more comfortable with C headers, than this XML thing

And what's with the callbacks for everything, like screen resolution queries? In Win32, you can do it with a single synchronous API call that returns a struct that has all the info. It's not like you have to touch the disk or network to get this. In cases where you do, you usually have a call that dispatches a message to another window (which you can also dispatch yourself), and you have to listen to the response.

I did some X11 programming as part of work, and its entirely reasonable and conventional compared to this, much more like Win32 (even maybe a bit more pleasant, but I'm no expert on it).

The API sounds awful (and I've had ChatGPT generate me some example programs, and it's somehow even worse than the author describes), and not only that, the requirement of 'everything be an object', with chains and trees of objects being created introduces a huge source of bugs and bookeeping performance overhead on the application side.

Yes, you do have to do something like this with some things under Windows, but the reason for this is that these objects have duplicates in the Windows kernel.

But here it looks like this is just to satisfy the sensibilities of the designer.

Honestly this sounds like the most epic case of NIH syndrome. Like these guys wanted to write their own OS and userland and break with existing conventions.

It is a damn shame that tools like xdotool (automation) and sxhkd (global keybinds) are impossible to recreate under Wayland.
Not impossible, it just needs to be implemented at a different layer. The compositor needs to expose some API for global hotkeys. For example, I found this with ~2 minutes of Googling: https://wayland.app/protocols/hyprland-global-shortcuts-v1
> Not impossible, it just needs to be implemented at a different layer. The compositor needs to expose some API for global hotkeys.

That's a big problem. When things become an optional extension for a compositor, that means you cannot reliably deploy something that depends on it to Wayland.

At this moment, things in the wild are coupling themselves to libwayland-client and in practice ossifying its ABI as a standard no matter what the wayland orgs say about it.

xdg-shell is an optional extension for a compositor and yet you can reliably deploy things that depend on it. You're barking at the wrong tree.
OK so why wasn't this implemented in the first place? For that matter, why does our reinvented wheel have fundamental limitations?
It's not a core protocol's concern and the fact that it's being successfully implemented proves that there are no fundamental limitations there.

I'm not happy with how the collaboration and planning between various parties involved went over years and I do believe that a lot of these adoption pains are fully self-inflicted, but that has absolutely nothing to do with Wayland's technical design.

And that's a problem, now instead of knowing that something just works in the WM you're using, you have to cross-reference a matrix of features for basic tasks across different WMs because the bare minimum features are not found in the core protocols. Nothing is standardized, it's just a pile of different WMs developing their own sets of custom protocols.
like the web and we saw how that went... Oh wait!
I'm not sure what point you thought you were making. The web is a complete mess.

I can't count the number of things that either no longer work or no longer even exist at all because of nothing other than the fragmented and ever-changing nature of the web. Aside from sites and apps that required flash or silverlight, I have several different pieces of expensive aactual hardware that either partially or wholly became unusable because the built in and non-updatable web interface requires an old version of java or activex.

Indeed we do all know how that went. It went like to total dogshit.

> Not impossible, it just needs to be implemented at a different layer.

Do you mean the Window Manager layer?

That sounds like a different way of saying "impossible".

In X11 I can create an automation tool that works regardless of the underlying WM, or even if there isn't an underlying WM.

Can't do that with Wayland.

Currently there isn't really a "window manager" layer. Just like the automation / global hotkeys mentioned above, if you wanted a separate "window manager" your compositor would need to implement a protocol to expose window management. It looks like river is taking a stab at it with their river-window-management-v1 protocol: https://isaacfreund.com/blog/river-window-management/ . If they're successful we might see that protocol adopted by the other compositors.
Not only that. A11y is also quite hard. Tools that are simple to implement thanks to good a11y apis - for example on macos, the tool rcmd or homerow - are super hard to do in Wayland.
Not literally impossible. You just need to write your own composer!
I'm using Sway right now and I have key binds. Not sure why you think that's impossible.
Th point is the decoupling. sxkhd runs irrespective of wm and means your en can optionally choose not to handle key bindings at all. With Wayland you end up depending on whether or not and how your compositor supports it.
How many keybings do you have and how often do you try new window managers? Compromising the security of the whole system just to save you a few `sed`s when writing some config files seems like a bad trade off.
> Compromising the security of the whole system just to save you a few `sed`s when writing some config files seems like a bad trade off.

Those aren't the only two options. There's no need to compromise the entire system for everybody if the Wayland devs would agree to configuration that controls these things.

Then those of us who need stuff to work rgardless of WM would get stuff to work and the rest of the Wayland users can simply go with a WM that suits them.

There's no need to compromise the security of the whole system. A trivially safe option would have been to restrict the ability to acquire global keybindings to specific clients, and require the user to confirm either once or every time (or any other policy you'd prefer). An X server could do that without breaking anything.

This issue is typical of the thinking that went into Wayland: No consideration was made when Wayland was announced of the fact that there were far simpler ways of achieving the same level of security.

Imagine you wrote an application that supports global, unfocused keybinds (OBS is one popular example).

Instead of implementing it one way that works forever with any WM/DE (X11), now you must rely on each individual wayland compositor to implement one or more optional extensions correctly, and constantly deal with bug reports of people that are using unsupported or broken compositors.

Or you could write portable software that doesn't rely on reading global input. OBS you give as an example, and it is a good one. They could simply register a D-Bus handler and provide a second binary that sends messages to the running instance. The software is more general in this way as it allows full programmatic control. A Sway user, for instance, could add

  bindsym $mod+r exec obs-control toggle-recording
to their configuration. What's more, they can do this in response to other system events. A user might wish to change the recording configuration of OBS in response to an application opening, and it now becomes possible to write a script which opens the application and applies the change.

If your disdain for desktop isolation is so great, you needn't even use D-Bus. Registering a simple UNIX socket that accepts commands would work equally well in this case.

What's really desired here is a standard way for programs to expose user-facing commands to the system, which is clearly not within the scope of the specification for a display server. The problem with X11 is that it has for a long time exposed too much unrelated functionality like this to the user, and so many apps have become reliant on this and developers have neglected the creation of portable ways to achieve these objectives. A new specification for display servers that excludes this harmful behaviour is a clear long-term positive.

wdotool exists, and global hotkeys are a thing under wayland, but is desktop dependent. KDE allows it by default, Gnome can be made to do it as well with an extension.
The name is pretty similar, but looks like there is where the similarities end.
What make it worse: there are multiple implementation of composer, with small different in behaviour.

The extra security meant many automation tasks need to be done as extensions on composer level making this even worse

> how would a display server work in an ideal world

When designed by committee.

With conflicting interests.

And Veto Powers.

They looked at caniuse.com and thought "I want that!"