Wayland is a critical technology for the Linux desktop community.
X11 has been a reliable workhorse, but its time is up - simply too much cruft accumulated over the years that's not even used anymore. Yet all of it needs to be continually supported, adding to complexity. No one uses X11 primitives for drawing apart from bitmap functionality - even repainting dirty regions (expose events) often involves sending over a new bitmap and using X11 to draw it. This is very inefficient.
To implement a reasonably fast GUI, X11 has essentially resorted to hacks (extensions). DRI2 (+GLX) is probably the most important of those. AFAIK, it's what almost everything uses for drawing, and it does not work over the network at all. Yes, modern X11 is local only. If you're on a network, it's back to sending those uncompressed bitmaps. Even with all these hacks, X11+DRI2 can't even maintain a tearing-free display. Well, at least DRI3 should fix tearing...
So if modern software needs nothing but a bitmap surface to draw on, why implement and maintain anything else?
Which leaves us with Wayland critics' favorite topic - network transparency (which X11 practically doesn't have either, but unfortunately that does little to stop some loud uninformed people):
Remote display software should use low latency video encoding for essentially the same user experience as working locally. Preferably hardware accelerated. But even in software, you can encode a frame in under 10ms, using for example a subset of H.264. Even if you add network latency, the time to push one frame through the network and the client display hardware's retrace period, you'd still typically end up with a figure well under 50ms. That'd feel essentially local. It'd easily beat X11 over network, VNC, RDP, etc. in latency and thus practical usability. Heck, that'd even beat Xbox 360 or Playstation 3 game display latency when connected to a typical modern TV (70-170ms)! Many TVs do image processing that adds over 50ms of latency before the image is actually displayed. (Note that this processing latency has nothing to do with "pixel response time").
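For anyone who wants to play with that budget, here is a back-of-the-envelope sketch. Only the 10 ms encode figure comes from the comment above; the frame size, link speed, network latency, decode time and refresh rate are assumptions picked for illustration, so plug in your own.

```python
# Rough latency budget for streaming one encoded frame to a remote display.
# Only ENCODE_MS comes from the discussion; every other number below is an
# illustrative assumption, not a measurement.

ENCODE_MS = 10.0           # software encode time per frame (from the comment)
FRAME_KBITS = 100.0        # assumed size of one encoded frame, in kilobits
LINK_MBITS_PER_S = 20.0    # assumed network throughput
NETWORK_ONE_WAY_MS = 10.0  # assumed one-way network latency
DECODE_MS = 5.0            # assumed client-side decode time
REFRESH_HZ = 60.0          # assumed client display refresh rate


def latency_budget_ms():
    """Return the worst-case end-to-end latency in milliseconds."""
    # Time to push one frame through the link: kbit / (Mbit/s) gives ms.
    transmit_ms = FRAME_KBITS / LINK_MBITS_PER_S
    # Worst case: the frame just misses a retrace and waits a full period.
    retrace_ms = 1000.0 / REFRESH_HZ
    return (ENCODE_MS + transmit_ms + NETWORK_ONE_WAY_MS
            + DECODE_MS + retrace_ms)


print(f"worst-case total: {latency_budget_ms():.1f} ms")
```

With these assumed numbers the total stays under the 50 ms mark; a slower link or a bigger frame eats into the margin quickly.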
Why no one I know of has written remote display software that functions this way is beyond me. Anyone except OnLive and Gaikai, that is...
So, let the old X11 horse have its well-earned rest. It's time to move on.
krh is working on a lossless "codec" that uses a rsync-like rolling hash that should provide good compression with no artifacts (as long as the app isn't too skeuy). http://people.freedesktop.org/~krh/rolling-hash/
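For readers who haven't met the rsync trick: the point of a rolling hash is that the checksum of a sliding window can be updated in constant time as the window moves one byte. The sketch below is the classic rsync-style weak checksum, not krh's actual codec; the function names and the 16-bit modulus are my own choices for illustration.

```python
# Sketch of an rsync-style weak rolling checksum over a byte window.
# This is NOT krh's codec; it only illustrates the rolling-hash idea.

MOD = 1 << 16


def weak_checksum(window: bytes):
    """Compute the (a, b) checksum pair of a window from scratch."""
    n = len(window)
    a = sum(window) % MOD
    b = sum((n - i) * byte for i, byte in enumerate(window)) % MOD
    return a, b


def roll(a, b, n, outgoing, incoming):
    """Slide an n-byte window by one: drop `outgoing`, append `incoming`."""
    a = (a - outgoing + incoming) % MOD
    b = (b - n * outgoing + a) % MOD
    return a, b


def rolling_checksums(data: bytes, n: int):
    """Yield the checksum of every n-byte window, O(1) work per step."""
    if len(data) < n:
        return
    a, b = weak_checksum(data[:n])
    yield a, b
    for i in range(n, len(data)):
        a, b = roll(a, b, n, data[i - n], data[i])
        yield a, b
```

A sender can look these checksums up in a table of blocks the receiver already has and transmit only a reference when one matches, which is presumably why unchanged screen regions compress so well.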
...but why? In my experience, using H.264 to encode screen captures gives you extremely high quality video for reasonable CPU and network expenditures. Is the last 1% of video quality really worth it?
For a demo, you can try using FFmpeg + x264 using the x11grab input. It's very easy to adjust the amount of CPU used (by changing the preset) and the bandwidth (by changing the CRF, or by setting the bit rate).
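Something along these lines should work as a starting point. It only builds and prints the command, so you can inspect it before running it. The display name, capture size, frame rate and output file are assumptions for illustration, and option names can differ between FFmpeg versions, so check the x11grab documentation for your build.

```python
import shlex


def x11grab_command(display=":0.0", size="1280x720", fps=30,
                    preset="veryfast", crf=23, output="capture.mkv"):
    """Build an FFmpeg command line that captures an X11 display with x264.

    All defaults are illustrative assumptions; adjust them to your setup.
    """
    return [
        "ffmpeg",
        "-f", "x11grab",          # X11 screen-grab input device
        "-framerate", str(fps),
        "-video_size", size,
        "-i", display,
        "-c:v", "libx264",
        "-preset", preset,        # faster preset = less CPU, more bits
        "-crf", str(crf),         # higher CRF = lower quality, fewer bits
        output,
    ]


# Print the command; pass the list to subprocess.run() to actually capture.
print(shlex.join(x11grab_command()))
print(shlex.join(x11grab_command(preset="ultrafast", crf=30)))
```

The two knobs mentioned above map to `preset` (CPU use) and `crf` (quality and therefore bandwidth).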
I used the NX protocol some time ago. It uses compression and differential messages, makes the X protocol asynchronous, and has an intelligent caching system. It's impressively better than any VNC! I connected to my desktop (on a 2 Mb download, 512 Kb upload connection at the time) from the university, and it was like using my desktop at home!
The downside is that it does not support OpenGL, so 3D software can't be used via NX.
Performant is not an adjective. Performantly is not an adverb. Niether is a word. Use real words when you write. Maybe write, "It didn't work properly / perform adequately," instead of writing nonsense.
> Niether is a word. Use real words when you write.
Aside from the glaring spelling error, this betrays a complete linguistic ignorance coupled with utter historical illiteracy. In short: Every new word, or semantic shift applied to an existing word, was railed against, often by people with no knowledge of linguistics, and many have since become productive parts of our language down to the present day. Some haven't, admittedly, but not because of the people who ranted against them.
For example, is 'television' a word? Can 'nice' be used to mean 'pleasant'? Can 'actor' be applied to a woman? Can a house made of wood ever become 'dilapidated'? (Seriously.)
If you're seriously opposed to 'non-words', the answer to all of those questions is NO.
Isn't that what Airplay Mirror (and the underlying Intel tech doing the encoding on the hardware) does? To the Apple TV, this is merely MPEG4 over HTTP Live Streaming
Can't toolkits use XRender over a network instead of sending huge bitmaps around? Or was XRender support dropped altogether when DRI2 proved so much better?
"XI) “But Eric, if X11 is so terrible why not just make X12 rather than a whole new protocol?” They did, technically anyway: http://www.x.org/wiki/Development/X12
One big problem with keeping it under the “X” umbrella: Anyone who cares about X would have a say in a future version of it. By calling it “Wayland” they avoid that issue. No one cares. Its an unrelated project, they (the developers) can do what THEY want with their future display server, the people who care about X can go to make X12."
Which makes a lot of sense, given that X11 has been around for many, many years, albeit in various revisions. Heck, only earlier today I was looking at an old book of mine on X11 from 1989 (X11R4) and recalling what a learning curve it was back then.
So I can understand the legacy hangover aspect, and that moving to a new design and brand name enabled many shortcuts in the paperwork and other aspects of programming politics.
The 25-year figure for how old X is comes up a lot in this document. The book I have is a first edition, from around the time graphics cards started to become available, albeit expensive (I recall a 10k black-and-white X station by NCR).
I've not touched coding on Wayland (or indeed X, for umpteen years), but I would be interested in how they compare, and indeed how they also compare to coding in a standard desktop GUI.
It's the Lisp Problem: If you call your new language by the same name as an existing one, even if you add qualifiers (such as Common Lisp or similar), people will think it's exactly the same and that no progress has been made.
OTOH, if you make a minor change, but give the result a whole new name (Java vs C++, C# vs Java), people will think the result is meaningfully different and take it as a sign of major progress.
Daniel Stone actually did a talk involving much of the same subject matter called "The Real Story behind Wayland and X". I'd recommend that over this article (which was partially written by the same guy). He's actually a really charismatic speaker.
He is quite a dramatic speaker. I'm surprised because 99% of the technical talks you hear have a speaker mumbling away about minutiae hoping enough of the crowd will fall asleep that they can slip out unnoticed.
I don't know what you are listening to, but 99% is misleading. It's quite dependent on the conference. For example, nearly all the presentations at DConf were good. You must be watching something like this: https://www.youtube.com/watch?v=j0fAyL4Xo2k , I guess?
Wow that's so much worse than any talk I've ever seen. The weird angle and terrible audio aren't helping but the dude needs to stop talking to the screen.
I was sold on Wayland in terms of technology a while ago. Where Wayland is losing people is the question of when it is going to be ready to use. Sure, you can install Wayland now, but there are no applications that target it. Yes, XWayland is meant to solve this problem. But XWayland is not ready yet (or is it?) and Wayland is still "just around the corner." That may be for legitimate reasons, but it's what the article really should address.
In Wayland architecture, the window manager is the display server. To provide wayland-XMonad, you don't need Haskell bindings, you need an implementation of the protocol in Haskell.
This isn't as horrible as it sounds, because unlike X, the display server in Wayland is pretty tiny. The current implementation is like ~10k lines. I expect it to fit in sub 3k lines of Haskell. :)
And this is my primary issue with Wayland. I cannot fathom why anyone would think it's a sound design decision to bundle a hardware-independent component (the window manager) with a hardware-dependent component (the compositor).
This hearkens back to the days of DOS video games – what fun it was to implement support for everyone's sound card! Instead now we'll get to support KMS, Quartz, whatever-the-heck *BSD uses, etc.
Just put a JavaScript (or whatever) interpreter in the window server, and program the window manager locally in that. Then you aren't fucked by synchronization issues. James Gosling did something like that with PostScript many years ago, an alternative to X11, which was then merged with X11, and it was called NeWS (and later X11/NeWS or OpenWindows): http://en.wikipedia.org/wiki/NeWS
I've written several window managers / user interface toolkits / tabbed window frames / menu system in PostScript for NeWS. We even wrote an X11 window manager in PostScript, complete with rooms, scrolling virtual desktop, tabbed windows, pie menus, and seamless integration of X11 and NeWS windows.
The only part that's big enough and self-contained enough to be worth reusing in a Haskell project would probably be the cursor handling. For all the rest, writing the wrappers necessary to deal with them from Haskell would probably be a bigger job than just re-implementing.
The whole point of Wayland/Weston is that the display server is minuscule. All the complex parts are reimplemented in other parts of the stack already, so just let them deal with them and hand over pointers. It's so small that if there ever is a Haskell version of it, I expect it to be formally proven to be bug-free.
My issue with Wayland is that it risks being another KDE 4.0 (or another PulseAudio): the hype-machine was started too early in its development cycle and this is creating expectations that are then frustrated in practice. If you're trying to switch people en masse from such an entrenched technology, you must have a killer app ready from day 1. Is there an app that directly benefits from Wayland so much that it will entice people to switch?
I understand this sort of strategy is difficult in the OSS world, where development is mostly done in the open, but there's a difference between developing and evangelising.
Is there any reason why that approach wouldn't work on X? I read a few of the articles on it and seems like it could be implemented in X if somebody spent the time writing it.
People have spent the time writing it, but it's still too buggy to be deployed. Somehow, it even has a tendency to actually corrupt the SD card. I'm not a GPU driver developer, so take this all with a grain of salt, but my understanding is that X is harder to support because it is such an unwieldy behemoth. Not only do you have to write kernel modules, but you have to write a user-space driver for X (known as a Device Dependent X or DDX driver).
With Wayland, the entire concept of a DDX driver disappears. There is a compositor backend that knows how to talk to the appropriate kernel module, and everything is happy. It's probably this reduced complexity that has made it so much easier to develop an accelerated user interface with Wayland than X.
The problem is that the video system for your computer isn't just a commodity app, it's a huge, interoperative clump of software. It drives the primary interaction device on most computers, and that interaction is a complex beast on every level.
If it was something you could hack up in a basement in a couple of months, we'd already have dozens of choices.
Huzzah! That was pretty much my single biggest disappointment with Wayland. Although I like the visual consistency of server-side decorations, one of my favourite things about X11 is that when an app hangs or otherwise does something weird, the close button in the title-bar still gives me control over the app, I don't have to hope that it's still processing its event loop or bust out some arcane interface like Task Manager and guess what the application's executable is named.
Has anyone ever used those things for more than a single day? I honestly thought the current compositing window managers don't even support that novelty stuff anymore.
In the context that's not the point -- another plus for client-side decorations is that no synchronization is needed between the decorator and client for resizing.
Personally I think client-side decorations are far easier to implement reliably and would welcome them becoming default.
I use wobbling windows because it makes my interaction with the system seem more transparent. Rigid windows feel unnatural, but wobbly windows let me become absorbed by my task.
Back in '94, I was doing a dissertation project in CompSci. I asked people who'd done it for advice. The reply was always the same: "Don't use Wanda and don't use X."
Wanda was a research operating system developed by Cambridge University. X was much the same, but developed at MIT.
Can anyone clarify, please: how is the full OpenGL stack supported in Wayland's case? In X there is libglx, while Wayland relies on OpenGL ES. So how, for example, would games which need full OpenGL work on Wayland?
Wayland/Mesa clients can use EGL to get a full GL context, it's just another attribute when creating the context (like how you select between GL ES1 and GL ES2).
The Apple documentation on Quartz 2d is pretty good (although depending on how familiar you are with the rest of the system, you might need to follow some of the links to read about the other parts):
> “X is Network Transparent.” Wrong. Its not. Core X and DRI-1 were network transparent. No one uses either one. Shared-Memory, DRI-2 and DRI-3000 are NOT network transparent, they do NOT work over the network.
This is not true. X11 is network transparent, poorly designed toolkits like GTK are not. So essentially they are writing a new graphics server for Gnome/KDE. But those have never been good X11 citizens anyway.
> Versioning is handled per client, not per bind. So if your app supports one version of a given extension but your toolkit supports another, you can't predict which version of that extension you will get.
Easy solution: open multiple connections. Resources can be shared between connections. (You missed the actual problem with X11 here, which is its current limit of 256 clients. But that's easily fixed with an "X12".)
> III) Many years ago, someone had an idea “Mechanism, not policy.” What did that mean? It means that X has its own X-Specific drawing API,
That's not at all what that means. "Mechanism, not policy" means the X11 core protocol leaves things like window managers and clipboard selection unspecified. (The ICCCM spec takes care of this.) This is sound design.
> it is its own toolkit like GTK+ or Qt.
Wow, not at all. What do toolkits have to do with drawing primitives?
> It defined the low-level things, such as lines, wide-lines, arcs, circles, rudimentary fonts and other 'building block' pieces that are completely useless on their own.
Don't like it? Ignore it and use GLX. X11 is extensible for a reason.
> Media Coherence. Whats Media Coherence? In its simplest terms... Your browser window? That's a window. Your flash player window on youtube? The flash player itself, displaying the video, is a sub-window. What keeps them in sync? Absolutely nothing. The events are handled separately and right now you just pray that they don't get processed too far apart.
WTF? This is exactly what the Sync extension is for.
> “Please generate me a config file........Please actually USE this config file.” Why?? Eventually fixed by making the X-server only use a config file for overrides and making it know and have SANE defaults / auto-detection.
This is an argument against XFree86, not X11. Nothing about X11 dictates XFree86's strange configuration mechanism.
> Who's ever had problems with multiple monitors under Linux? OR ever had to re-setup all of your monitors after a reboot? All X's fault unless you store it in /etc/X11/xorg.conf.d/50-monitors.conf, then it DOES remember it...but you probably had to write that by hand.
Again, WTF does this have to do with X? If your distro is broken and doesn't ship with a decent configuration tool, that will be a problem with Wayland too.
> The window tree is a complete mess. Under X every input and text box was its own window which was parented by the window above it.
Why? Nothing about X11 dictates you must design programs or toolkits like this. Methinks you're confusing "X" with "Athena toolkit".
> Its a nitpick, but its also a valid concern... Under X11, the global pixel counter is 15bits. Which means, between all of your displays you can only have 32,768 pixels.
Shit, no way to fix that without designing a new windowing system from scratch.
> Everything is a window to X, there's no different window types, its just “A window.”
THIS is what "mechanism, not policy" means. X11 doesn't care about window types by design. The ICCCM and EWMH specs are where these things are – by design! – defined! There are different window types, and your window manager is aware of them, without adding needless complexity to the core protocol.
FINALLY: don't get me wrong, there are things wrong with X. However most of the things mentioned in this article are not in that set.
Also, I don't understand the claim "between all of your displays you can only have 32,768 pixels". A 1920x1080 screen has over 2 million pixels (2073600 to be precise). Can anybody explain that part?
Coordinates in the X protocol are signed 16-bit so a screen can run from 0 to 32,767 along an edge. I agree though, the initial complaint is unclear and may be about something else.
Thanks for clarifying. So the following complaint about DPI does not make sense, given that a 50-inch-wide screen with a resolution of 600 pixels per inch is only 30,000 pixels wide.
OK, I've gone and read the OP's text rather than just the quoted part. It seems he's talking about combining multiple X screens into one logical coordinate space, like I expect one or more of the existing extensions do, e.g. Xinerama.
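To make the numbers in this sub-thread concrete, here is a small sanity check. It assumes, as stated above, that core protocol coordinates are signed 16-bit, so what matters is the extent along each axis rather than the total pixel count. The nine-monitor layout at the end is a made-up example, not something from the article.

```python
# X11 core protocol coordinates are signed 16-bit integers, so positions
# along one axis run from 0 to 32767 (negative values are off-screen).
INT16_MAX = 2**15 - 1


def fits_in_x11_coordinates(width_px, height_px):
    """True if every pixel of a width x height space has a valid coordinate."""
    return width_px - 1 <= INT16_MAX and height_px - 1 <= INT16_MAX


# A single 1920x1080 screen has ~2 million pixels in area, but only the
# per-axis extent matters for the coordinate limit.
print(1920 * 1080)                              # 2073600
print(fits_in_x11_coordinates(1920, 1080))      # True

# The DPI example: a 50-inch-wide panel at 600 pixels per inch.
print(50 * 600)                                 # 30000
print(fits_in_x11_coordinates(50 * 600, 2160))  # True

# Hypothetical layout: nine 3840-pixel-wide monitors side by side in one
# logical coordinate space (e.g. via Xinerama) would overflow the limit.
print(fits_in_x11_coordinates(9 * 3840, 2160))  # False
```

So a single very dense display still fits, and the limit only bites once many large screens are combined into one coordinate space.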
"But that's easily fixed with an "X12"." -- Ha ha ha! And I would like a pony with that brilliant original idea. I wonder why nobody ever suggested that before. And if that doesn't gain any traction, then solve it with "Y".
""Mechanism, not policy" means the X11 core protocol leaves things like window managers and clipboard selection unspecified. (The ICCCM spec takes care of this.) This is sound design."
XRotateProperties() XCirculateWindow() XRotateBuffers() XStoreBytes() XStoreBuffer() XFetchBytes() XFetchBuffer() and everything to do about window borders, including how they metastasized into the Shapes extension.
One of the fundamental design goals of X was to separate the window manager from the window server. "Mechanism, not policy" was the mantra. That is, the X server provided a mechanism for drawing on the screen and managing windows, but did not implement a particular policy for human-computer interaction. While this might have seemed like a good idea at the time (especially if you are in a research community, experimenting with different approaches for solving the human-computer interaction problem), it can create a veritable user interface Tower of Babel.
If you sit down at a friend's Macintosh, with its single mouse button, you can use it with no problems. If you sit down at a friend's Windows box, with two buttons, you can use it, again with no problems. But just try making sense of a friend's X terminal: three buttons, each one programmed a different way to perform a different function on each different day of the week -- and that's before you consider combinations like control-left-button, shift-right-button, control-shift-meta-middle-button, and so on. Things are not much better from the programmer's point of view.
As a result, one of the most amazing pieces of literature to come out of the X Consortium is the "Inter Client Communication Conventions Manual," more fondly known as the "ICCCM", "Ice Cubed," or "I39L" (short for "I, 39 letters, L"). It describes protocols that X clients must use to communicate with each other via the X server, including diverse topics like window management, selections, keyboard and colormap focus, and session management. In short, it tries to cover everything the X designers forgot and tries to fix everything they got wrong. But it was too late -- by the time ICCCM was published, people were already writing window managers and toolkits, so each new version of the ICCCM was forced to bend over backwards to be backward compatible with the mistakes of the past.
The ICCCM is unbelievably dense, it must be followed to the last letter, and it still doesn't work. ICCCM compliance is one of the most complex ordeals of implementing X toolkits, window managers, and even simple applications. It's so difficult, that many of the benefits just aren't worth the hassle of compliance. And when one program doesn't comply, it screws up other programs. This is the reason cut-and-paste never works properly with X (unless you are cutting and pasting straight ASCII text), drag-and-drop locks up the system, colormaps flash wildly and are never installed at the right time, keyboard focus lags behind the cursor, keys go to the wrong window, and deleting a popup window can quit the whole application. If you want to write an interoperable ICCCM compliant application, you have to crossbar test it with every other application, and with all possible window managers, and then plead with the vendors to fix their problems in the next release.
In summary, ICCCM is a technological disaster: a toxic waste dump of broken protocols, backward compatibility nightmares, complex nonsolutions to obsolete nonproblems, a twisted mass of scabs and scar tissue intended to cover up the moral and intellectual depravity of the industry's standard naked emperor.
Using these toolkits is like trying to make a bookshelf out of mashed potatoes.
- Jamie Zawinski
Hey, am I actually arguing with the Don Hopkins? Great!
> If you sit down at a friend's Macintosh, with its single mouse button, you can use it with no problems. If you sit down at a friend's Windows box, with two buttons, you can use it, again with no problems. But just try making sense of a friend's X terminal: three buttons, each one programmed a different way to perform a different function on each different day of the week
Besides the fact that this hasn't been true since the mid '90s, I don't buy this argument, especially from you. Uniformity in interface design begets mediocrity, the same way inbreeding begets genetic disorders. I appreciate X's configurability for the same reason I prefer Linux over Windows: I don't like the Windows GUI model. I find it slow and archaic. I don't want to use a Windows GUI clone when I'm trying to code.
The great thing about X's separation of concerns is that I don't have to: I can run a tiling window manager until my friend comes over, at which point I can switch to GNOME or whatever he prefers, and after he's done, I can switch back.
With Wayland, well, either I have to reboot the entire display compositor (does Wayland do this gracefully?) to replace it with a new one, or I better hope I'm running one of the several compositors that have no doubt been developed which allow pluggable window managers, and that my window manager and his both run on it.
Well, I wrote that stuff in the early-to-mid 90's, but you're right that I'm actually in the camp that advocates being able to totally reconfigure the user interface. I just think there are better ways of doing that than how X works, and X only lets you change the window manager, but not anything else.
The thing is that X's configurability sucks (.Xdefaults files, and the various ad-hoc syntaxes of the various window manager configuration files, which may or may not run m4 over them so you have yet another crazy macro syntax layer).
Since the user interface toolkit and window manager in NeWS was defined as PostScript classes loaded into the window server and shared by all applications, you could subclass or patch or redefine them (usually when NeWS was starting up), and as long as they still supported the same APIs (which is a big limiting factor on what you could do of course -- pie menus emulated the linear menu api, but a linear menu api is not necessarily the best api for pie menus), then ALL applications would pick up the changes.
Of course there were some things you could change on the fly (like the default menu class, except that if apps cached instances of menus instead of recreating them, they would not change), but you could not redefine the window frames on the fly since they already were created, and reparenting the clients into new frames wasn't trivial.
There was a function to change the default menu class used by the window manager, and it knew how to create new menus with the new classes based on the old menus, for the root menu, and also the window frame menus. But any applications that created menus would have to be in on that game. It would certainly be possible to make a MenuClassChanged event to tell them to recreate all their menus, but nobody ever bothered to do that, since it wasn't something that people needed very often, and would require a lot of work for application developers.
You have to weigh how much the flexibility costs in terms of complexity and efficiency against how much people really need the flexibility, and at what time they need it (immediately at any time, app startup time, server startup time).
X pays a very heavy price in terms of complexity to be able to support changing the window manager on the fly (or rather, plugging in different window managers, without providing a turing complete extension language in the server). And it's not because they necessarily badly wanted you to be able to change window managers at any time you felt like it without restarting the server (which was nice, but not something users were clamoring for), but it was because they just didn't want to dictate any "policy" about how window managers should work.
And why just the window manager, and not switch the entire user interface toolkit -- you still want to do that don't you? I'd rather have an architecture where all applications share the same user interface toolkit that runs in one address space local to the server, and have a consistent and customizable user interface, which is how NeWS worked. That's much more important to me than being able to change the user interface on the fly, in my opinion. And it has other nice side-effects like it does not suffer from network race conditions or even context switching overhead, and all the ui code lives in one place and is not duplicated, which mattered a lot in the days before SunOS supported shared libraries -- Sun actually linked all the common SunView apps together into one gigantic monolithic app that would behave like a different app depending on the name it was invoked with on the command line, so the SunView user interface libraries were shared in memory and started up faster, by virtue of the fact that all the standard SunView apps WERE the same app (SunOS did at least support multiple instances of the same app sharing read-only code).
Why paste the link content here? Most people already read it, it's nothing new. I can go find 10 X quotes too, but a link should suffice. Quote pasting is no discussion.
I pasted just the part about ICCCM, because it directly addressed the point I was trying to make, and it doesn't have link anchors. I didn't assume everyone would want to wade through the entire document to get to the part about ICCCM. If most people already read all of that Unix-Haters chapter which I wrote in 1993, then that's news to me. Obviously the guy who repeatedly parroted the 20-some-year-old discredited "Mechanism not Policy" line hadn't read my criticism.
But don't lose any sleep over it: computer networks and storage devices are fast and large enough these days that it's not going to cause any meltdowns, so it's more important to save people time, than to conserve the bandwidth and disk space consumed by a few lines of text. Sorry to have wasted YOUR time that it took you to post a complaint.
No need to apologize, I was just genuinely curious. Considering you wrote it yourself, that makes it more of a discussion in my eyes. Bandwidth was never the issue. No hard feelings :)
I loved the Unix Hater's Handbook. Still have the book, still have the barf bag. Still think it's one of the best books on operating systems design ever written.
You might be right! The core issue of the ICCCM though is not that it's poorly designed in and of itself, but that it is forced to work around certain specific deficiencies in the X11 protocol (and by "specific" I mean "easily fixed in the next protocol bump", as opposed to "warrant an entire redesign"). The ICCCM authors even suggest improvements to X11 to make for a more coherent system, but sadly it seems most of them have not yet (25 yrs on…) been implemented.
My thesis is that the overall design of X (client/server model, mechanism not policy, extensibility) is a sound one, and that the issues cited by Wayland proponents would be better addressed by an evolution of X than a wholesale replacement.
Maybe they should just recognize the fact that the web has won and nobody cares about X any more, and implement a window system that runs in the web browser, and run the web browser in full screen direct access mode as the only top level application. I mean, now that we have HTML and WebGL and Canvas, who needs X11's neanderthal rendering model?
I'm kinda surprised that nobody's written an X server in JavaScript yet (not that it's what I'm proposing above, just that people love to do crazy useless things like that).
I've occasionally wondered as a crazy hack has anyone ever implemented the VNC window system? The API/interface is squirting out a stream as would be seen over the network on VNC? Client has full control?
Much as the simplest way to get cross platform cross browser pixel perfect web page rendering is of course a really big imagemap and skip all that large, slow html and css stuff, the simplest way to implement a windowing system might just be a VNC viewer that can render many simultaneous possibly overlapping streams.
I've done basically this (technically RFB is the protocol VNC uses, but whatever). The team I was on makes an auditing tool that records RFB traffic, transcodes it into MPEG video and also does real-time compositing in RFB. Initially it was just static boxes to obscure stuff, but we started doing messaging as well, until eventually we had a library to write arbitrary strings to the screen. I always wanted to extend it to accept user input as well, but that ventured into X territory too much.
It's worth pointing out that adding new data to an existing RFB stream with any kind of speed is stupid hard. The server has the ability to send one of about 11 types of message, from simple - a bitmap or RLE - to stupid - hextile and tight are popular. The only way we found was to parse every message, update a frame buffer, then re-encode the updated framebuffer. Not to mention all of the clients and servers have slightly different implementations, so even though you should be able to implement a subset of the spec, you end up implementing the whole spec, plus kludges for every popular client and server. Particularly egregious is Jolly's Fast VNC, which is probably the best Mac VNC client, but it actively rejects servers which are within spec to accommodate one particular server the dev targeted.
One key feature of X that Wayland refuses to implement is the concept of remote access. Everything is intended to be local only. For those "strange" users out there who want remote access, the developers' reply has so far been to use VNC.
It's kind of odd that remote access has been shoved to the side in this age of cloudiness and always-connectedness. One would think that there existed better methods of remote access than just copying the image buffer and compressing it.
From page 3, "Some Misconceptions about X and Wayland":
II) “X is Network Transparent.” Wrong. Its not. Core X and DRI-1 were
network transparent. No one uses either one. Shared-Memory, DRI-2 and
DRI-3000 are NOT network transparent, they do NOT work over the
network. Modern day X comes down to synchronous, poorly done VNC. If
it was poorly done, async, VNC then maybe we could make it work. But
its not. Xlib is synchronous (and the movement to XCB is a slow one)
which makes networking a NIGHTMARE.
And later:
V) “Wayland can't do remoting.” Wrong. Wayland should be BETTER than X
at remoting, partially do its [sic] asynchronous-by-design nature. Wayland
remoting will probably look a like a higher-performance version of
VNC, a prototype already exists. And this is without us even giving it
serious thought about how to make it better. We could probably do
better if we tried.
So, Wayland will implement remoting in a way similar to VNC, which will be an improvement over the current state of things in X anyway.
From what I understand, modern X applications run remotely are just painting a region of bytes to be shot over the wire anyway. I'll borrow X terminology for a second here, what functional difference is there between these two approaches:
I get the feeling that the pushback (not necessarily yours, but in general) might be rooted in the visual interface, where an X server will manage and composite remote applications as if they were local, but VNC and RDP (in most use cases) offer a window to the remote system. As I understand it, the former presentation style isn't precluded by Wayland's approach.
Wayland's design for a client and server on the same machine doesn't do much more than pushing a buffer full of painted bytes to the server. So anything fancier than that isn't something they refuse to implement for just a network connection, they don't do it at all.
> From what I understand, modern X applications run remotely are just painting a region of bytes to be shot over the wire anyway.
No, that's incorrect. The X server sends high-level commands over the network and lets the client render the requests. VNC, on the other hand, works exactly as you describe: it simply copies the server's image buffer (a region of bytes), compresses it, and sends it over to be painted on the client's screen.
My general approval of the X method rather than the VNC approach mostly comes down to performance (less traffic, smoother rendering) and style. It's the same reason why I like the concept of vector graphics over pixel formats.
But if Wayland is designed to not have any high-level commands (like "create a window"), and thus only pushes buffers of pixels between server and client, I guess that is that then. Anything beyond copying the image buffer would then be in conflict with Wayland's core design.
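To put rough numbers on the "less traffic" argument, here is a back-of-the-envelope comparison of a drawing command versus the pixels it produces. The 800x600 window, 32-bit pixels and 10 Mbit/s link are assumptions picked for illustration, and the 12 + 8n byte figure is the size of a core-protocol PolyFillRectangle request as I understand it, so treat it as approximate.

```python
def raw_bitmap_bytes(width, height, bytes_per_pixel=4):
    """Size of an uncompressed bitmap covering the given area."""
    return width * height * bytes_per_pixel


def fill_rect_request_bytes(num_rects=1):
    """Approximate size of one fill-rectangles drawing request:
    a 12-byte header plus 8 bytes (x, y, width, height) per rectangle."""
    return 12 + 8 * num_rects


def transfer_seconds(num_bytes, link_mbit_per_s):
    """Time to push num_bytes over a link, ignoring protocol overhead."""
    return num_bytes * 8 / (link_mbit_per_s * 1_000_000)


bitmap = raw_bitmap_bytes(800, 600)    # 1,920,000 bytes
command = fill_rect_request_bytes(1)   # 20 bytes
print(bitmap // command)               # bitmap is 96,000x larger
print(transfer_seconds(bitmap, 10))    # about 1.5 s on a 10 Mbit/s link
```

Of course this only holds when the content can actually be described by a command. A photo or a rendered web page has no short description, so it gets sent as pixels either way.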
The X way of doing window updates became impractical with common graphics-heavy applications like web browsers. I could already feel this in the mid-90s, when opening a browser window on a 19" NCD X-Terminal took several seconds over 10 Mbit/s Ethernet (noticeable because most other X client applications at that time used drawing primitives and appeared instantly).
For modern applications (graphics- and video-heavy), VNC/RFB seems like a much more suitable protocol, although I haven't really kept up with the latest X extensions that probably cover some of these applications.
Personally, I liked the X architecture more (despite X's inherent complexity) and expected it to become more popular with cloud-hosted applications etc., but as applications and GUI toolkits evolved in an unsuitable way (i.e. not using X's primitives), it's probably time to let go. Or to build a proper GUI toolkit first...
User -> Interface -> X Server -> Network -> X Client
The huge functional difference (in terms of what you describe - no idea if that's how Wayland intermediary whatever will actually work) is that with X you do not need an X server running or even configured or even installed on the system the client application is running on - only on the system it is displaying to. This may or may not be significant, depending on how lightweight and portable and easily (and flexibly) configurable the Wayland server winds up being.
But really, as nice as X11 networking is, it's mostly just as easy to use VNC, NX, RDP, whatever.
If that means a faster, lighter, better display server, go ahead, I say.
You shouldn't believe everything you read. The tone of the article is very suspicious: the author will admit nothing bad about Wayland. In particular you might want to examine the claim that VNC style remoting is better than what X does, or that X remoting is a bad VNC.
In actuality VNC is equivalent to the worst-case option X has in terms of rendering. If you wanted something else it's just a bad joke.
Any comment on the fact that now ALL modern X rendering is worst-case? Why bother having all that useless complexity (have you ever read how the SHAPE extension deals with window borders?)?
X's worst-case isn't ever the whole picture. Even if most applications do a lot of drawing in the worst-case mode that doesn't rob them of the richness of communicating more useful detail as well. Even when you're "just" getting a big picture you're still getting something much more than a picture; it's a real window and it's really connected to your X server in a meaningful way. VNC has a real window featuring real pictures of real windows that are really not connected to you.
X is by all means a Rube Goldberg machine and nobody's contesting that. That doesn't mean throwing the baby out with the bath water is the better way to solve the dirty baby problem, even if it is easier—who needs soap when you can just eliminate the baby?
My understanding is that X can run as a client on top of Wayland and thus you "strange" users can run remote X applications.
"One would think that there existed better methods of remote access than just copying the image buffer and compressing it."
Web applications fit the bill nicely.
It's usually snappier, because the client has some insight in what it's displaying and so it can accelerate stuff here and there; it also allows for root-less mode, where individual remote windows appear as normal windows to the local OS, rather than being forced to live in a monolithic "remote desktop" window. VNC is basically like a movie player: the local OS doesn't really know what the movie is displaying (in conceptual terms), and it cannot interoperate with it in any meaningful way.
The problem with the X approach is that client and server both do a lot of duplicated work ("oh, you're drawing a window, lemme put a nice border on it for you!" "Er, actually I was going to draw a different border, sorry." "Oh, ok, fine, I'll do what you say") and use "standards" protocols that have been hacked to death over the last 20+ years. It's very inefficient (although somehow it still feels faster than VNC in many cases, don't ask me why) and very hard for developers working on graphic subsystems (toolkits, window managers etc).
"oh, you're drawing a window, lemme put a nice border on it for you!" -- That is tragically one of the worst, most useless, and most rarely used examples you could come up with for something the X server is capable of doing for you (at the expense of a lot of complexity -- read the X SHAPE extension documentation about how it supports shaped window borders) ... as long as you like black-and-white tiled 1-bit-deep pixmap borders.
The point of the example was to show the duplication of efforts between X client and X server, which is actually in lots of places -- drawing borders, drawing backgrounds etc -- in a simple way. I'm not an X developer, I was just an unfortunate user of XFree86 and then Xorg, then I gave up and bought a Mac.
>I was just an unfortunate user of XFree86 and then Xorg, then I gave up and bought a Mac.
Same here. I am able to get work done on an X11-based desktop, but the experience is definitely more aggravating than working on OS X. (OS X is slower than X11 + Linux and seems to have a weakness relative to Linux in which file IO slows down processes that aren't even doing IO, but that's not as aggravating as what X11 does.)
And that, for instance, means that your simulation running on a huge CPU cluster that does not even have a GPU can run interactive graphics on your desktop GPU.
I am not sure whether that still is a big advantage nowadays, as the typical cluster is fast enough to do OpenGL in software, but 20 years or so ago, this was a big issue, as that huge cluster might have been a 4 CPU, 2 GB, 100 MHz machine, if you were lucky.
I'm told that sending the native code is smaller than sending pictures of the desktop - meaningless on your home network, but it has ramifications on a busy network.
X11 has been a reliable workhorse, but its time is up - simply too much cruft accumulated over the years that's not even used anymore. Yet all of it needs to be continually supported, adding to complexity. No one uses X11 primitives for drawing apart from bitmap functionality - even repainting dirty regions (expose events) often involves sending over a new bitmap and using X11 to draw it. This is very inefficient.
To implement a reasonably fast GUI, X11 has essentially resorted to hacks (extensions). DRI2 (+GLX) is probably the most important of those. AFAIK, it's what almost everything uses for drawing, and does not work over network at all. Yes, modern X11 is local only. If you're on a network, it's back sending those uncompressed bitmaps. Even with all these hacks, X11+DRI2 can't even maintain tearing free display. Well, at least DRI3 should fix tearing...
So if none of modern software needs nothing but a bitmap surface to draw on, why implement and maintain anything else?
Which leaves us with Wayland criticizers' favorite topic - network transparency (which X11 practically doesn't have either, but unfortunately that does little to stop some loud uninformed people):
Remote display software should use low latency video encoding for essentially same user experience as working locally. Preferably hardware accelerated. But even with software, you can encode a frame under 10ms, using for example a subset of h.264. Even if you added network latency, time for one frame network throughput and client display hardware retrace period, you'd still typically end up with a figure well under 50ms. That'd feel essentially local. It'd beat easily X11 over network, VNC, RDP, etc. in latency and thus practical usability. Heck, that'd even beat Xbox 360 or Playstation 3 game display latency when connected to a typical modern TV (70-170ms)! Many TVs do image processing that adds over 50ms of latency before image is actually displayed. (Note that this processing latency has nothing to do with "pixel response time").
Why no one I know of has written remote display software that functions this way is beyond me. Anyone except OnLive and Gaikai, that is...
So, let the old X11 horse have its well-earned rest. It's time to move on.
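The "well under 50ms" estimate above is easy to sanity-check by adding up the stages it lists. Only the 10 ms encode time comes from the comment; the 5 ms network latency, 50 KB compressed frame, 100 Mbit/s link and 60 Hz display are assumed numbers, and decode time on the client is not counted.

```python
def frame_transfer_ms(frame_bytes, link_mbit_per_s):
    """Time to push one compressed frame through the link, in milliseconds."""
    return frame_bytes * 8 / (link_mbit_per_s * 1_000_000) * 1000


def latency_budget_ms(encode_ms, network_ms, frame_bytes, link_mbit_per_s, refresh_hz):
    """Sum of the stages named in the comment: encode, network latency,
    one frame of network throughput, and one display retrace period."""
    retrace_ms = 1000 / refresh_hz
    return encode_ms + network_ms + frame_transfer_ms(frame_bytes, link_mbit_per_s) + retrace_ms


total = latency_budget_ms(
    encode_ms=10,         # from the comment: software encode in under 10 ms
    network_ms=5,         # assumed LAN/metro latency
    frame_bytes=50_000,   # assumed size of one compressed frame
    link_mbit_per_s=100,  # assumed link speed
    refresh_hz=60,        # assumed client display refresh rate
)
print(f"{total:.1f} ms")  # roughly 36 ms with these assumptions
```

With these inputs the retrace period (about 16.7 ms) is the biggest single term, so the budget stays under 50 ms even if the network latency were doubled.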