Hacker News
by DaiPlusPlus 1869 days ago
> I imagine using canvas gave them a lot more flexibility in addition to being more performant.

I'm perplexed because I don't expect canvas rendering to be faster - or necessarily more flexible - because the web is document-first: HTML and CSS were/are all built around describing and styling textual content, and computer program source code files are invariably all textual content files. Browsers all have heavily-optimized fast-paths written in native code for rendering the DOM to the screen with the full flexibility of all of CSS's styling features, so applications switching to canvas rendering will first have to contend with reimplementing at least the subset of CSS that they're using for their editor - and it has to run as JavaScript (or WASM?) - and I just don't understand how that could possibly be faster than letting the DOM do its thing.

I appreciate that DOM+CSS rendering is not designed-around monospaced text editing or with specific support for typical text-editor and IDE features which do indeed throw a wrench into the works[1], but I think a much better approach would be to carve-out the cases where the current DOM and rendering model is insufficient or inappropriate for those specific applications' purposes and find a way to solve those problems without resorting to canvas rendering.

That said, is this change because Google wants to use Flutter for a single codebase for Google Docs that would work across iOS, Android, and the web? Flutter does have an HTML+DOM+CSS rendering mode, but it's horrible (literally thousands of empty <div> elements in their hello-world example...)

[1] e.g. an HTML/DOM document is strictly a unidirectional acyclic tree structure, and CSS selectors are also strictly forwards-only (e.g. you cannot have an HTML element that spans other elements, you cannot isolate individual text characters, you cannot select a descendant element to style based on its subsequent siblings, or ancestor's subsequent siblings), and the render-state of a document is also strictly derived from the DOM and so does not allow for any feedback loops unless you start to use scripts, which means you can't select elements to style based on their computed styles (unlike, for example, WPF+XAML, where you can bind any property to another property - something I think XAML implements horribly...), and I appreciate this makes certain kinds of UI/UX work difficult (if not impossible in some cases), but in the use-case of an editor I just don't see these as being show-stopper issues.

6 comments

>I'm perplexed because I don't expect canvas rendering to be faster

...yet it is. Really.

Even though DOM paths are heavily optimized, they are extremely flexible, and that flexibility puts a ceiling on the performance optimizations that are possible. In a context like a word processor, precision matters more than it does on your regular website (and it has to hold across browsers!) so you end up implementing little hacks everywhere, pushing half a pixel here and another 1.5 pixels there.

A purpose built engine that writes directly to the framebuffer of a canvas without dealing with legacy cruft has the potential to be a lot faster - if you know what you are doing. Google has no shortage of devs who know what they are doing so here we are.

They aren't that optimized: this small team changed Chromium's DOM to have better cache utilization and more coherent access patterns with data-oriented/SoA (struct-of-arrays) layouts and got a 6X speedup in some animation use cases:

https://meetingcpp.com/mcpp/slides/2018/Data-oriented%20desi...
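
For anyone unfamiliar with the AoS vs SoA distinction, here's a rough sketch of the idea. This is not Chromium's code (that work was in C++ inside the engine); the `Anim`, `tickAoS` and `AnimSoA` names are made up for illustration, and in JavaScript the cache effects are only loosely comparable to native code.

```typescript
// Hypothetical illustration only; this is not taken from Chromium.

// Array-of-structs: every animation is its own heap object, so a per-frame
// update loop may hop between objects scattered around memory.
interface Anim {
  start: number;    // start time in ms
  duration: number; // length in ms
  from: number;     // value at t = 0
  to: number;       // value at t = 1
  value: number;    // current interpolated value
}

function tickAoS(anims: Anim[], now: number): void {
  for (const a of anims) {
    // Clamp progress to [0, 1], then interpolate linearly.
    const t = Math.min(Math.max((now - a.start) / a.duration, 0), 1);
    a.value = a.from + (a.to - a.from) * t;
  }
}

// Struct-of-arrays: one contiguous typed array per field, so the same loop
// walks memory sequentially.
class AnimSoA {
  readonly count: number;
  readonly start: Float64Array;
  readonly duration: Float64Array;
  readonly from: Float64Array;
  readonly to: Float64Array;
  readonly value: Float64Array;

  constructor(count: number) {
    this.count = count;
    this.start = new Float64Array(count);
    this.duration = new Float64Array(count);
    this.from = new Float64Array(count);
    this.to = new Float64Array(count);
    this.value = new Float64Array(count);
  }

  tick(now: number): void {
    for (let i = 0; i < this.count; i++) {
      const t = Math.min(Math.max((now - this.start[i]) / this.duration[i], 0), 1);
      this.value[i] = this.from[i] + (this.to[i] - this.from[i]) * t;
    }
  }
}
```

Both versions compute the same thing; the only difference is how the data is laid out, which is the whole point of the data-oriented approach.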

> Google has no shortage of devs who know what they are doing so here we are.

They also have no shortage of devs who advance crazy ideas that somehow gain adoption... like starting a new general-purpose programming language in 2007 without generics or a package manager.

The web is (or at least was) document-first, yes, but Google Docs is an extremely heavily-featured WYSIWYG word processing and desktop publishing application that happens to be distributed on the web (in addition to other platforms). The fact that you're (sometimes) using Google Docs to generate a simple document that could easily be represented with simple HTML does not imply that Google Docs itself is a natural candidate for being implemented with simple web APIs like DOM.

Now, I think if the contentEditable API were significantly more robust and consistent across browsers, it could have been viable to build extremely complex WYSIWYG editors using the DOM. Most of the popular rich text editor libraries for the web are essentially compatibility layers around the contentEditable API that attempt to normalize its behavior across browsers and present a more robust API to the developer. These libraries are popular and do work pretty well, but based on my experience with them it's no surprise that an app as popular and extensive as Google Docs would constantly bump into the limitations of this approach. (My impression is that Google Docs never used contentEditable and instead wrote their own layout and editing engine that manually rendered out DOM, and they're now changing that to render out to canvas.)

> My impression is that Google Docs never used contentEditable and instead wrote their own layout and editing engine that manually rendered out DOM, and they're now changing that to render out to canvas.

Back before Google owned Google Docs, it was a non-Google company and website called Writely, and their website was basically a document-hosting system tied to a fairly stock `contentEditable` editor.

This was around 2005 - back when every web-application development client would insist that users have WYSIWYG/rich-text editors - of course they had no idea how WYSIANLWYG (What you see is absolutely nothing like what you'll get) those WYSIWYG editors were.

> HTML and CSS were/are all built-around describing and styling textual content, and computer program source code files are invariably all textual content files.

HTML and CSS are fairly well optimized, but dynamic HTML and the DOM were an afterthought. If you could throw out a lot of the guarantees about DOM behavior, you could make a much faster browser, but you'd also break the web.

I could see how this could be faster.

At the end of the day, after the browser does all of its highly optimized processing of the dom, html, and CSS, it is issuing drawing commands that are the same as the ones you make on canvas. Canvas skips the in-between steps.

If you're in a situation where you know you want this text at this location on the page, it may be simpler to just draw what you want versus trying to arrange a DOM that will cause the browser to draw what you want. Especially if you're already doing pagination, at which point you're already doing the text breaking and layout anyway, and you're just trying to tell the browser in a high level language to give you the same low-level results that you already have in hand.
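
To illustrate what "already have the results in hand" means, here's a minimal sketch, assuming a greedy word-wrap and left-aligned lines. It is not how Google Docs does it; `TextSink`, `PlacedLine`, `layoutParagraph` and `drawLines` are made-up names. The only canvas call it relies on is `fillText(text, x, y)`, which the 2D context does provide.

```typescript
// Hypothetical names throughout; a sketch, not any real editor's engine.

// The only piece of the canvas API the drawing step needs. A real
// CanvasRenderingContext2D has a fillText(text, x, y) method of this shape.
interface TextSink {
  fillText(text: string, x: number, y: number): void;
}

interface PlacedLine {
  text: string;
  x: number;
  y: number; // baseline position of the line
}

// Greedy word wrap: keep appending words to the current line until the
// next word would push it past maxWidth, then start a new line.
function layoutParagraph(
  text: string,
  maxWidth: number,
  lineHeight: number,
  measure: (s: string) => number,
): PlacedLine[] {
  const lines: PlacedLine[] = [];
  let current = "";
  for (const word of text.split(/\s+/).filter((w) => w.length > 0)) {
    const candidate = current === "" ? word : current + " " + word;
    if (current !== "" && measure(candidate) > maxWidth) {
      lines.push({ text: current, x: 0, y: (lines.length + 1) * lineHeight });
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") {
    lines.push({ text: current, x: 0, y: (lines.length + 1) * lineHeight });
  }
  return lines;
}

// Once layout is done, drawing is just replaying positions.
function drawLines(sink: TextSink, lines: PlacedLine[]): void {
  for (const line of lines) {
    sink.fillText(line.text, line.x, line.y);
  }
}
```

In a browser you would pass the canvas 2D context as the sink and something like `(s) => ctx.measureText(s).width` as `measure`. With the DOM route you'd take those same computed positions and then have to coax the browser's own layout into reproducing them.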

It looks like they're just doing this for text within a page, BTW. I looked at the sample document and the page scroller is DOM, and the individual pages are canvas of text, overlaid with an SVG containing the images.

The big question I have is how they manage to deal with stuff like IME (input method editors) and how they manage to work with the keyboard on mobile (looks like they don't do mobile though).

> The big question I have is how they manage to deal with stuff like IME (input method editors) and how they manage to work with the keyboard on mobile (looks like they don't do mobile though).

A common technique used in other web-based editors for other content-types (like online video editors, online image editors, etc) is to create a hidden <textarea> or <input type="text"/>, give that element focus, and then update the manually-rendered content in response to normal DOM events like 'input', 'change', 'keydown' (if necessary - the 'input' event should be preferred, ofc). Because a "real" DOM element with native IME and soft-keyboard support is being used to process user input there's little to no degradation of the user-experience.
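
A bare-bones sketch of that hidden-input idea, with the caveat that `InputProxy`, `EditorModel` and `bindHiddenInput` are names I made up and that a real editor would also have to track the caret, selection and composition state. The interface only describes the slice of a textarea the sketch uses, so it can be exercised without a browser.

```typescript
// Hypothetical sketch; every name here is invented for illustration.

// The slice of a <textarea> this sketch relies on. In a browser you would
// pass a real, visually hidden, focused textarea element instead.
interface InputProxy {
  value: string;
  addEventListener(type: "input", listener: () => void): void;
}

class EditorModel {
  text = "";
  renderCount = 0;

  // Stand-in for "repaint the manually-rendered content" (e.g. a canvas).
  render(): void {
    this.renderCount++;
  }
}

// Mirror whatever the browser puts into the hidden element (typed keys,
// IME output, soft-keyboard input, paste) into the editor's own model,
// then repaint.
function bindHiddenInput(proxy: InputProxy, model: EditorModel): void {
  proxy.addEventListener("input", () => {
    model.text = proxy.value;
    model.render();
  });
}
```

The nice property is that the editor never has to interpret raw key codes: the browser has already resolved them into text by the time the 'input' event fires.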

...though the user does lose the ability to do things like drag text-selection handles. Alternative approaches include making the textarea very visible and positioning it directly on top of the manually-rendered content, using as much of the browser's built-in support for styling input elements and input text as possible to match the manually-rendered content closely - while also hiding the manually-rendered content to avoid confusing the user. They may have a toggle to allow the user to choose between "simple-edit with live preview" (i.e. hidden textarea) and "edit mode". This technique isn't confined to just the web: lots of desktop software (especially in the days before WPF, JavaFX, etc) that needed to allow the user to precisely edit text within a design-surface would just instantiate a native textbox widget directly on top of the text's location in the design-surface. It wasn't just 2D art software that did this; at least a few WYSIWYG-ish HTML editors (prior to contentEditable) did it too. I actually wish this technique would come back (despite its clunkiness) simply because Markdown+Preview is far, far better than a WYSIWYG contentEditable widget where an inadvertent mouse-click or drag would create a `float` disaster - or bugs where elements wouldn't be closed correctly and so end up breaking the entire website layout...

> because the web is document-first: HTML and CSS were/are all built-around describing and styling textual content

They were built to display static textual content. Moreover, they were built to display static textual content on 90s-era computers in a single rendering pass. IIRC two-pass rendering didn't appear until some improvements around tables in early 2000s.

For that, yes, they are quite fast. Anything else? Nope.

Document display and document editing are rather different tasks. The DOM was built for the display of static documents. Dynamism was slowly added over the years through JS, and eventually CSS (animations, transformations, etc). But the underlying purpose of the browser rendering engine has remained the same, which is to display static documents. It's not surprising that a client built from the ground up around the concept of displaying static documents doesn't do a good job of allowing users to edit documents in a WYSIWYG kind of way. That has never been its job!