Hacker News
by spankalee 299 days ago
Another way to look at this is:

Billions of people use the web every day. Should 99.99% of them be vulnerable to XSLT security bugs for the sake of the other 0.01%?

5 comments

That same argument applies to numerous web technologies, though.

Applied to each individually it seems to make sense. However, the aggregate effect is to kill off a substantial portion of the web.

In fact, it's an argument to never add a new web technology: Should 100% of web users be made vulnerable to bugs in a new technology that 0% of the people are currently using?

Plus it's a false dichotomy. They could instead address XSLT security... e.g., as various people have suggested, by building in the XSLT polyfill they are suggesting all the XSLT pages start using as an alternative.

Depends entirely on which technologies are actively addressing current and future vulnerabilities.
The vulnerabilities associated with native client-side XSLT are not in the language itself (XSLT 1.0) but instead are caused by bugs in the browser implementations.

P.S. The XSLT language is actively maintained and is used in many applications and contexts outside of the browser.

If this is the reason to remove and/or not add something to the web, then we should take a good hard look at things like WebSerial/WebBluetooth/WebGPU/Canvas/WebMIDI and other stuff that has been added that is used by a very small percentage of people, yet all of which could contain various security bugs...

If the goal is to reduce security bugs, then we should stop introducing niche features that only make sense when you are trying to have the browser replace the whole OS.

Whatever you do with XSLT, you can do in a saner way. But for what we need Serial/Bluetooth/WebGPU/MIDI for, there is no other way, and canvas is massively used.
I'd love to see more powerful HTML templating that'd be able to handle arbitrary XML or JSON inputs, but until we get that, we'll have to make do with XSLT.

For now, there's no alternative that allows serving an XML file with the raw data from e.g. an embedded microcontroller in a way that renders a full website in the browser if desired.

Even more so if you want to support people downloading the data and viewing it from a local file.

If you're OK with the startup cost of 2-3 more files for the viewer bootstrap, you could just fetch the XML data from the microcontroller using JS. I assume the xsl stylesheet is already a separate file.
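A rough sketch of what that JS viewer could look like, not a definitive implementation. The XML shape (`<reading name="..." value="..."/>`), the `Reading` type, and the function names are all made up for illustration; `fetch`, `DOMParser`, and `querySelectorAll` are standard browser APIs.

```typescript
// Hypothetical data shape served by the device (an assumption for this sketch):
//   <readings><reading name="temp" value="21.5"/></readings>
interface Reading {
  name: string;
  value: string;
}

// Escape text before placing it into HTML markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Pure rendering step: turns parsed readings into an HTML table string.
function renderReadings(readings: Reading[]): string {
  const rows = readings
    .map((r) => `<tr><td>${escapeHtml(r.name)}</td><td>${escapeHtml(r.value)}</td></tr>`)
    .join("");
  return `<table>${rows}</table>`;
}

// Browser-only step: fetch the XML, parse it with DOMParser, render into `target`.
async function loadAndRender(url: string, target: { innerHTML: string }): Promise<void> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Fetching ${url} failed with HTTP ${response.status}`);
  }
  const doc = new DOMParser().parseFromString(await response.text(), "application/xml");
  const readings: Reading[] = [];
  doc.querySelectorAll("reading").forEach((el) => {
    readings.push({
      name: el.getAttribute("name") ?? "",
      value: el.getAttribute("value") ?? "",
    });
  });
  target.innerHTML = renderReadings(readings);
}
```

In a page this would be called as something like `loadAndRender("/data.xml", document.body)`, where the path is again hypothetical. It only works same-origin unless the device sends CORS headers.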
I don't think anyone is attached to the technology of xslt itself, but to the UX it provides.

Your microcontroller only serves the actual xml data, the xslt is served from a different server somewhere else (e.g., the manufacturer's website). You can download the .xml, double-click it, and it'll get the xslt treatment just the same.

In your example, either the microcontroller would have to serve the entire UI to parse and present the data, or you'd have to navigate to the manufacturer's website, input the URL of your microcontroller, and it'd have to do a CORS fetch to process the data.

One option I'd suggest is instead of

    <?xml-stylesheet href="http://example.org/example2.xsl" type="text/xsl" ?>
we'd instead use a service worker script to process the data

    <?xml-stylesheet href="http://example.org/example2.js" type="application/javascript" ?>
Service workers are already predestined to do this kind of resource processing and interception, and it'd provide the same UX.

The service worker would not be associated with any specific origin, but it would still receive the regular lifecycle of events, including a fetch event for every load of an xml document pointing at this specific service worker script.

Using https://developer.mozilla.org/en-US/docs/Web/API/FetchEvent/... it could respond to the XML being loaded with a transformed response, allowing it to process the XML similar to an XSLT.

You could even have a polyfill service worker that loads an XSLT and applies it to the XML.
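To make the idea a bit more concrete, here is a minimal sketch of what such a worker script might contain. Big caveat: attaching a service worker through an `xml-stylesheet` processing instruction with no origin is purely this proposal; today a worker has to be registered by a page on its own origin via `navigator.serviceWorker.register()`. The `FetchEventLike` and `FetchScopeLike` types, the function names, and the placeholder transform are assumptions for illustration. Only the `fetch` event, `respondWith()`, `Request`, and `Response` are real APIs.

```typescript
// Minimal structural types for the parts of FetchEvent and the worker scope
// used here, so the sketch does not depend on service-worker type definitions.
interface FetchEventLike {
  request: Request;
  respondWith(response: Response | Promise<Response>): void;
}

interface FetchScopeLike {
  addEventListener(type: "fetch", listener: (event: FetchEventLike) => void): void;
}

// Escape text before placing it into HTML markup.
function escapeForHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Placeholder transform (a stand-in for real XSLT-like processing):
// it just shows the XML source inside an HTML page.
function xmlToHtmlPage(xml: string): string {
  return `<!DOCTYPE html><html><body><pre>${escapeForHtml(xml)}</pre></body></html>`;
}

// Fetch the requested resource and, if it is a successful XML response,
// answer with the transformed HTML instead.
async function handleXmlFetch(request: Request): Promise<Response> {
  const upstream = await fetch(request);
  const contentType = upstream.headers.get("Content-Type") ?? "";
  // Crude check: any content type mentioning "xml" is treated as XML.
  if (!upstream.ok || contentType.indexOf("xml") === -1) {
    return upstream; // pass everything else through untouched
  }
  return new Response(xmlToHtmlPage(await upstream.text()), {
    headers: { "Content-Type": "text/html; charset=utf-8" },
  });
}

// Hook the handler into a scope that dispatches fetch events.
function registerXmlTransform(scope: FetchScopeLike): void {
  scope.addEventListener("fetch", (event) => {
    event.respondWith(handleXmlFetch(event.request));
  });
}
```

Inside an actual service worker the last step would be roughly `registerXmlTransform(self as unknown as FetchScopeLike)`. Taking the scope as a parameter keeps the transform testable outside a worker.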

Of course there is a better way than webserial/bluetooth/webgpu/webmidi: Write actual applications instead of eroding the meaning and user expectations of a web browser. The expectation should not be that the browser can access your hardware directly. That is a much more significant risk for browsers than XSLT could ever be.
Solutions have been proposed in that thread, including adding the XSLT polyfill to the browser (which would run it in the JavaScript VM/sandbox).
If the usage/risk of XSLT is enough to remove it, you'd have to remove webusb, webbluetooth, webmidi, webxr, and countless more.
Yes, please.
Tbh, I'm still hoping we can get rid of these ridiculous webusb/bluetooth/etc specs and redirect the funding to libxslt instead.
Don't threaten me with a good time!
Isn't this something that could be implemented using javascript?

I don't think anyone is arguing that XSLT has to be fast.

You could probably compile libxslt to wasm, run it when loading xml with xslt, and be done.

Does XSLT affect the DOM after processing? Isn't it just a dumb preprocessing step, where the rendered XHTML is what becomes the DOM?

It could be. The meaningful argument is over whether the javascript polyfill should be built into the browser (in which case, browser support remains the same as it ever was, they just swap out a fast but insecure implementation for a slow but secure one), or whether site operators, principally podcast hosts, should be required to integrate it into their sites and serve it.

The first strategy is obviously correct, but Google wants strategy 2.

As discussed in the GitHub thread, strategy two is fundamentally flawed because there’s no other way to make an XML document human readable in today’s browsers. (CSS is close but lacking some capabilities)

So site operators who rely on this feature today are not merely asked to load a polyfill but to fundamentally change the structure of their website - without necessarily getting to the same result in the end.