by gjem97 3121 days ago
What parts of FF 57 are written in Rust? Just Stylo?

Edit: I don't intend for this to sound like I'm complaining, just interested.


Stylo is new in Firefox 57, but Mozilla has shipped other Rust code in earlier Firefox versions:

https://wiki.mozilla.org/Oxidation#Rust_components_in_Firefo...

Completed:

  MP4 metadata parser (Firefox 48)
  Replace uconv with encoding-rs (Firefox 56)
  U2F HID backend (Firefox 57)

In progress:

  URL parser
  WebM demuxer
  WebRender (from Servo)
  Audio remoting for Linux
  SDP parsing in WebRTC (aiming for Firefox 59)
  Linebreaking with xi-unicode
  Optimizing WebVM compiler backend: cretonne

Can anyone explain what a URL parser does and why it's so complex? I feel like there's a whole interesting story lurking there.

The URL parser work is taking long not because it's complex, but because it's stalled. URL parsing is complex, but all of that complexity was already dealt with when the Servo team wrote the rust-url crate ages ago, so it's not a factor here.

The URL parser integration was a proof of concept. It doesn't really improve stuff (aside from a slight security benefit from using Rust) so there wasn't pressure to land it; it was just a way of trying out the then-new Rust integration infra, and inspiring better Rust integration infra.

One of the folks on the network team started it, and I joined in later. But that person got busy and I started working on Stylo. So that code exists, and it works, but there's still work to be done to enable it, and not much impetus to do this work.

This work is mostly:

- Ferreting out where Gecko and Servo don't match so that we can pass all tests. We've done most of this already; whatever's left is Gecko not matching the spec, and we need to figure out how we want to fix that.

- Performance -- In the integration we currently do some stupid stuff wrt serialization and other things, because it was a proof of concept. This will need to be polished up so we don't regress.

- Telemetry -- Before shipping we need to ship it to Nightly in parallel with the existing parser and figure out how often the two disagree.
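
To make the telemetry step concrete, here is a rough sketch of running two parsers side by side and counting disagreements. It is not the Gecko code: `old_parser`, `new_parser` and `compare_parsers` are hypothetical names, both parsers are stand-ins built on Python's `urllib.parse`, and the empty-path normalization is an invented difference just to produce a mismatch.

```python
from collections import Counter
from urllib.parse import urlsplit


def old_parser(url):
    # Stand-in for the existing parser: returns (scheme, host, path).
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.path)


def new_parser(url):
    # Stand-in for the replacement parser. Hypothetical difference:
    # it normalizes an empty path to "/".
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.path or "/")


def compare_parsers(urls):
    # Feed every URL to both parsers and tally agreement.
    stats = Counter()
    for url in urls:
        if old_parser(url) == new_parser(url):
            stats["match"] += 1
        else:
            stats["mismatch"] += 1
    return stats


stats = compare_parsers([
    "http://example.com/a",
    "http://example.com",
    "https://example.org/b?x=1",
])
print(stats["match"], stats["mismatch"])
```

In a real rollout the tallies would presumably be reported through telemetry rather than printed, and only the old parser's result would be used while the comparison runs.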

It's not much work, but everyone is busy.

A URL parser takes a string with a URL in it, and returns some sort of data structure that represents the URL.
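
As a small illustration of "string in, data structure out", here is a sketch using Python's standard `urllib.parse.urlsplit`. It is only an example of the idea; it is not rust-url, and it does not claim to follow the WHATWG spec. The URL itself is made up.

```python
from urllib.parse import urlsplit

# Split one URL string into its named components.
parts = urlsplit("https://user@example.com:8080/a/b?x=1#top")

print(parts.scheme)    # https
print(parts.username)  # user
print(parts.hostname)  # example.com
print(parts.port)      # 8080
print(parts.path)      # /a/b
print(parts.query)     # x=1
print(parts.fragment)  # top
```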

It's complex because URLs are complex. I believe this is the correct RFC, and it's 60 pages long: https://tools.ietf.org/html/rfc3986

(That said, page length is only a proxy for complexity, of course.)

And, as a bonus, there's the other URL standard, which describes what browsers actually do:

https://url.spec.whatwg.org/

Speaking as someone who once tried to write the code myself to avoid pulling in a dependency: never again. It's not just that the spec is 60 pages long; the actual behaviour out in the real world is miles away from the spec. The web is a complex place where standards are... rarely standard.

When writing code it's a much better idea to write according to https://url.spec.whatwg.org/

URLs have been a security issue for browsers in the past, and they can get pretty hairy, from UTF-8 encoded domain names to whatever you want to "urlencode". For example, you can encode whole images into URLs to embed them in CSS files.
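
A short sketch of those three things using only Python's standard library: percent-encoding ("urlencode"), a non-ASCII domain name converted to its ASCII form, and a data URL carrying image bytes. The sample strings and the four-byte "image" are made up for illustration, and Python's built-in `idna` codec implements the older IDNA 2003 rules, which may differ from what browsers do today.

```python
import base64
from urllib.parse import quote, unquote

# Percent-encode a string; safe="" means "/" is encoded too.
encoded = quote("a b/ç", safe="")
decoded = unquote(encoded)
print(encoded, decoded)

# Convert a non-ASCII host name to its ASCII (punycode) form.
ascii_host = "bücher.example".encode("idna").decode("ascii")
print(ascii_host)

# Embed raw bytes in a data URL (here only the first 4 bytes of a PNG header).
image_bytes = b"\x89PNG"
data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
print(data_url)
```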

Old IE versions had a hard URL length limit and were very picky about the characters in domain names; both limitations were introduced as "security fixes" (which broke the standards).

Stylo is the biggest and most significant thing; there are some smaller bits (a media parser, and something else?) included before 57.

I'd say the change of the encoding stack to encoding-rs is pretty significant; while it's not that much code, it's stuff that gets used throughout the codebase.

That's fair, and you know the impact better than I!