| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by sphink 4297 days ago

[SpiderMonkey dev here. I remember the talk you gave at Mozilla, quite a good talk. I work on the GC much more than the compiler, which I don't know that much about, but I'm only going to talk about the compiler. Ironic, given that you were talking about the GC.]

What's your actual goal here? You gave an example of a pathological function that takes a long time to compile. Does that matter?

We care a lot about compilation time. It matters a lot for Web workloads, and distinguishes JS compilers from AOT compilers. But why do you care, especially in this particular case? Do you have "real" code that is hitting the same bottlenecks? Because if not, you have two easy solutions to this case: first, leave everything as is. In a production system, you can be doing this sort of compile in the background, and only switch over to it from the fast-start interpreter when it's done. The second straightforward option is to just give up. Fail to compile if things get too big or slow, and let it run in the interpreter (or some other lower tier.)

If this does resemble a realistic scenario, then we can get a little more nuanced. We face the same problem in SpiderMonkey -- big functions can take too long to compile and/or will blow out memory (which, honestly, is the bigger concern. More on that later.) In JaegerMonkey days, we had "chunked" compilation, where it artificially broke big functions down into pieces and compiled them separately, essentially spilling all state between them. I know we've talked about doing the same for IonMonkey; I'm not sure if it has been implemented. In particular, asm.js hits this a lot because autogenerated functions can easily get enormous. Luke has definitely talked a lot about chunking asm.js compilation; I don't know if he ever got around to implementing it yet. [You remember Luke; you sat behind him for a while in the Mountain View office.]

Or how about doing it at a finer granularity? We have a number of places where we start out trying to do the smart and clever thing, but when things gett out of hand we abort and fall back on something stupid but fast and small. Here, that might mean being very pessimistic about your live ranges and pessimistically chopping them up into small pieces. Could you keep your 2d matrix, but cap its dimensions? You only track up to K variables at a time, and if you exceed K you spill one and reuse its slot for the new one. Obviously, your phi nodes (join points) would need to agree on a set of variables, so maybe this gets too complicated or just doesn't work.

But anyway, I'm still wondering what you're after. You have a research compiler. There is no need for it to be better than a production compiler in all ways. Pick something it is better at, such as producing really fast code. The learnings from that will be more likely to be useful in a production system anyway, because all current leading JS compilers are multi-tier -- you have a fast-start interpreter, usually a still pretty quick-start baseline compiler, and then an optimizing compiler. The distribution of static code across those is something like 90%/9%/1% anyway -- the majority of downloaded code is executed at most once anyway, so there's no point in spending time optimizing it. But that means the optimizing compiler is free to spend some time thinking about things. If your compiler is really slow, just claim it's meant for a 4th tier that only kicks in for really hot code. :) And it can compile in the background.

Unless it burns too much memory. That won't stay isolated from the execution thread. It'll pollute caches, hog the memory bus, and make fun of your mother. So I'd worry more about capping memory usage to a reasonable level than worrying about compile time, even if it means giving up or dropping down to a dumber algorithm when things start to get big.

One other effect of being a later tier is that you can stick to compiling simpler programs. By the time it makes it to your tier, you'll have observed the actual types and behavior of the function for a while, and so you can assume that it'll continue to behave the same way. This allows you to ignore many of the numerous quirks of the JS language. You don't need to handle getters if your hot code never sees them (or rather, you only have to handle them in the parts that do see them.) The global object probably won't get mutated, and almost certainly won't have properties deleted. You will need mechanisms to discard your compiled program if any of these things change, and you'll have to be clever about noticing these sorts of mutations, but it's great to be able to assume that nothing too crazy will happen. Even if you have to guard all those assumptions.

The GC does matter for both time and space, though. An advantage of a generational collector isn't just shorter pauses, it's also that you can use simple bump allocation in the nursery since you never need to do partial deletes. Also, if you're creating a lot of short-lived junk then it can reduce your peak memory usage by quite a bit, which is important for a 3rd or 4th tier background compiler.