

by kannanvijayan 2850 days ago

> Even without any analysis code, just adding these calls can have a runtime overhead >30x. I would like to optimize this, but before that I need to find out where the overhead is coming from.

That thought entered my head as soon as I noticed you were instrumenting every instruction.

> - that many calls are just inherently expensive, be it cross-language or not (possible solution: be more selective about when to insert calls to analysis hooks)

This is definitely true, and hard to get around. The wasm instructions will be compiled to machine instructions, and the calls will still be calls, and calls are expensive.

One possible approach to mitigate this cost might be to collect and batch calls into the hook functions. Basically your instrumentation would be a trace-dump of execution and data to some in-wasm memory, and periodically you call out to JS for analysis once the buffer fills up.

This should remove most of the call overhead: each per-instruction call becomes a plain write to a well-known location, plus one call out to JS per full buffer.
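To make the batching idea concrete, here is a minimal sketch in TypeScript. It only simulates the scheme: a `Uint32Array` stands in for the in-wasm trace memory, and `TraceBuffer`, `record`, and `flush` are hypothetical names I made up for illustration, not part of any existing tool.

```typescript
// The analysis side receives a whole batch of events in one call.
type FlushHook = (events: Uint32Array) => void;

class TraceBuffer {
  private readonly buf: Uint32Array; // stand-in for a region of wasm linear memory
  private readonly onFlush: FlushHook;
  private used = 0; // number of u32 slots currently filled

  // capacityEvents: how many (opcode, value) pairs fit before a flush.
  constructor(capacityEvents: number, onFlush: FlushHook) {
    this.buf = new Uint32Array(capacityEvents * 2);
    this.onFlush = onFlush;
  }

  // Cheap per-instruction path: two stores and a fullness check.
  record(opcode: number, value: number): void {
    this.buf[this.used++] = opcode;
    this.buf[this.used++] = value;
    if (this.used === this.buf.length) this.flush();
  }

  // Expensive path: one call out per batch instead of one per instruction.
  // The hook gets a view into the buffer, so it must consume the events
  // before returning; the slots are reused afterwards.
  flush(): void {
    if (this.used === 0) return;
    this.onFlush(this.buf.subarray(0, this.used));
    this.used = 0;
  }
}

// Usage: 10 recorded events with a capacity of 4 lead to 3 hook calls (4 + 4 + 2).
let hookCalls = 0;
let eventsSeen = 0;
const trace = new TraceBuffer(4, (events) => {
  hookCalls++;
  eventsSeen += events.length / 2;
});
for (let i = 0; i < 10; i++) trace.record(0x6a /* i32.add opcode */, i);
trace.flush(); // hand over the final, partially filled batch
```

In a real setup the `record` path would be a couple of wasm store instructions emitted by the instrumenter, and only `flush` would cross into JS. Whether that actually wins depends on the buffer size and on how much the analysis needs per event.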

Now, if your analysis functions expect to be able to peek at memory and get a consistent view of it as of the instruction being analyzed, you'll need to do some special magic to re-compute the memory state at that time from the recorded trace. That can be done on demand, only when an analysis requires it, so it costs nothing if the hooks are not present.
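One way the "special magic" could look, as a rough sketch: keep a snapshot of memory from the start of the batch, log every store in the trace, and replay the stores up to the event of interest. `StoreEvent` and `memoryAt` are hypothetical names, and the sketch assumes byte-sized stores that are logged in execution order.

```typescript
// One recorded store: at trace position `index`, `value` was written to `addr`.
interface StoreEvent {
  index: number;
  addr: number;
  value: number;
}

// Rebuild memory as it looked just before trace position `at`, starting from
// a snapshot taken when the batch began and replaying all earlier stores.
function memoryAt(snapshot: Uint8Array, stores: StoreEvent[], at: number): Uint8Array {
  const mem = new Uint8Array(snapshot); // copy, so the snapshot stays reusable
  for (const s of stores) {
    if (s.index >= at) break; // assumes stores are sorted by index
    mem[s.addr] = s.value;
  }
  return mem;
}

// Usage: three stores in the trace; ask for the state just before position 4.
const snapshot = new Uint8Array(4); // all zeros at batch start
const stores: StoreEvent[] = [
  { index: 0, addr: 1, value: 7 },
  { index: 3, addr: 1, value: 9 },
  { index: 5, addr: 2, value: 4 },
];
const before4 = memoryAt(snapshot, stores, 4); // stores at 0 and 3 applied, 5 not yet
```

Nothing here runs unless an analysis actually asks for a memory view, which is the point. The price is the snapshot copy and the extra store logging, and I haven't measured either.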

Please note that I'm not sure how well this would work exactly, but it seems promising.

> - Wasm <-> JavaScript calls are more expensive than Wasm <-> Wasm ones (possible solution: compile analyses to Wasm, or: hope that this gets optimized better by engines in the future)

It's getting optimized now. My impression is that the big cost here is marshalling wasm numbers into JS values. I don't know of a good way to avoid this aside from not calling into JS when you can avoid it (i.e. you know there are no analysis hooks attached to something).

I wonder if a simple runtime flag check within wasm, guarding the call-out, would significantly reduce the overhead.
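The shape of that guard, sketched in TypeScript as a simulation rather than real instrumented wasm. `hooksEnabled`, `analysisHook`, and `instrumentedAdd` are made-up names; in an actual module the flag would presumably be a wasm global or a fixed memory location that the instrumented code tests before each call.

```typescript
type Hook = (opcode: number, value: number) => void;

let hooksEnabled = false; // hypothetical flag, flipped when an analysis attaches
let guardedCalls = 0;     // counts how often the (expensive) call-out happened
const analysisHook: Hook = () => {
  guardedCalls++;
};

// Stand-in for an instrumented i32.add: the call-out only happens when the
// flag is set, so the common "no analysis attached" path is a load and a branch.
function instrumentedAdd(a: number, b: number): number {
  const result = (a + b) | 0; // wrap to 32 bits, like i32.add
  if (hooksEnabled) analysisHook(0x6a /* i32.add opcode */, result);
  return result;
}

// Usage: 1000 adds with hooks off make no call-outs; one add with hooks on makes one.
let sum = 0;
for (let i = 0; i < 1000; i++) sum = instrumentedAdd(sum, 1);
const callsWhileDisabled = guardedCalls;
hooksEnabled = true;
sum = instrumentedAdd(sum, 1);
```

This only helps when no hook is attached for a given instruction kind. With hooks attached you pay for the check on top of the call, so it is worth measuring both cases.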

> - the added instructions inhibit some wasm compiler optimization(s) (e.g., inlining is no longer performed because the function bodies are larger than some threshold)

This shouldn't matter much. Most of the heavyweight compiler optimizations happen before emission to wasm, including a good chunk of inlining. I'm not even sure if Odinmonkey (our Wasm impl) does any extra inlining on top of that - it might just expect the compiler to take care of that.

I'll get in touch. I think you'd get more confident answers on these from the direct WASM crowd. My answers are a bit speculative, and lack concrete details about the latest implementation status.