

by josephg 1151 days ago

The Rust compiler doesn't have to output more code than a C compiler. But it tends to, because most Rust programs are stuffed full of bounds checks. And bounds checks aren't small. As well as the conditional itself, every bounds check also includes:

- A custom panic message (so you know which line of code crashed)

- Some monomorphized formatting code to output that message

- The infrastructure to generate a stack trace after a panic

- Logic to free all the allocated objects all the way up the stack

If you compile this 1 line function:

    pub fn read_arr(arr: &[usize], i: usize) -> usize {
        arr[i] // (equivalent to 'return arr[i];')
    }

... You produce 20 hairy lines of assembler: https://rust.godbolt.org/z/dhz34KEvj
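To make the cost concrete, here is a rough sketch of what the indexing expression stands for. It is a hand-written illustration, not the compiler's actual lowering, and the name `read_arr_explicit` is made up for this example:

```rust
/// Hand-expanded version of `arr[i]`: an explicit length check, plus a
/// panic path that has to format and report a message on failure.
pub fn read_arr_explicit(arr: &[usize], i: usize) -> usize {
    if i < arr.len() {
        // SAFETY: `i` was checked against `arr.len()` on the line above.
        unsafe { *arr.get_unchecked(i) }
    } else {
        // This branch is what pulls in the message formatting and
        // unwinding machinery listed above.
        panic!(
            "index out of bounds: the len is {} but the index is {}",
            arr.len(),
            i
        )
    }
}
```

The happy path is still one load. The extra assembler comes from the comparison and everything hanging off the `else` branch.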

In contrast, the Rust equivalent of the C function is this:

    pub fn read_arr_unchecked(arr: &[usize], i: usize) -> usize {
        unsafe { *arr.get_unchecked(i) }
    }

And predictably, the result is this gem - identical to what the C compiler outputs:

    example::read_arr_unchecked:
        mov     rax, qword ptr [rdi + 8*rdx]
        ret

But nobody writes Rust code like that (for good reason). You can get a lot of the way there by leaning heavily on Rust's iterator types and such. But it's really difficult to learn which patterns will make the Rust compiler lose its mind. There's no feedback on this at compile time at all.
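As a small sketch of the iterator approach (the function names `dot_indexed` and `dot_zip` are invented for illustration), compare an indexed loop over two slices with a `zip` over their iterators. As written, the indexed version keeps a panic path for `b[i]`, because nothing tells the compiler that `b` is at least as long as `a`. The optimizer may or may not remove the checks in any given case, so it is worth confirming in godbolt:

```rust
/// Indexed loop: `b[i]` needs a bounds check, since `b` may be shorter
/// than `a`, and that check carries a panic path with it.
pub fn dot_indexed(a: &[usize], b: &[usize]) -> usize {
    let mut total = 0usize;
    for i in 0..a.len() {
        total = total.wrapping_add(a[i].wrapping_mul(b[i]));
    }
    total
}

/// Iterator version: `zip` stops at the shorter slice, so there is no
/// indexing and no out-of-bounds panic path in this function.
pub fn dot_zip(a: &[usize], b: &[usize]) -> usize {
    a.iter()
        .zip(b)
        .fold(0usize, |acc, (&x, &y)| acc.wrapping_add(x.wrapping_mul(y)))
}
```

Note the behaviour is not identical: when `b` is shorter than `a`, `dot_indexed` panics while `dot_zip` silently stops early. The wrapping arithmetic is only there so that overflow checks don't muddy the comparison.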


mwcampbell 1151 days ago

I'd appreciate any more tips or resources you might have about reducing Rust code bloat. I want my library [1] to be acceptable to the most strident anti-bloat curmudgeons, so they'll make their UIs accessible.

[1]: https://github.com/AccessKit/accesskit

josephg 1151 days ago

I don’t know many good resources to learn this stuff unfortunately.

The things I reach for in practice are godbolt and cargo asm[1] - which can show me the actual generated assembler for functions in my codebase. And twiggy[2], which can tell you which functions are the biggest in your compiled binary and point out where monomorphization is expensive.

When I’m developing, I regularly run a script which compiles my code to wasm and tells me how the wasm file size has changed since the last time I compiled it.
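This isn't that script, just a minimal sketch of the same idea in Rust using only the standard library. The function names and the idea of keeping the previous size in a small text file are assumptions made for this example:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Formats the change between the previously recorded size and the
/// current one. `previous` is `None` on the first run.
pub fn format_delta(previous: Option<u64>, current: u64) -> String {
    match previous {
        None => format!("{} bytes (no previous size recorded)", current),
        Some(prev) => {
            let delta = current as i64 - prev as i64;
            format!("{} bytes ({:+} bytes since last build)", current, delta)
        }
    }
}

/// Reads the size of the file at `wasm_path`, compares it with the number
/// stored in `record_path` (hypothetical location), then overwrites the
/// record with the new size.
pub fn report_size_change(wasm_path: &Path, record_path: &Path) -> io::Result<String> {
    let current = fs::metadata(wasm_path)?.len();
    let previous = fs::read_to_string(record_path)
        .ok()
        .and_then(|s| s.trim().parse::<u64>().ok());
    fs::write(record_path, current.to_string())?;
    Ok(format_delta(previous, current))
}
```

You'd call `report_size_change` after each wasm build and print the returned string.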

Some tips:

Try to avoid array lookups with an index when you can. When looping, use slice iterators, and when making custom iterators, wrap the slice iterator rather than storing a usize index yourself.
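A minimal sketch of the "wrap the slice iterator" tip. The `Evens` type is hypothetical, invented for this example. It holds a `std::slice::Iter` instead of a `(slice, usize)` pair, so `next` never indexes into the slice:

```rust
use std::slice;

/// Hypothetical custom iterator that yields only the even values of a
/// slice. It wraps the slice's own iterator rather than storing an index.
pub struct Evens<'a> {
    inner: slice::Iter<'a, u32>,
}

impl<'a> Evens<'a> {
    pub fn new(data: &'a [u32]) -> Self {
        Evens { inner: data.iter() }
    }
}

impl<'a> Iterator for Evens<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Advance the inner iterator until an even value turns up.
        // No `data[pos]` lookup, so no bounds check is written here.
        self.inner.find(|v| **v % 2 == 0).copied()
    }
}
```

For example, `Evens::new(&[1, 2, 3, 4, 6]).collect::<Vec<_>>()` collects the even values in order.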

Be careful of monomorphization. If you’re optimising for size, it can be better to take a dyn Trait rather than making a function generic.
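A small sketch of the trade-off, with made-up function names. The generic version is compiled once per concrete type it's called with. The `dyn` version is compiled once and dispatches through a vtable. If you want to keep a generic signature, a common pattern is a thin generic shim that forwards to the `dyn` body:

```rust
use std::fmt::Display;

/// Generic: the compiler emits a separate copy of this body for every
/// concrete `T` that callers use.
pub fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

/// Trait object: one copy of the body, called through a vtable.
pub fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {}", value)
}

/// Thin generic shim: only this one-line wrapper is duplicated per type,
/// while the real work lives in `describe_dyn`.
pub fn describe<T: Display>(value: T) -> String {
    describe_dyn(&value)
}
```

Whether this actually shrinks the binary depends on how many types you instantiate and how big the body is, so check with twiggy before and after.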

And play around with your wasm API surface area. It takes a lot more code to pass complex objects & strings back and forth to JavaScript than it does for simpler types.

But otherwise, good luck! Love the project.

[1] https://github.com/gnzlbg/cargo-asm

[2] https://github.com/rustwasm/twiggy

SnowProblem 1151 days ago

Great project - this is important! Love your clear motivation statement at the top. I do wish there were some code/data examples within the README, but clearly that's not holding people back from using it.