by ajtjp 2177 days ago
As a relative neophyte in Rust (having gone through about half the chapters in the Rust Book), I recently deployed a small Rust server on DigitalOcean, and was surprised by the compilation time. The server's code was about 27 KB in size, producing a binary of about 160 KB. But it had 92 dependencies, including transitive dependencies, and took 5 minutes and 38 seconds to build. That seemed like a long time relative to the size of the code, even allowing for the not-super-fast CPU.

While I was watching its output, I realized that in Rust, all the dependencies are compiled; I'm used to Java+Maven or JavaScript+NPM where compiled dependencies are used instead, and that tends to be pretty quick (provided your network pipe is wide enough). I'd be curious to learn why Cargo re-compiles from scratch instead of offering pre-compiled binaries as well. I guess part of it is related to different target platforms, but it seems like if the top 5 platforms were targeted and had compiled resources available, you could reduce compile times for those platforms by a significant amount.

On the other hand, the error messages I ran into along the way were quite good at pointing me in the right direction about how to fix them, which saved more time than the extra compile time cost, relative to the Python-based alternative I had been trying to set up before that.

7 comments

It's not just target platforms, it's also build flags; you could do the default for debug and release, but set any sort of custom option and you're back to square 1.

We would also need to pay for the cost and such of building, hosting, and distributing all of that...

There are other middle grounds too. There's certainly interest, it's just not trivial. If it was, we'd do it!

Just out of curiosity, how does Rust prevent the Gentoo issue: you can compile with custom flags, but most projects have only been tested with specific flags, and if you change them they will break the build, so really, no custom flags. It always starts off with custom flags and ends up ossifying to the default-only flags.
Most of the flags tell the Rust compiler what strategies to use when producing machine code. Actual Rust code rarely looks at these flags, so correct interpretation of the program generally isn't affected by them.

Contrast this with C and C++, where it's common to #ifdef in an entirely different program depending on the flags.

It's certainly possible to do something like this in Rust, but in practice it's rare.
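
For illustration, here is a minimal sketch of what "something like this" could look like in Rust. The function name `build_kind` is made up for this example; `debug_assertions` is a standard cfg that follows the compiler's debug-assertions setting. Only one of the two definitions is compiled in, much like a C `#ifdef`.

```rust
// Hypothetical example: pick one of two implementations at compile time,
// depending on whether debug assertions are enabled for this build.
#[cfg(debug_assertions)]
fn build_kind() -> &'static str {
    "debug assertions on"
}

#[cfg(not(debug_assertions))]
fn build_kind() -> &'static str {
    "debug assertions off"
}

fn main() {
    // Which string is printed depends on the build profile and flags.
    println!("{}", build_kind());
}
```

Most Rust code never needs to branch like this, which is the point being made above.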

Based on my experience with feature flags on some projects that made extensive use of them, Rust is definitely also vulnerable to this issue. In those projects, not all configurations and feature combinations had been tested, and some specific configurations caused compilation to break.

Good CI configuration can definitely help to catch these issues, but it's obviously an extra effort for projects to set up. The alternative could be to simply avoid feature flags and always compile everything, which depending on the project might or might not be the better option.

I'm guessing that when the parent says "custom options" they're referring to libraries providing their own customization options via "crate features" (https://doc.rust-lang.org/cargo/reference/features.html ). Not only do libraries have an incentive to test that their own feature flags work, but also these feature flags avoid combinatorial explosion (and are easier to test) because feature flags must be additive due to how Cargo is designed: every feature flag must function the same even in the presence of any combination of the rest (flags must not be exclusive with each other).
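
To illustrate what "additive" means here, a minimal sketch assuming a hypothetical crate with two optional features named `json` and `yaml` (both names, and the function `supported_formats`, are invented for this example). Each feature only adds something; neither removes or changes what the other provides, so any combination still compiles and behaves consistently.

```rust
// Hypothetical crate code. The features "json" and "yaml" are assumptions
// for illustration and would be declared in the crate's Cargo.toml.
pub fn supported_formats() -> Vec<&'static str> {
    // Always available, regardless of which features are enabled.
    let mut formats = vec!["text"];

    // Each feature only appends; it never disables another feature's entry.
    if cfg!(feature = "json") {
        formats.push("json");
    }
    if cfg!(feature = "yaml") {
        formats.push("yaml");
    }
    formats
}

fn main() {
    println!("{:?}", supported_formats());
}
```

A non-additive design would be one where enabling `yaml` removed `json` support; that is the kind of mutually exclusive flag Cargo's model discourages.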

On the other hand, I'm guessing that by "custom flags" you're referring to compiler-level flags that influence codegen, which rustc doesn't have that many of, and most of the ones that rustc does have are for controlling things that people might reasonably expect to be nontrivial work to change in the first place (e.g. linker/symbol options, cross-compilation/platform options, LLVM/gritty optimization/instrumentation options). Of the rustc codegen options that ordinary users might want to play around with, I see only two: one for turning off arithmetic overflow checks (which makes all integers act like their wrapping equivalents from the stdlib), and one for determining the general strategy of what happens when a panic occurs. The latter is unlikely to cause problems because the default behavior is a superset of the configurable behavior, and for the former any silent problems for misconfigured users would manifest as panics for everyone else, so there's a good chance the problem would be fixed upstream regardless.
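
As a rough sketch of the overflow point: with overflow checks on, a plain `a + b` that overflows panics; with them off (the `-C overflow-checks` codegen option), it wraps. The stdlib methods below make each behavior explicit regardless of that setting. The helper names `add_wrapping` and `add_checked` are made up for this example.

```rust
// Wraps around on overflow, which is how plain `+` behaves when
// overflow checks are turned off.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

// Reports overflow as None instead of panicking or wrapping.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    // 250 + 10 = 260, which does not fit in a u8 (max 255).
    println!("{}", add_wrapping(250, 10)); // 260 mod 256 = 4
    println!("{:?}", add_checked(250, 10)); // None
    println!("{:?}", add_checked(1, 2)); // Some(3)
}
```

Code that relies on wrapping without saying so is what would panic for everyone building with the checks enabled.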

I meant basically all the stuff the other three comments said :)
Have you watched the Swift-related announcements at WWDC? Package manager support for binary dependencies is coming with Big Sur.
Not yet.

The most valuable company in the world has significantly more resources than we do.

With hope for the presence of positive-minded folks with contribution time to spare: any suggestions of areas that could use help towards binary builds?
I would reach out to the Cargo team: https://www.rust-lang.org/governance/teams/dev-tools
Apple controlling the hardware makes it way easier though. At home, I compile Rust with CPU-specific features on about as many different kinds of hardware as Apple has in their whole ecosystem!
> Apple controlling the hardware makes it way easier though.

I'm not sure that's really relevant.

Rather:

* Apple has resources, lots

* Apple wants to promote the use of Swift; making it easier and more convenient is a good way to do that, and binary dependencies reduce the complexity of the build process because you don't need to build dependencies

* Apple has a vibrant ecosystem of small closed-source shops; binary-only distribution is useful for those, as well as for themselves

* promoting binary and eventually dynamically linked dependencies might mean the ability to dedup' on-system dependencies

I believe what they meant was that Apple needs to support less hardware variations. Compilers frequently have a host of flags that depend on CPU features, which can potentially make the code behave slightly differently. If Cargo were to implement this, it would need to compile each package many more times (with different configs) than Apple needs to for Swift.

And yes, of course it is a given that Apple's resources are vast, to say the least.

> I believe what they meant was that Apple needs to support less hardware variations.

I understand that very well. My point is that they wouldn’t be doing it regardless if they didn’t want to, and if they really want to (for reasons I outlined) they have the resources to make it happen essentially regardless of the constraints or hardware breadth.

Apple also undertook a multi-year effort to design a stable ABI for Swift as a means to enable exactly this scenario.
The stable ABI was so Apple could ship Swift in the system. The fact that it enabled binary dependencies was more of a freebie.
Binary dependencies are really about supporting closed-source libraries. Because they’re in binary form it means there’s no ability to customize the build process, no build flags or optimization levels or anything else.

Apple also relies heavily on dynamic linking in general, so binary dependencies are likely going to dynamically link their own dependencies, thus removing a lot of the variability that would otherwise require recompilation.

If Rust is to be taken seriously as a systems programming language, it needs to cater to a use case that is very important for a large part of the C, C++ and Ada population.

Apple is doing this, because Swift is their next-gen systems programming language.

Swift supports static linking just fine as well.

Really, it is all a matter of which demographics Rust wants to be present in.

And with Rust now being adopted by Microsoft and Google, I just see this need only increasing.

While I agree with your statement, the lack of binary linking has been a blessing in Go and Rust. The inability to give a binary "SDK" forces many companies to provide source (and in many cases open-source their library). I would find it very irritating if I can't navigate into the library source at least during debugging.
Go tooling supports binary dependencies, where the only source provided by the packages is the documentation for go doc.

It just looks like everything is source code when not taking the effort to read through all dependencies.

It doesn't force companies at all, only those that are comfortable shipping source libraries end up adopting such languages.

I used to work for a company that shipped encrypted Tcl source code and provided the necessary interpreter hooks to access the code in its encrypted form.

> I'd be curious to learn why Cargo re-compiles from scratch instead of offering pre-compiled binaries as well. I guess part of it is related to different target platforms

There’s that but there’s also issues like ABI stability (and the lack thereof), compilation flags, hosting (infrastructure and its cost), distribution mechanisms, …

There’s an issue on Cargo dating back to 2015 (#1139), but a lot of effort is needed to think about the problem, then actually solve it.

That is the approach taken by most compiled languages, especially in commercial environments.

I think it is only a matter of time until Cargo gets its "Maven".

Incidentally, one of the Swift announcements at WWDC was the support of binary packages in Swift Package Manager.

With Rust you get safety at compile time while avoiding memory safety issues at runtime. The compile times are objectively longer compared to most other languages. But whether this trade-off is worth it for you depends a lot on your company's CI pipeline and the skill of the developers.
I'm not sure safety has anything to do with long compile times, can you enlighten me?
There’s a lot happening at compile time that less safe languages don’t do. Type checking and type erasure, etc
As the article explains, this all is a relatively minor contribution to compile times.
Ada/SPARK does it.
IIRC, languages like Java and Go compile quite fast, and they are even safer than Rust...
I feel like the long compilation times are going to be a show stopper for smaller teams/companies where you can't afford the loss in productivity.
Anecdotally, I can tell you it does not amount to a loss in productivity. Cargo caches built deps, so the cost is really only paid once in a while, and the advantages of Rust's type system are a boon to productivity.
If productivity is important, you really can't afford memory corruption bugs. Those can easily take weeks to hunt down.
It can be slightly annoying -- compared to f.ex. OCaml with its lightning-fast compiler -- but in practice incremental recompilation is much faster than a compilation from scratch so it rarely irks me that badly.
Just to add to other comments: Cargo feature flags are another reason why you need to build your dependencies.

Of course, if Cargo supported pre-compiled dependencies, I am guessing that it would be smart enough to only recompile the dependencies that are using non-default features.

In addition to feature flags, the fact that a lot of Rust code is generic, and would only be compiled into binary form in the final application, would likely limit how much could be shipped as precompiled artifacts.
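
A small sketch of why generic code resists precompilation, assuming a hypothetical library function `describe` (the name is invented here). The library author cannot know which types callers will use, so machine code for each concrete `T` is only generated when the calling crate is compiled.

```rust
use std::fmt::Display;

// Hypothetical library function. No machine code exists for it until a
// caller picks a concrete type for T.
pub fn describe<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

fn main() {
    // Each call below causes a separate copy of `describe` to be generated
    // in this (the final) crate: one for u32 and one for &str.
    println!("{}", describe(42u32));
    println!("{}", describe("hello"));
}
```

A precompiled artifact for such a library would still need to ship something the compiler can instantiate later, rather than finished machine code.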
> it had 92 dependencies, including transitive dependencies, and took 5 minutes and 38 seconds to build.

What is it that triggers such a "full" build though? Obviously in some CI scenarios you might start from scratch, but in an edit/compile cycle, you'd never be hitting those 5 minutes, correct?

Typically in an edit/compile cycle you'd only compile your dependencies once for each build type (check, debug, release). Unless you change a dependency's feature flag, the compiler will just re-use what's already been compiled.

If you clean your build folder, things will need to be rebuilt from scratch. Likewise if you change your compiler version.

It seems a CI system should be able to work the same way and cache the dependencies just like a local build.

If it does, then the long compile times are almost never encountered by either developers or CI. So are they really problematic?

Go would have compiled an order of magnitude faster still. The compiler is less safe and less thorough than Rust's, but still.
You’re right, though after an initial compilation cargo does a good job caching and subsequent builds will often beat Go’s compile times.
The backend service I'm using doesn't have cached CI yet, so the entire build takes ~20 seconds. The CircleCI free tier is good enough for now, and more time is spent pulling dependencies than building, anyway. The build pulls in heaps of dependencies, and the main codebase itself is around 20 kloc. Subsequent local builds take around 0.2 seconds, so I'm very happy with it. Even working on a tiny Java/Kotlin codebase recently made me miss the good compile times!

How hard is the caching to set up, especially in a CI setting?

It depends - at my work we use Buildkite, which is a “bring your own runners” build service, so caching is available by default, so long as the cache is outside of the project directory. For personal stuff I use GitHub and GitLab - they both offer caching. GitHub’s offering is easier, but GitLab's is just as effective.

Cargo caches by default, and you can specify the path where it stores artifacts. Most of the cache story is about what your chosen provider uses to specify build steps, etc.

Use sccache and an S3 bucket