Hacker News
by electroly 117 days ago
I don't know about everyone else, but slow Julia compilation continues to cause me suffering to this day. I don't think they're ever going to "fix" this. On a standard GitHub Actions Windows worker, installing the public Julia packages I use, precompiling, and compiling the sysimage takes over an hour. That's not an exaggeration. I had to juice the worker up to a custom 4x-sized worker to get the wall clock time to something reasonable.

It took me days to get that build to work; doing this compilation once in CI so you don't have to do it on every machine is trickier than it sounds in Julia. The "obvious" way (install packages in Docker, run container on target machine) does not work because Julia wants to see exactly the same machine that it was precompiled on. It ends up precompiling again every time you run the container on other machines. I nearly shed a tear the first time I got Julia not to precompile everything again on a new machine.

R and Python are done in five minutes on the standard worker and it was easy; it's just the amount of time it takes to download and extract the prebuilt binaries. Do that inside a Docker container and it's portable as expected. I maintain Linux and Windows environments for the three languages and Julia causes me the most headaches, by far. I absolutely do not care about the tiny improvement in performance from compiling for my particular microarch; I would opt into prebuilt x86_64 generic binaries if Julia had them. I'm very happy to take R's and Python's prebuilt binaries.

2 comments

I am very interested in improving the user experience around precompilation and performance. May I ask why you are creating a sysimage from scratch?

> I would opt into prebuilt x86_64 generic binaries if Julia had them

The environment variable JULIA_CPU_TARGET [1] is what you are looking for. It controls which micro-architectures Julia emits code for and supports multi-versioning.

As an example, Julia is built with [2]: generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)
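For readers unfamiliar with the format: the target string is a semicolon-separated list of targets, each a CPU name followed by comma-separated feature flags and options. A minimal shell sketch that only splits the string quoted here for readability (the `list_targets` helper is hypothetical and not part of Julia):

```shell
#!/bin/sh
# Hypothetical helper: print one "name: options" line per target in a
# JULIA_CPU_TARGET-style string. Targets are separated by ';' and the
# options within a target by ','.
list_targets() {
    printf '%s\n' "$1" | tr ';' '\n' | while IFS= read -r entry; do
        name=${entry%%,*}            # text before the first comma
        if [ "$name" = "$entry" ]; then
            opts='(no options)'      # bare target such as "generic"
        else
            opts=${entry#*,}         # everything after the first comma
        fi
        printf '%s: %s\n' "$name" "$opts"
    done
}

list_targets 'generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)'
```

As I read the documentation, a leading `-` (as in `-xsaveopt`) disables that CPU feature for the target, `clone_all` clones all functions for it, and `base(1)` selects the target at zero-based index 1 (sandybridge here) as the base.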

[1] https://docs.julialang.org/en/v1/manual/environment-variable...

[2] https://github.com/JuliaCI/julia-buildkite/blob/9c9f7d324c94...

I have a monorepo full of Julia analysis scripts written by different people. I want to run them in a Docker container on ephemeral Linux EC2 instances and on user Windows workstations. I don't want to sit through precompilation of all dependencies whenever a new machine runs a particular version of the Julia project for the first time because it takes a truly remarkable amount of time. For the ephemeral Linux instances running Julia in Docker, that happens on every run. Precompiling at Docker build time doesn't help you; it precompiles everything again when you run that container on a different host computer. R and Python don't work like this; if you install everything during the Docker image build, they will not suddenly trigger a lengthy recompilation when run on a different host machine.

I am intimately familiar with JULIA_CPU_TARGET; it's part of configuring PackageCompiler and I had to spend a fair amount of time figuring it out. Mine is [0]. It's not related to what I was discussing there. I am looking for Julia to operate a package manager service like R's CRAN/Posit PPM or Python's PyPI/Conda that distributes compiled binaries for supported platforms. JuliaHub only distributes source code.

[0] generic;skylake-avx512,clone_all;cascadelake,clone_all;icelake-server,clone_all;sapphirerapids,clone_all;znver4,clone_all;znver2,clone_all

My point is that if you set JULIA_CPU_TARGET during the Docker build process, you will get relocatable binaries that are multi-versioned and will work on other micro-architectures. It's not just for PackageCompiler; it also applies to Julia's native code cache.

It worked! I was able to drop the Windows install on a standard GitHub Actions worker from 1 hour to 27 minutes. Here's what worked:

    ARG JULIA_CPU_TARGET="generic;skylake-avx512,clone_all;cascadelake,clone_all;icelake-server,clone_all;sapphirerapids,clone_all;znver4,clone_all;znver2,clone_all"
    ARG JULIA_PROJECT=[...]
    ENV JULIA_PROJECT=[...]
    RUN julia -e "using Pkg; Pkg.activate(\"[...]\"); Pkg.instantiate(); Pkg.precompile();"

What I got wrong the first time: I failed to actually export JULIA_CPU_TARGET so it would take effect in the "Pkg.precompile()" command. In reality, I hadn't correctly tested with that environment variable set at all. I was only correctly setting it when running PackageCompiler.
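A minimal shell sketch of this kind of mistake, assuming the variable was assigned in a shell script without `export` (the thread does not say exactly how it was set). A plain assignment is not visible to child processes, and a `julia` process started from the script would be one. `sh -c` stands in for `julia` here so the snippet runs anywhere:

```shell
#!/bin/sh
# Report what a child process sees for JULIA_CPU_TARGET.
# `sh -c` is a stand-in for the `julia` process (assumption for illustration).
child_view() {
    sh -c 'printf "%s\n" "${JULIA_CPU_TARGET:-<unset>}"'
}

unset JULIA_CPU_TARGET
JULIA_CPU_TARGET='generic'      # plain assignment: shell-local only
echo "without export: $(child_view)"

export JULIA_CPU_TARGET         # now inherited by child processes
echo "with export:    $(child_view)"
```

The Dockerfile above avoids this because values declared with `ARG` or `ENV` are placed in the environment of later `RUN` commands.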

Thank you so much for this! It's too late for me to edit my original post, but cutting the install time in half is a major win for me. Now it only needs to precompile, not also compile a sysimage.

... Actually, this only worked for Linux. My Windows container is back to precompiling every time, again, and Windows was the slow one that I wanted to fix in the first place. I had to revert this. Back to the drawing board. I wish Julia would print some diagnostics about why it decided to precompile again.

I think I finally nailed it? I needed to update JULIA_DEPOT_PATH since I was relocating the cache files. I knew that and was already doing it, but I didn't do it right: I needed a trailing semicolon so it would have an empty entry at the end! The documentation talks about this but I didn't understand it at the time: https://docs.julialang.org/en/v1/manual/environment-variable...
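A minimal sketch of the trailing-separator detail, using a hypothetical depot directory `/opt/julia-depot`. The separator is `:` on Linux and `;` on Windows. As I read the linked documentation, an empty entry at the end is expanded to Julia's default depots, so resources bundled with Julia stay reachable. The `ends_with_empty_entry` helper is only an illustration, not a Julia tool:

```shell
#!/bin/sh
# Hypothetical relocated depot; substitute the real location.
depot_dir='/opt/julia-depot'
sep=':'                                   # use ';' on Windows

# The trailing separator leaves an empty final entry in the list.
JULIA_DEPOT_PATH="${depot_dir}${sep}"
export JULIA_DEPOT_PATH

# Illustration only: succeed if the value ends with the separator,
# i.e. its last entry is empty.
ends_with_empty_entry() {
    case "$1" in
        *"$2") return 0 ;;
        *)     return 1 ;;
    esac
}

if ends_with_empty_entry "$JULIA_DEPOT_PATH" "$sep"; then
    echo "ok: $JULIA_DEPOT_PATH"
else
    echo "missing trailing separator: $JULIA_DEPOT_PATH"
fi
```

A check like this could run early in a CI job, since a missing trailing separator otherwise only shows up later as an unexplained precompile.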

That was the very first thing I tried, and I couldn't get it to work, but I'm sure I am doing something wrong. Everything seemed great at build time, and then it just precompiles again at runtime, without anything saying why it decided to do that. I'll give it another shot if you say it should be working. The PackageCompiler step is the longest part; if that can be removed, it'll make a big difference. I'd rather be wrong and have this working than the other way around :) I'll report back with what I find.

> It took me days to get that build to work; doing this compilation once in CI so you don't have to do it on every machine is trickier than it sounds in Julia

You may be interested in looking into AppBundler. Apart from full application packaging, it also offers the ability to make Julia image bundles. In addition to a sysimage compilation option, it can bundle an application via compiled pkgimages, which requires less RAM and is much faster to compile.