Hacker News
by beaub 1050 days ago
Compilers are in this weird spot where they are really mathematically defined programs (which OCaml excels at implementing), while also having high runtime efficiency as a requirement (the reason why C/C++ are such prominent languages for compilers).

With such requirements, I think it is fair to say that Rust is a great middle ground. It avoids the cost of automatic memory management and provides low-level control while also having a more powerful type system and a more "functional" style.

Brushing off the actual efficiency of the produced binary seems like a huge oversight when dealing with a compiler.

I am not sure that the runtime efficiency of the compiler binary is that important. People like fast compile times, but that is more to do with language design than the choice of language for the compiler.

You could write a compiler for Pascal in Python or another very slow language and it would be faster than a Rust or C++ compiler written in Rust or C++. That is because those languages have designs that make compilation algorithmically slow, while Pascal was designed to be fast to compile.

Almost every compiler is fast on toy-sized programs. E.g. the standard Java compiler is pretty fast and uses few resources.

It becomes visible when you build a large project: once you face 100k LOC, the efficiency of every part of the compiler starts to matter, and RAM usage may grow to uncomfortable levels if the compiler is not careful about it.

Yes, some compilers are slow on large programs because they don't scale well. Others aren't, because they do. That's what I said.

> People like fast compile times, but that is more to do with language design than the choice of language for the compiler.

People like fast compile times and people either like to use (or are forced to use) languages that are inherently slow to compile. That's exactly why compiler performance is absolutely critical.

> I am not sure that the runtime efficiency of the compiler binary is that important.

If the compiler is a JIT, then its efficiency will be important.

The cost of automatic memory management is latency and increased memory usage. In a soft real-time system like a game, the garbage collector may cause lag spikes so you miss the head-shot on your opponent. You also require at least 50% more memory for efficient automatic memory management. Throughput, however, is not one of the costs. You can in fact achieve higher throughput with automatic memory management than with manual memory management.

> while also having high runtime efficiency as a requirement (the reason why C/C++ are such prominent languages for compilers)

I'd like to believe that compiler engineers really put effort into compiler performance, but I just don't buy it.

LLVM, GCC, MSVC, etc.: all of them touch C/C++ and are slow as hell.

For compilers written in other languages, I'd say LLVM is still the bottleneck.

>It avoids the cost of automatic memory management and provides low-level control.

What "low-level control" do you need? It is not firmware development.

Btw: Microsoft rewrote their C# compiler from C++ to C#.

Any compiler that gets used at runtime (branded JIT, usually) ends up growing performance hacks or being written from scratch to run quickly. JavaScript engines tend to use multiple compilers, chosen by how frequently the code is executed. That's also what flags like -O0, -O3, -flto, -flto=thin and PGO are about: granting permission to burn different amounts of time during compilation.
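
A rough sketch of that "spend more compile time only on hot code" idea, in Python. The `TieredFunction` class and the `HOT_THRESHOLD` value are made up for illustration; this is not how any real JavaScript engine is structured, just the counting logic in miniature:

```python
# Hypothetical two-tier executor; names and threshold are illustrative only.
HOT_THRESHOLD = 3  # assumed cutoff; real engines tune this per tier


class TieredFunction:
    def __init__(self, source):
        self.source = source   # a Python expression over the variable `x`
        self.calls = 0
        self.compiled = None   # filled in once the function is considered hot

    def __call__(self, x):
        self.calls += 1
        if self.compiled is None and self.calls > HOT_THRESHOLD:
            # Pay the one-time compilation cost only for code that runs often.
            self.compiled = compile(self.source, "<hot>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {}, {"x": x})
        # Cold path: the source string is re-parsed on every call.
        return eval(self.source, {}, {"x": x})


f = TieredFunction("x * x + 1")
results = [f(i) for i in range(6)]
```

The first three calls take the cold path; from the fourth call on, the cached code object is reused.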

It's really easy to accidentally write code that walks off a performance cliff on unexpected input, but that's likely to get hacked around if someone reports it, as slow compilers do annoy people.
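
To make the "performance cliff" concrete, here is a toy example in Python. It assumes a hypothetical compiler pass that deduplicates symbol names; both function names are invented for this sketch. The list version looks fine when names repeat a lot, and goes quadratic when every name is distinct:

```python
def count_comparisons_list(names):
    """Deduplicate with a list; each new name is compared against all seen so far."""
    seen, comparisons = [], 0
    for name in names:
        for s in seen:
            comparisons += 1
            if s == name:
                break
        else:
            # No match found: remember the new name.
            seen.append(name)
    return comparisons


def count_lookups_set(names):
    """Deduplicate with a set; one hash lookup per name."""
    seen, lookups = set(), 0
    for name in names:
        lookups += 1
        if name not in seen:
            seen.add(name)
    return lookups


repeated = ["a"] * 1000                      # the "expected" input
distinct = [f"sym{i}" for i in range(1000)]  # the input nobody tested

print(count_comparisons_list(repeated))  # 999
print(count_comparisons_list(distinct))  # 499500
print(count_lookups_set(distinct))       # 1000
```

Same code, same input size, roughly 500x more work on the second input. That is the kind of thing that only shows up once someone feeds the compiler a large generated file.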

Ehh, I don't think we were talking about just JITs.

JITs are just one part of what falls under the general term "compilers".

Did Microsoft rewrite their C++ compiler in C#?

They did in the past: it was something like LLVM but based on MSIL, called Phoenix, and it was done by Microsoft Research.

In any case, it wouldn't make sense from the point of view of having a bootstrapped compiler.