Hacker News
by skitter 1280 days ago
I have asked this question before, but why write an entirely new frontend, which is an enormous task if you want to reach a similar quality to rustc? rustc_codegen_gcc¹ adds gcc as a backend to rustc alongside llvm, miri and (wip) cranelift. As a result, it always works with the newest version of rust and is already nearly complete after less work.

¹https://github.com/rust-lang/rust/tree/master/compiler/rustc...

6 comments

FSF wants to support Rust as a first-class language in GCC, which means their own implementation and ability to bootstrap without dependence on other projects.

Having two independent implementations is good for finding code depending on compiler bugs. The development will also highlight where the Rust language is not documented/specified enough yet.

The GCC implementation may end up with a different design, and perhaps faster compile times, or at least we'll know whether Rust is inherently slow to compile or whether that is just rustc. Because rustc supports multiple back-ends, it can't integrate with GCC's back-end as closely as GCC's own front-end can.

> Having two independent implementations is good for finding code depending on compiler bugs.

I'd go as far as to claim that having multiple independent implementations is a basic requirement for any programming language which has any aspirations of being production-ready.

That's just silly; there are a lot of production-ready languages that only have one full implementation, such as Typescript, Elixir, etc.

It would be pretty hard for a second implementation of TypeScript to gain any share of the market since Microsoft refuses to update the TypeScript specification. They point to tsc and say that is the specification. It is completely unprofessional in my opinion.
> (...) such as Typescript

It seems you're oblivious to the fact that projects such as swc[1] exist, and have been adopted by projects such as deno.

[1] https://swc.rs/

> Elixir etc

I'm not familiar with Elixir nor am I in a googling mood. Nevertheless that was an awfully short list. Why is that?

swc only does transpilation, it does not support TypeScript's type-checking which is its main feature.
The author of swc is currently working on a Rust-port of the type checker: https://github.com/dudykr/stc

Here is an interview with them: https://www.totaltypescript.com/rewriting-typescript-in-rust

Counterexample: the fact that there is only a single official Go compiler is a great advantage IMO, since it enables a lot of cool tooling, and great new features are introduced quickly in the official implementation.
> since it enables a lot of cool tooling

What leads you to believe that the lack of alternatives leads to enabling "a lot of cool tooling"?

> and great new features are introduced quickly in the official implementation.

That has zero to do with there being only a single implementation, and everything to do with having developers working on unstable releases and doing a poor job of thinking things through by writing stuff down as both proposals and specifications.

If there is a specification that was discussed and thought-through by competent individuals, nothing stops anyone in the world from reading the specs and implementing the same features.

> What leads you to believe that the lack of alternatives leads to enabling "a lot of cool tooling"?

Rust projects have settled on Cargo, which gives them a uniform way of building, testing, getting dependencies, docs generation (https://docs.rs), IDE support, etc. (https://lib.rs/development-tools/cargo-plugins)

Contrast this with C which has several compilers and plenty of build systems to choose from, often more flexible and advanced than Cargo, but the fragmentation means that C projects are snowflakes, and tooling needs to be configured for each project and compiler combo. For every build system X there's someone who says it sucks and you should be using Y instead.

However, I don't think GCC will cause similar fragmentation problems for Rust, because there are already 100K rustc+Cargo packages in existence, so it has no choice but to follow and be compatible with them.

Cargo is not universal. Many projects use rustc directly with an alternative build system such as bazel, make, or meson as they are in multilingual repos. The fact these projects exist doesn't affect your experience with rust. The fact a gcc tool chain for rust will exist should similarly not affect your experience.
Doesn’t Go also have a GCC version?
It does. GCCGO is often used for bootstrapping golang, and is sometimes used as a full alternative compiler for architectures that Google refuses to consider supporting.
These days lots of languages are single-team efforts, because producing high-quality production programming languages is an incredibly large amount of (difficult) work. It is really, really difficult, it takes many years, you constantly get shit on by people for everything, and if you're lucky you might get a tiny bit of money after a while. It isn't just coincidental complexity; programmers are actually incredibly demanding users with many needs. And so it isn't easy to find the money and people to perform multiple implementations, outside of very particular cases. Even the ratio of C/C++ programmers to high quality implementations that see significant production use is basically astronomical. And the unfortunate news is that alternative implementations don't often contribute very much to the overall production-language effort if they aren't of high quality or set out to achieve particular goals.

I used to work on the Haskell compiler GHC. Haskell has had many alternative implementations over the years. But they were often nowhere near the level of engineering quality of GHC, and nobody used them for nearly anything, unless they were graduate students exploring surface and language semantics. But many of them just used GHC instead, too. It wasn't for lack of trying either; lots of people wanted the standard for the language to work, and I used to be on the standards committee for the language. So they were useful to some extent for exploring the design space. But there are many, many, many other important concerns for a "production" programming language; robust feature support, good platform support, high quality code output, robust testing regime and code review, good library and API design, high quality debugging support, package management, editor support, all kinds of stuff. These also dovetail together; certain efforts at the semantic level are nice but if they don't have a good story for debugging or something, they might not make the cut.

Really this seems like one of those requirements that's always impossible to satisfy in some sense, a kind of catch-22. If you're one of the 800lb gorilla programming languages (Java, C++, Python), you might have one or two "production" implementations and if you're lucky the ratio of usage is better than 80/20, but they exist and therefore you are a "production-ready" language. Nobody else has the money to staff engineers for multiple alternative languages (OCaml, Elixir, TypeScript, Go) despite them funding billions of dollars worth of software, so they still aren't "professional" and therefore do not qualify. I get it, and it's not for zero merit, but I think it's mostly just a cat-and-mouse game, at the end of the day.

gcc-rs existing of course has many other benefits beyond this; people know GCC so they can now have a free Rust compiler for their CPU ports, etc. This is still good. But really it's still the exception. I do think this is a good effort but I consider this sort of thing one of those basic folklore issues that's actually much more nuanced in practice.

The number of people who use those alternative implementations -- or even the quality of their output with respect to performance or platform support -- doesn't really matter: the reason to insist on at least two separate implementations of something, as unrelated as possible, is to ensure that the behavior of the language isn't merely an accident of incorrect code in the compiler instead of an intentional choice by the language designers, as it shows that two different teams came to the same result (at which point hopefully it is noticed if there are spec issues and those are either corrected in the parent compiler or "bug for bug" duplicated in the child compiler after documenting the issue in the spec). Until you have done this, it is difficult to even claim what the language even is.
That's an antiquated notion from the times when programming language implementations were 99% proprietary, so you needed some ability to "shop around".

Any time you hit a fork button on an Open Source programming language's github page, you get your own independent implementation. Hire a contractor and introduce any "extensions" you need.

Because single implementation languages are toys. Any real language will have a multiplicity of implementations for different purposes. GCC has a ton of development resources behind it; processor makers are familiar with it and often turn to it to bring up new processors and ISAs, etc. Having a GCC front end is a big step in Rust becoming more popular as a systems language.
This simply isn't true. Only C/C++ people have ever cared about having multiple implementations and have to cling to the catastrophe that those two standards are because of it.
Somehow Python people, Java people, Ruby people, JavaScript people managed to produce multiple independent high-quality implementations.
On top of that, there are so many Lisp implementations, it's not even funny anymore.

There's a difference between having multiple implementations, even high quality ones, and caring about it. Python is all about the reference implementation, as much as I wish that people cared about pypy, and the JS community has always been openly hostile to SpiderMonkey compatibility and only cares about V8 in my experience. And Java is all about different forks of the same JDK, even if several actually independent ones used to exist.

I will admit to not being familiar enough with Ruby to say anything about that

Python has 1.5 implementations: pypy is close, but not that close to being a complete implementation.
Jython is pretty complete though, and IronPython is not completely dead.

Micropython can be counted as 0.25 or something :)

Oh, “only C/C++ people.” I.e. the people who wrote almost all the systems code in the last 40 years. Just those people.
> Because single implementation languages are toys. Any real language will have a multiplicity of implementations for different purposes.

I think that's what the parent was replying to, not somehow claiming C/C++ people are insignificant.

The catastrophe was that the C/C++ specification left too many things open for interpretation. Regarding "Only C/C++ people": there is at least one more language (Go) that has two implementations (the gc compiler and a gcc backend).
I didn't say only C/C++ had several implementations, I said that only the C/C++ communities care about their alternate implementations. The python community looks at pypy like a weird novelty and doesn't use it. The JavaScript community is often actively hostile to SpiderMonkey (firefox) and JavaScriptCore (Safari). As far as I've seen, GccGo is mostly ignored.
> This simply isn't true. Only C/C++ people have ever cared about having multiple implementations and have to cling to the catastrophe that those two standards are because of it.

This comment is so blatantly wrong and detached from reality that it makes me wonder if it's just a good old fashioned troll.

You'd be hard-pressed to find any production programming language which does not have multiple independent implementations.

> Because single implementation languages are toys

"Toys" like Go, OCaml, arguably Ruby & PHP, Perl, Erlang, Kotlin, ...

Go has a GCC implementation (which I use).

PHP has alternative VMs and specialized servers for serving it fast and in encrypted manner (they are closed sourced), tho.

Kotlin runs on JVM, which has at least three fully compliant implementations.

Erlang is a specialized language and telecommunications platform. It’s something different.

> PHP has alternative VMs and specialized servers for serving it fast and in encrypted manner (they are closed sourced), tho.

Well, if they are not public, it does not really have any influence on language specification.

> Kotlin runs on JVM, which has at least three fully compliant implementations.

And Rust runs on x86, which has a myriad of implementations; still, we're talking about the language compiler, not the underlying runtime.

> Erlang is a specialized language and telecommunications platform. It’s something different.

How so? It's a language built over a runtime, just like Java. I don't see how the specifics of the runtime have to do with anything.

For PHP, there is HHVM, which, AFAICT, can run vanilla PHP and is open.

TruffleRuby is a thing, as is JRuby.

Implementing Erlang means also implementing its VM and the whole infrastructure, which is a much larger task than merely a VM like Python's.

HHVM dropped vanilla-PHP compatibility after PHP7 started coming close to it in performance (which was the top reason anyone used HHVM to serve vanilla-PHP code) - now they’re focussing on Hacklang which, no longer shackled by the need to be bug-compatible with PHP’s awful design decisions, is free to become a better language
Also artichoke ruby https://www.artichokeruby.org/
Kotlin is a prime example. You have to download the vendor implementation. There's no way to bootstrap. It's horrible from a tooling perspective.

Any single-implementation language inevitably degrades to the point where the implementation becomes the specification. Shortly afterwards, people can only use the language if their tool chain, operating system, etc. closely match the understanding of the language authors. That's a bad situation to be in.

I don't see how the bootstrapping problem is linked to multiple implementations.

> Shortly afterwards, people can only use the language if their tool chain, operating system, etc. closely match the understanding of the language authors.

Perl, well-known for being an iffy language to install and running on a restricted set of OS/architectures.

Honestly curious, I know it exists, just like there used to be GCJ, but have you seen it being actually used?
Multiple implementations brought only trouble and frustration to the masses and endless opportunities for companies to exploit feature lock-ins (Borland, anyone?). If anything drives people mad, it is the gcc vs clang vs msvc vs this-or-that issues. Just the whole complex numbers or quad precision issue is the best example from math handling. It is basically very unsuccessful stakeholder management, to say the least. One of the main reasons for us to switch to Rust is not having multiple implementations.
I've wondered about this for ages and now know: the people who work on GCCRS wouldn't work on rustc_codegen_gcc for one reason or another (familiarity, culture, personal ideals...?). So GCCRS is not "eating away" at available bandwidth and there's no reason not to let GCCRS developers do their thing, even if rustc_codegen_gcc is a more straightforward way of achieving most goals
It's probably not hugely useful from a "I want to compile this Rust code" point of view but I imagine it will at least help iron out ambiguities and bugs in the various specs people are working on. I think there's at least MiniRust and the Ferrocene Language Specification:

https://www.youtube.com/watch?v=eFpHadbv34I

https://spec.ferrocene.dev/

Why not do it? There are lots of reasons to have multiple implementations of a language, one of them being that gcc is much easier to bootstrap than rustc.
rustc got bootstrapped already, so download it. If you want to run the compiler itself on a new architecture, cross-compile. If you still decide to bootstrap it again, that's something that will only need to be done once. And even in that case, you can use the second implementation that already exists, mrustc.
The same reasons given[1] for "why write clang when gcc works fine", mostly? Multiple implementations of a language force rigor in the specification[2], find edge cases in implementations, and promote healthy competition in performance.

[1] The good reasons, anyway. Please no GPL screaming.

[2] Something that IMHO rust has historically been kinda bad at. Even now there remains no clear specification I can find that explains exactly what behavior the borrow checker will admit vs. reject.

> The good reasons, anyway. Please no GPL screaming.

How is “we want a toolchain we can use as a library from non-GPL code” not a good reason?

The ability to link against LLVM and Clang has driven a massive advance in languages and tooling.