by kornish 3218 days ago
Favorite quote:

> There’s a wealth of tutorials, courses, books, and the like about how to write compilers. But if somebody believes that writing a transpiler isn’t fundamentally the same thing as writing a compiler, it may not occur to them to look at any of that material.

The basic argument is this: "compiler" isn't a term that needs to be limited to programs that transform a high-level input into a low-level output. Any program which is structured to map an AST to another AST, no matter the relative levels of abstraction, can borrow many of the principles of modern compiler theory, including the use of many small passes rather than monolithic rewrites.
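To make "many small passes" concrete, here is a minimal sketch in Python. The toy AST (`Num`, `Var`, `BinOp`) and the two passes are hypothetical and invented for illustration; they are not taken from the article. Each pass does one simple rewrite, and the "compiler" is just the passes applied in order.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Union

# Hypothetical toy AST: numbers, variables, and binary operations.
@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Num, Var, BinOp]
Pass = Callable[[Expr], Expr]

def bottom_up(rule: Pass) -> Pass:
    """Lift a single-node rewrite rule into a pass over the whole tree."""
    def run(e: Expr) -> Expr:
        if isinstance(e, BinOp):
            e = BinOp(e.op, run(e.left), run(e.right))
        return rule(e)
    return run

def fold_constants(e: Expr) -> Expr:
    # Evaluate an operation whose operands are both literals.
    if isinstance(e, BinOp) and isinstance(e.left, Num) and isinstance(e.right, Num):
        if e.op == "+":
            return Num(e.left.value + e.right.value)
        if e.op == "*":
            return Num(e.left.value * e.right.value)
    return e

def drop_identities(e: Expr) -> Expr:
    # x + 0 => x and x * 1 => x (right-hand identities only, for brevity).
    if isinstance(e, BinOp) and isinstance(e.right, Num):
        if (e.op, e.right.value) in {("+", 0), ("*", 1)}:
            return e.left
    return e

# The pipeline is just an ordered list of small passes.
PASSES = [bottom_up(fold_constants), bottom_up(drop_identities)]

def compile_expr(e: Expr) -> Expr:
    return reduce(lambda tree, p: p(tree), PASSES, e)

# (x + (2 + -2)) * 1
tree = BinOp("*", BinOp("+", Var("x"), BinOp("+", Num(2), Num(-2))), Num(1))
result = compile_expr(tree)
print(result)  # Var(name='x')
```

Neither pass knows about the other, so each can be tested alone, and adding a third rewrite means appending to `PASSES` rather than editing one large function.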

At a company I worked for, we compiled a high-level declarative expression language of our own devising into SQL and other back-end representations, including English. Is that a "compiler" as many see it? No. However, thinking about the problem like a compiler problem gave us lots of insights into how to architect our product; it let us draw on an existing wealth of knowledge to make improvements quickly and reliably.
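A rough sketch of what "one source language, several back ends" can look like, in Python. This is not the system described above: the node types (`Field`, `Lit`, `Cmp`, `And`) and both emitters are hypothetical, and a real SQL generator would need identifier quoting and parameter binding.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical AST for a tiny declarative filter language.
@dataclass(frozen=True)
class Field:
    name: str

@dataclass(frozen=True)
class Lit:
    value: int

@dataclass(frozen=True)
class Cmp:
    op: str  # one of ">", "<", "="
    left: "Node"
    right: "Node"

@dataclass(frozen=True)
class And:
    left: "Node"
    right: "Node"

Node = Union[Field, Lit, Cmp, And]

def to_sql(n: Node) -> str:
    """Back end 1: emit a SQL boolean expression."""
    if isinstance(n, Field):
        return n.name
    if isinstance(n, Lit):
        return str(n.value)
    if isinstance(n, Cmp):
        return f"{to_sql(n.left)} {n.op} {to_sql(n.right)}"
    if isinstance(n, And):
        return f"({to_sql(n.left)} AND {to_sql(n.right)})"
    raise TypeError(f"unknown node: {n!r}")

ENGLISH_OPS = {">": "is greater than", "<": "is less than", "=": "equals"}

def to_english(n: Node) -> str:
    """Back end 2: emit a plain-English description of the same tree."""
    if isinstance(n, Field):
        return n.name.replace("_", " ")
    if isinstance(n, Lit):
        return str(n.value)
    if isinstance(n, Cmp):
        return f"{to_english(n.left)} {ENGLISH_OPS[n.op]} {to_english(n.right)}"
    if isinstance(n, And):
        return f"{to_english(n.left)} and {to_english(n.right)}"
    raise TypeError(f"unknown node: {n!r}")

expr = And(Cmp(">", Field("age"), Lit(18)), Cmp("=", Field("order_count"), Lit(0)))
print(to_sql(expr))      # (age > 18 AND order_count = 0)
print(to_english(expr))  # age is greater than 18 and order count equals 0
```

Both emitters walk the same tree, so anything done to the AST before code generation (validation, simplification) is shared by every target.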


I'm curious: what are the canonical references which state that a compiler does high-level to low-level transformations?
Googling "what is a compiler" returns this: "a program that converts instructions into a machine-code or lower-level form so that they can be read and executed by a computer." So, that's something.

It's hard to get more canonical than the Dragon Book. The Dragon Book (2nd Ed.) says this in section 1.2: "Up to this point we have treated a compiler as a single box that maps a source program into a semantically equivalent target program."

So, the most canonical source doesn't include an explicit mention of high-level to low-level (though there may be sources I'm missing). But in my experience, that's definitely the connotation. Otherwise, the term transpiler, which connotes not outputting low-level code, never would have arisen.

The Dragon Book does not make a distinction between high-level and low-level output because fundamentally there is none. IMHO it is quite a stretch to claim otherwise just because the questionable term "transpiler" exists.
I apologize for not being clearer – I'm not claiming that they're fundamentally different. My original comment, to that point, agrees with the original article's author: compilers are just mappings to and from programs.

re: transpiler – all I'm saying is that the term arose because people associated compilers with low-level outputs.

In summary, there don't appear to be canonical definitions of compilers as having low-level outputs, but for some reason many speak of them that way.

There's also the etymology, i.e. the pre-computing dictionary definition: you compile, e.g., a list, i.e. make something smaller/shorter from a larger input. You also write a book when it's an original work, but another author or editor might take parts of yours and other books and compile an anthology. You might translate a book from one language to another, but that's not considered a compilation.
I've always been under the impression that compile means "put together" rather than "compress".
Webster lists in part:

Compiler: one who compiles - first use 14th century

And for compile:

transitive verb

1 : to compose out of materials from other documents

2 : to collect and edit into a volume

3 : to build up gradually <compiled a record of four wins and two losses>

Origin: Middle English, from Anglo-French compiler, from Latin compilare to plunder.

Synonyms: anthologize, collect

Note, I've skipped the dictionary references to computer use, as they seem overly (wrongly) focused on "top down" compilation...

Agreed. I had the same experience writing the parser for my $SHELL. The output was never going to be machine code, since the bulk of the code would consist of pipelining external processes and spawning subshells. So the output of the "compiler" is an AST-like in-memory structure holding the order of processes and the tokens for their parameters. However, I still found following tutorials about compiler design immensely helpful, since the problems I faced were largely the same even though the output generated was vastly different.
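For illustration only, a minimal Python sketch of such a structure. The names `Command`, `Pipeline` and `parse_pipeline` are hypothetical and do not come from the shell mentioned above. It assumes `|` is surrounded by whitespace and, as a known limitation, does not distinguish a quoted `"|"` from the pipe operator.

```python
import shlex
from dataclasses import dataclass
from typing import List

@dataclass
class Command:
    argv: List[str]  # program name followed by its parameter tokens

@dataclass
class Pipeline:
    stages: List[Command]  # processes in the order they are chained

def parse_pipeline(line: str) -> Pipeline:
    """Turn 'a x | b y' into an ordered list of commands.

    Sketch only: relies on shlex.split for quoting rules and treats any
    standalone '|' token as the pipe operator.
    """
    stages: List[Command] = []
    current: List[str] = []
    for token in shlex.split(line):
        if token == "|":
            if not current:
                raise ValueError("empty command before '|'")
            stages.append(Command(current))
            current = []
        else:
            current.append(token)
    if not current:
        raise ValueError("empty command at end of pipeline")
    stages.append(Command(current))
    return Pipeline(stages)

pipeline = parse_pipeline("cat access.log | grep 'GET /index' | wc -l")
for stage in pipeline.stages:
    print(stage.argv)
```

An executor would then walk `pipeline.stages` and connect each process's stdout to the next one's stdin, which is the "back end" in compiler terms.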
> At a company I worked for, we compiled a high-level declarative expression language of our own devising into SQL and other back-end representations, including English. Is that a "compiler" as many see it? No. However, thinking about the problem like a compiler problem gave us lots of insights into how to architect our product

Wow. Snap. Except we never got it to compile to English (well, we probably could have but the result would have had deeply nested bracketed clauses...)

> But if somebody believes that writing a transpiler isn’t fundamentally the same thing as writing a compiler,

I'd be surprised if anyone did.

The use of transpiler is more about audience expectation, a specificity.

It's shorter than writing "source-to-source compiler", and acknowledges compiler as its superset, right there in its name

That's unfortunately not my experience. I'm appalled every time someone tells me "But Scala.js is not a compiler, it's a transpiler, since it compiles to JS!"

I assume other language users and authors suffer the same kind of comments on a regular basis.

I'm now horrified to learn that your experience is the norm.

And that the cognitive dissonance needed to say:

> ...is not a compiler, it's a transpiler, since it compiles to...

is alive and well.

Mea culpa.

Welcome to the webdev world. It keeps inventing new terms for things that have had established terms for decades, probably because it's rare that anyone bothers to look back at the decades of stuff not done in JS...
I'd argue that it's a compiler if it treats JS as a lower language - that the output is simply an intermediate artefact for executing Scala on a JS engine, and not meant for human consumption. If, on the other hand, the output is meant to be developed further by hand, if the output is considered an equivalent and not lower representation, then it's a transpiler.
> If, on the other hand, the output is meant to be developed further by hand, if the output is considered an equivalent and not lower representation, then it's a transpiler.

Which is also still a compiler.

I would argue that it isn't. If it is, then what's the distinction between a compiler and a transpiler?
If all birds are dinosaurs, then what's the distinction between a dinosaur and a bird? Well, there used to be dinosaurs who weren't birds.

Likewise, there are compilers I wouldn't call transpilers.

> It's shorter than writing "source-to-source compiler"

Brevity isn't everything. But more importantly, "transpiler" (like "source-to-source compiler") does not say what you are compiling from, and what you are compiling to. In order for the term "transpiler" to be useful, you need to specify those things. Anything you think you imply by using the term is not, in fact, implied.

If you compare the lengths of "JavaScript to Pascal compiler" and "JavaScript to Pascal transpiler", you might be in for a surprise!

I think the current definition of "transpiler", the way it's used in the wild, means "compiles $something to JavaScript". I haven't seen anyone use this word for anything other than compiling to JS.
It may lack some specificity, but it makes you look for the terms for "from" and "to".

It eliminates bytecode compilers and native compilers from the conversation immediately.

> It's not a native compiler, it's a transpiler

If this was said, you wouldn't then ask if it compiled for a VM, or if it could directly produce small binaries.

I wouldn't say the term is completely redundant. Only when you introduce it for the first time.

E.g.

> That's not possible. It's a transpiler. Back to the topic at hand...

You got me thinking. Can a human language like English be defined with an AST? If so, are there examples? If not, why not? I suspect the answer might be, "yes, it's called [this thing I've heard of a thousand times but never considered it to be a language compiler]"
I've heard of sentence diagrams[0] which are tree-like structures used for checking grammatical correctness.

[0] https://en.m.wikipedia.org/wiki/Sentence_diagram
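
As a rough illustration of a sentence held in a tree, here is a small Python sketch. It uses a constituency-style tree (sentence, noun phrase, verb phrase), which is related to but not the same as the sentence diagrams linked above. The `Phrase` type and the labels are assumptions chosen for the example, and real English is ambiguous enough that one sentence can have several valid trees.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Phrase:
    label: str  # e.g. "S", "NP", "VP" (hypothetical label set)
    children: List[Union["Phrase", str]] = field(default_factory=list)

def words(node: Union[Phrase, str]) -> List[str]:
    """Collect the leaf words left to right."""
    if isinstance(node, str):
        return [node]
    out: List[str] = []
    for child in node.children:
        out.extend(words(child))
    return out

def bracketed(node: Union[Phrase, str]) -> str:
    """Render the tree in labelled-bracket notation."""
    if isinstance(node, str):
        return node
    inner = " ".join(bracketed(c) for c in node.children)
    return f"[{node.label} {inner}]"

sentence = Phrase("S", [
    Phrase("NP", [Phrase("Det", ["the"]), Phrase("N", ["dog"])]),
    Phrase("VP", [
        Phrase("V", ["chased"]),
        Phrase("NP", [Phrase("Det", ["the"]), Phrase("N", ["cat"])]),
    ]),
])

print(" ".join(words(sentence)))  # the dog chased the cat
print(bracketed(sentence))
```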