by closure 3596 days ago
I do wish people would stop using the term transpiler as if there is something novel or different about source-to-source translation.

If you're taking a higher level language and compiling it to assembly, it's a compiler.

If you're taking a higher level language and compiling it to LLVM IR, it's a compiler.

If you're taking a higher level language, and compiling it to another higher level language...it's a compiler.

I am not sure where this term was first coined, but we've been doing source-to-source compilation for a very long time and it's just within the last 7-10 years I've started hearing people use this term, which IMO adds no actual value.

It's perfectly fine to say "Swift to JS compiler".

9 comments

> it's just within the last 7-10 years I've started hearing people use this term

If I recall, the term arose out of the JS community around the time CoffeeScript, Traceur and other languages that compile to JS got popular. Brendan Eich did a lot of popularization of the term.

> IMO adds no actual value.

I think it's a really useful term. Compilers and transpilers are structurally different and have different high level aims.

If you tell me you are writing a language that compiles to JS, then my assumption is that you treat JS as effectively an "assembly language for the web". You will output whatever bizarre JS code that happens to correctly implement your language's semantics efficiently and compactly. It might make my eyes bleed, but that's OK. I'm not supposed to read it, maintain it, or debug it. Think Emscripten or dart2js.

If you tell me you are writing a language that transpiles to JS, then my assumption is that you treat the output JS as a first-class developer artifact. The generated output should, as much as possible, match the structure, formatting, and naming of the original code. I should be able to read it and step through it in my debugger. If the source language's maintainers disappear, I can check in the compiled output and move on with my life using it as vanilla JS. Think CoffeeScript or TypeScript.

Those are very very different kinds of tools, and it's handy to have words to distinguish them.
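To make the contrast concrete, here is a rough sketch of the two output styles from the same input. Everything in it is made up for illustration (the `Assign` node, `emit_readable`, `emit_opaque`); it is a toy, not how CoffeeScript, TypeScript, Emscripten or dart2js actually generate code.

```python
from dataclasses import dataclass


@dataclass
class Assign:
    """Hypothetical one-statement AST node: name = left <op> right."""
    name: str   # variable being defined
    left: str   # left operand (a variable name)
    op: str     # binary operator, e.g. "+" or "*"
    right: str  # right operand (a variable name)


def emit_readable(program):
    """'Transpiler' style: keep the original names, one statement per line."""
    return "\n".join(f"var {s.name} = {s.left} {s.op} {s.right};" for s in program)


def emit_opaque(program):
    """'Assembly for the web' style: rename locals and pack everything together."""
    short = {}  # maps locally defined names to generated short names
    parts = []
    for s in program:
        # Operands defined earlier in the program get their short name;
        # anything else (an external input) keeps its original name.
        left = short.get(s.left, s.left)
        right = short.get(s.right, s.right)
        short[s.name] = f"_{len(short)}"
        parts.append(f"var {short[s.name]}={left}{s.op}{right};")
    return "".join(parts)


program = [
    Assign("subtotal", "price", "*", "quantity"),
    Assign("total", "subtotal", "+", "tax"),
]
print(emit_readable(program))
print(emit_opaque(program))
```

Both outputs compute the same thing; only the first is something you would want to check in and maintain by hand.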

I agree that in "Swift to JavaScript transpiler" the word could be replaced with compiler and nothing would be lost; however, by your own definitions you can see that 'transpiler' is more specific than 'compiler.' Someone out there saw a need for the word, just like someone saw a need for coupe and sedan.

You shouldn't worry until academia starts using it. Oops. Here's an article with 'transpiler' in the title: http://dl.acm.org/citation.cfm?id=2173694

It's a good thing that English is descriptive and not prescriptive.

I don't think it's a meaningful term because there is no reasonable definition, specifically of where you draw the line.

cfront compiled C++ to C, so it's a transpiler, right? Or is that just an unimportant implementation detail?

Everything targeting LLVM IR is a transpiler, because it's just source-to-source, with the second source being LLVM IR.

Similarly things targeting the JVM or CLR - all transpilers because both of these are relatively high level targets. They might not be something you would consider a "source language", but for many people assembly language is a perfectly reasonable source language, and thus...

Everything compiling to assembly language is a transpiler. The target language is assembly.

Having worked on compilers for over two decades it just comes across as a made-up term by people who don't understand that what a compiler does is translate from one language to another (be it another "source language", ASTs, linear IRs, VM bytecodes, assembly languages, or executable code).

I don't think I'm being particularly pedantic here. Compilers simply generate another form that one person or another might consider a source language (in fact people working on JITs most definitely think of their input as the source language).
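As a rough illustration of that point, the toy below translates one expression tree into two target forms: infix text that looks like source, and a linear stack-machine listing. The node types, the instruction names (`push`, `add`, `mul`) and the little `run` interpreter are all invented for this sketch and do not correspond to any real compiler or VM.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str          # "+" or "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]


def to_source(e):
    """Translate the tree to infix text, a form most people would call source."""
    if isinstance(e, Num):
        return str(e.value)
    return f"({to_source(e.left)} {e.op} {to_source(e.right)})"


def to_stack_code(e):
    """Translate the same tree to a linear listing for a made-up stack machine."""
    if isinstance(e, Num):
        return [f"push {e.value}"]
    opname = {"+": "add", "*": "mul"}[e.op]
    return to_stack_code(e.left) + to_stack_code(e.right) + [opname]


def run(code):
    """Tiny interpreter for the stack listing, to check the translation."""
    stack = []
    for instr in code:
        if instr.startswith("push "):
            stack.append(int(instr[5:]))
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr == "add" else a * b)
    return stack.pop()


tree = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
print(to_source(tree))
print(to_stack_code(tree))
print(run(to_stack_code(tree)))
```

The two emitters are the same kind of tree walk; whether the output counts as a "source language" depends on who is reading it.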

Yes, cfront could be considered a transpiler, but the term wasn't coined back then.

IRs, ASTs and bytecodes are not programming languages. If you insist otherwise because "some might consider it that way", you have no business criticizing the use of words, to be honest.

There are textual representations for these things, but even then people rarely program in those (LISP being an exception). Assembly language is a textual representation for humans to read and sometimes program in. If you have some program that actually translates to assembly language as a target, by all means, call it a transpiler if you want.

> IRs, ASTs and bytecodes are not programming languages. If you insist otherwise because "some might consider it that way", you have no business criticizing the use of words, to be honest.

That's not what I am saying. The point is that compilers make many transformations from "language" to "language", it just happens that other "source languages" are sometimes the target.

> If you have some program that actually translates to assembly language as a target, by all means, call it a transpiler if you want.

For a very long time that was the model most compilers used, and in fact people (mostly those claiming to be advocates of the UNIX philosophy of building small tools that could be chained together) would rail against compilers that directly generated machine code rather than generating assembly that was run through a separate assembler.

So if a compiler that generates assembly source is a transpiler, then by definition most compilers are transpilers (and all transpilers are compilers, which people seem to agree with).

I agree with munificent. ‘Transpiler’ implies very similar semantics between the source and target language; ‘compiler’ implies fairly different semantics (where a translation between the source and target is not obvious).

Some people treat C as portable assembly.

Does that make gcc a transpiler for converting portable assembly to platform-specific assembly due to their similar semantics?

I knew someone would bring this up.¹

GCC is an optimizing compiler, at which point all bets are off for obvious/trivial translation between C and asm. You could, hypothetically, write an optimizing TypeScript compiler that emits asm.js.

However, while you can reasonably argue C started as a portable assembly, significant differences like a type system, pointer arithmetic, a calling convention, etc, exist now. I don't think it's unreasonable to say C is significantly higher-level than assembly.

1. My parenthetical at the end was anticipating this—then again, the translation between C and asm frequently is easy, albeit tedious.

I guess you could replace gcc with a dumb compiler / turn off optimizations. And throw in compiler intrinsics while we're at it; after all, people write hot loops with them. And pointer arithmetic actually translates nicely to all those fancy addressing modes you find on x86.

And in the other direction one could argue that those so-called "transpilers" actually compile from a higher-level language when you consider complex type-inference and static type checking. Optimizing is not all a compiler does.

I don't buy this definition at all.

First, the "semantics" had better not change, either by an optimizing or non-optimizing compiler, or you've got a miscompile.

Second, I've seen several source-to-source compilers (that the proponents of the term transpiler would certainly call transpilers) that do non-trivial transformation, e.g. to try to capture certain idioms, or that are required as a result of an "impedance mismatch" between the languages.

> I am not sure where this term was first coined

The term has been in use in academic literature since at least the early 1960s.

http://comjnl.oxfordjournals.org/content/7/1/28.full.pdf+htm...

> we've been doing source-to-source compilation for a very long time

Yes and we've been calling them transpilers all along!

I'm a programming language researcher and I think the term is very useful to distinguish between different kinds of compilers.

Sorry but can you point to the particular use of "transpiler"?

I've skimmed this several times and have now run it through a PDF OCR program, and am not finding it.

Even if this paper does use the term, saying "we've been using it all along!" is hardly accurate, because this has hardly been a well-known term until very recently. I've read hundreds of academic compiler papers (mostly on the optimization & code generation side of things) going back to the late fifties, and it's a term I've only seen used in the very recent past.

The second to last paragraph. And they use the term in quotes which makes it appear like it could have been new at that time.

Transpiler means high-level to high-level translation. Compilers which implement such a translation have properties and challenges in common. It's a subset of compiler. I don't know how anyone can think it's wrong to have a term, whether it's new or not, to refer to a useful subset like that.

What harm or misunderstanding could it possibly cause?

Thanks for pointing out the reference - the OCR apparently stopped part-way through the document.

The harm comes from having a definition that is so loose as to be meaningless. You're now saying "high-level to high-level", but most people use the terms "source language to source language", and several people here have admitted that a compiler that generates assembly should be considered a transpiler.

If you're going to define it as something that goes from one "high-level" language to another, then you need to define high level. Is C high level? Is C++? Is HLSL (there are lots of HLSL <-> GLSL translators)?

Also, in almost every context that I see "transpiler" used, people say that it's a transpiler from X-to-Y, which adds zero value above saying it's a compiler from X-to-Y.

People usually don't feel the need to qualify what they are compiling from or to, despite the fact that many compilers target (or use as a source) many kinds of languages and/or intermediate representations.

The overall point here is that compilers (of all kinds) are simply translators - some from a high level source language to one form of (textual or otherwise) IR, some from one form of IR to another, some from one source language to another (be it high level or not), etc. Saying one is a "transpiler" and others are not, especially when you give adequate context and say what the source and destination forms are, just doesn't add value.

Calling it a compiler is certainly accurate but transpiler seems more appropriate in this context.

Yep saying transpiler is just being more specific, like saying car instead of automobile - although it is probably superfluous when including the names of both input and output languages.

X86 to Microcode "Transpiler"

Microcode to Electron "Transpiler"

Transpiler vs compiler is a useful distinction.

If you could just as well program directly in the target output/language for the problem domain, then it's a transpiler.

By that definition every compiler that generates assembly language is a transpiler, rendering the term meaningless.

Perhaps you don't think you could "just as well" program in assembly language, but there are many many people who do, perhaps not in your own problem domain.

> Perhaps you don't think you could "just as well" program in assembly language, but there are many many people who do, perhaps not in your own problem domain.

I don't think it matters if there are "many people who do" -- as long as they are still a small minority compared to those who don't. After all, you can find people believing anything; I'm sure some are even writing web apps in Assembly.

That said, C to Assembly (as opposed to a binary executable) could be said to be transpiling too -- from "portable assembly to assembly".

So the number of people doing something is how you determine the definition of a word?

That really makes no sense.

Err, number of people agreeing on a specific meaning is exactly what determines the definitions of a word.

Same for usage of things.

There will always be outliers for whom a thing is better used for Y rather than X, but in the end it is what the majority sees the tool as useful for that determines how it's defined (in casual use, dictionaries, etc.).

Language can change. You gave three distinct concepts that all map to the same word. That alone shows why that change may be beneficial.