Hacker News
by jermaustin1 2165 days ago
Instead of outputting C, could you not just output the equivalent assembler?

so instead of

    self.emitter.emitLine("printf(\"" + self.curToken.text + "\\n\");")

you do something like

    self.emitter.emitLine("STRING DB '" + self.curToken.text + "', '$'")
    ...
    self.emitter.emitLine("LEA DX,STRING")
    self.emitter.emitLine("MOV AH,09H")
    self.emitter.emitLine("INT 21H")

Well, what you just posted already highlights one difficulty: when you come to the PRINT statement, you have to emit the string and the instructions to two different places, so we're already talking about having two different emitters, or some other way of handling this. And we need to generate different non-conflicting names/addresses depending on architecture.
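
To make the two-destinations point concrete, here is a minimal sketch of an emitter that keeps separate data and code buffers and hands out unique labels. `AsmEmitter` and its methods are hypothetical names, not the tutorial's emitter, and the output reuses the DOS-style instructions from the parent comment without checking them against a real assembler:

```python
class AsmEmitter:
    """Hypothetical emitter with separate data and code buffers."""

    def __init__(self):
        self.data = []     # string declarations
        self.code = []     # instructions
        self.next_id = 0   # counter used to build unique labels

    def new_label(self, prefix="STR"):
        # Each call returns a label that has not been used before.
        label = f"{prefix}{self.next_id}"
        self.next_id += 1
        return label

    def emit_print(self, text):
        # The declaration and the instructions go to different buffers.
        # Assumption: text contains no single quote or '$' (no escaping here).
        label = self.new_label()
        self.data.append(f"{label} DB '{text}', '$'")
        self.code.append(f"LEA DX,{label}")
        self.code.append("MOV AH,09H")
        self.code.append("INT 21H")

    def render(self):
        # Real section layout is assembler-specific; this only concatenates.
        return "\n".join(self.data + self.code) + "\n"
```

Calling `emit_print` twice gives two declarations with distinct labels in `data` and two instruction groups in `code`, which is roughly the bookkeeping the single-buffer emitter doesn't do.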

You said in your other post that this can be done with minor modifications, but I can already foresee a few modifications that would need to be made which aren't minor.

And then there's the problem that you may want to target more than one architecture. We can write two completely different code generators, but it would be nice if there were an architecture that could share some of the code.
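
A rough sketch of what sharing code between targets could look like: one driver, with each target behind the same small interface. `CTarget`, `DosAsmTarget`, and `compile_prints` are made-up names for illustration, and only a PRINT of a string literal is covered:

```python
class CTarget:
    """Emits C for each PRINT, similar to what the tutorial does."""

    def __init__(self):
        self.body = []

    def print_string(self, text):
        self.body.append(f'printf("{text}\\n");')

    def render(self):
        lines = ["#include <stdio.h>", "int main(void){"]
        lines += self.body + ["return 0;", "}"]
        return "\n".join(lines) + "\n"


class DosAsmTarget:
    """Emits DOS-style assembly; data and code are kept apart."""

    def __init__(self):
        self.data = []
        self.code = []

    def print_string(self, text):
        label = f"STR{len(self.data)}"  # unique per declaration
        self.data.append(f"{label} DB '{text}', '$'")
        self.code += [f"LEA DX,{label}", "MOV AH,09H", "INT 21H"]

    def render(self):
        return "\n".join(self.data + self.code) + "\n"


def compile_prints(strings, target):
    """Stand-in for the shared front end: the same loop drives any target."""
    for s in strings:
        target.print_string(s)
    return target.render()
```

The part that is shared is only the driver. Everything target-specific still has to be written once per target, which is the cost being described here.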

I honestly can't tell if you are just trolling now, and I'm falling for it, but you seem to think this 2000-word tutorial on the basics of compiler design (lexing, parsing, emitting) is supposed to be the one and only document you will ever need to create the next C++.

> You said in your other post that this can be done with minor modifications

And it probably can, depending on the flavor of assembly you want to use. There are dozens (hundreds?) of them, and I'm sure some will allow you to inline the string declaration. The example I gave probably doesn't even work, since I haven't programmed in 8086 assembly in close to 20 years and I don't even remember how to set up data blocks and code blocks in it anymore.

> And then there's the problem that you may want to target more than one architecture.

This is a toy compiler written by a professor of computer science meant to teach you the basics of building a compiler (lexing, parsing, emitting). This isn't a tutorial on building the next GCC.

> I honestly can't tell if you are just trolling now, and I'm falling for it, but you seem to think this 2000-word tutorial on the basics of compiler design (lexing, parsing, emitting) is supposed to be the one and only document you will ever need to create the next C++.

That's a fair criticism.

I'm frustrated with the lack of material on emitting assembly, but it wasn't right of me to take that out on the author of this post. I apologized in a different post.

> And it probably can, depending on the flavor of assembly you want to use, there are dozens (hundreds?) of them

How about one I can run on my machine? There are maybe 5 that are useful targets I can think of:

   * x86 or ARM (depending on your machine)
   * LLVM
   * GCC RTL
   * WebAssembly
   * Parrot? Maybe the JVM has some low-level bytecode?

There's certainly equivalent assembler, but depending on the architecture you're targeting, this can be a pretty monumental task. Even a single function call can be north of a hundred lines or something (I'm making this up :D)

I guess that's why we have things like LLVM that allow you to generate intermediate representations that get converted to a bunch of different instruction sets.
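
As a rough illustration of the intermediate-representation route, the same PRINT could be emitted as textual LLVM IR instead of C. This is only a sketch: `llvm_escape` and `emit_llvm_prints` are hypothetical helpers, the IR uses the opaque `ptr` type (so it assumes a reasonably recent LLVM), and it calls the C library's `puts`, which appends the newline that the `printf` version writes explicitly:

```python
def llvm_escape(text):
    """Return (escaped text, byte count) for an LLVM c"..." constant."""
    raw = text.encode("utf-8") + b"\x00"  # NUL-terminate for puts
    out = []
    for b in raw:
        ch = chr(b)
        if 32 <= b < 127 and ch not in '"\\':
            out.append(ch)             # printable ASCII is kept as-is
        else:
            out.append(f"\\{b:02X}")   # everything else as a \XX hex escape
    return "".join(out), len(raw)


def emit_llvm_prints(strings):
    """Build an IR module whose main() prints each string via puts."""
    globals_, body = [], []
    for i, s in enumerate(strings):
        escaped, n = llvm_escape(s)
        # String constants are module-level globals, separate from the code.
        globals_.append(
            f'@.str.{i} = private unnamed_addr constant [{n} x i8] c"{escaped}"'
        )
        body.append(f"  %r{i} = call i32 @puts(ptr @.str.{i})")
    lines = globals_ + ["declare i32 @puts(ptr)", "define i32 @main() {"]
    lines += body + ["  ret i32 0", "}"]
    return "\n".join(lines) + "\n"
```

The generated text would then go through LLVM's own tools to reach a concrete instruction set. Note that the strings still land in a different place than the instructions, so the emitter needs the same kind of bookkeeping as with hand-written assembly.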

For sure, I'm just telling the GP that their request for something that outputs assembly could be done with minor modifications, depending on the assembly language they want to output.

Gotcha. I also get the feeling that GP was more interested in being right than in actually knowing how to get the emitter to emit assembly. It felt like the follow-up was just gonna be ... well WHAT ABOUT BINARY?

Nah, assembly -> binary isn't that hard. ;)