by EllipticCurve 2239 days ago
This is my first compiler, so I expect there to be several things I can improve :)

I tried to follow the Assembly calling conventions the best I could.

I am looking forward to any feedback!

6 comments

Sorry I won't be able to give you a lot of feedback.

As a matter of preference, I really like colons and semi-colons.

That said, your work is amazing. This is a true example of simplicity. I don't think most people realize how difficult it is to keep things simple.

Everything is clear. I can read the source code without asking myself "what is that?"; everything makes sense.

Thanks for presenting your work.

Do you have any constraints, like "no meta programming", "generated libraries should be as compatible with C as possible", "it should have one-pass optimization", or even "the compiler must be embeddable in as many places as possible"?

Nothing wrong with semi-colons. Everyone has a different preference anyway.

Thanks for saying all that. It was a huge amount of work and getting appreciation makes it all worth it!

No hard constraints as of now. But I don't think I want to include meta programming or a pre-processor (I don't really like them, to be honest). I do want to keep it compatible with C internally, at the assembly level. One thought is to create a file with function headers/definitions that are then dynamically linked and can just be used.

I used some C std library functions that way for debugging (printf, ...). And as I follow the standard calling conventions, the compiler should automatically generate compatible code.

With this, it would also be possible to write OpenGL code. That would be really awesome :)

As for uses of my language: not sure yet. Up until now, the road was more the goal than the finished language.

Congratulations! It looks like a useful and practical design. How is the performance of the generated code? What are you thinking about doing for memory management? Have you thought about using an intermediate representation to make optimization and retargeting easier?

Thanks :) That's where I was trying to get to. I did some smaller performance comparisons against C (with -O0), where I was at about 90% of the speed. But there is a lot of performance to gain if I optimize the resulting assembly. There are lots of cases where I push and then pop directly afterwards because of the general expression code generation (no real knowledge of the broader context). So I expect that to help a lot with performance. Also, things like jump tables for simple switches are on the table.
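To illustrate the push/pop point, here is a minimal sketch of a peephole pass that cleans up such pairs. It is not taken from the project's source: the function name `peephole`, the Intel-style syntax, and the one-instruction-per-line input are all assumptions made for illustration. It only handles a `push` of a register or immediate that is immediately followed by a `pop` into a register.

```python
import re

# Hypothetical patterns: one instruction per line, Intel syntax assumed.
PUSH = re.compile(r"^\s*push\s+(\w+)\s*$")
POP = re.compile(r"^\s*pop\s+(\w+)\s*$")

def peephole(lines):
    """Collapse each adjacent `push X` / `pop Y` pair.

    `push X; pop X` is dropped entirely; `push X; pop Y` becomes `mov Y, X`.
    Anything else is copied through unchanged.
    """
    out = []
    i = 0
    while i < len(lines):
        push = PUSH.match(lines[i])
        pop = POP.match(lines[i + 1]) if i + 1 < len(lines) else None
        if push and pop:
            src, dst = push.group(1), pop.group(1)
            if src != dst:
                out.append(f"    mov {dst}, {src}")
            i += 2  # skip both instructions of the pair
            continue
        out.append(lines[i])
        i += 1
    return out

asm = [
    "    push rax",
    "    pop rbx",
    "    push rcx",
    "    pop rcx",
    "    add rax, rbx",
]
print("\n".join(peephole(asm)))
```

A single pass like this only sees pairs that are directly adjacent; pairs separated by other instructions would need more context to rewrite safely.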

Yes, I thought about going for LLVM or another representation but decided to do it once myself (no given performance optimizations or the like) with room for improvement.

Very cool! Reading the "Why" section resonated with me, as I think creating one's own language is something every programmer should do for the experience.

The syntax flows well, in fact it feels very intuitive for my taste. I like the type inference and no semicolons. I wonder if the latter posed any trickiness, for example, with the next line starting with an expression "(" or operator like "+".

I'm also curious about use cases, what is possible with this language. I guess anything assembly can do - which is..everything? :) Would it be suitable to run on microcomputers like Raspberry Pi?

EDIT: The Pi and Arduino are typically ARM, it seems, with a different instruction set. Well, that shows how little I know.

:) I (now) agree. In fact, each part isn't even really that hard. It's mostly just a lot of work. And then code generation. I had some headaches with multiple return values and keeping the calling conventions intact... And then with structs as well.

That language flow and general simplicity was one of my most important goals. Thanks for noticing :)

No, I had no problems with that. What you mention ('+', '(') are all part of simple expressions when parsing. And I strictly parse right-recursively and re-order the expressions later (for operator precedence). So that was not an issue. I solved most of these problems by giving my parser a lookahead of more than 1. In a few cases, there is a lookahead of 3 to determine what exactly should be parsed.

I guess anything that can run an x86-64 ELF executable? ;) Although there is still a lot missing for it to be taken seriously, starting with strings, files, input, ... But that's for another time or whenever I need it, I guess.
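For readers wondering what "parse right-recursively, re-order later" can look like, here is a small sketch of one possible approach. It is an assumption for illustration, not the project's actual code: the names `Bin`, `PREC`, `parse_right_recursive`, `attach`, and `reorder` are hypothetical, all operators are treated as left-associative, and operands are plain integers (parenthesized groups would have to be marked as atoms so they are never split).

```python
import operator
from dataclasses import dataclass

# Higher number binds tighter; every operator here is left-associative.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

@dataclass
class Bin:
    op: str
    left: object   # int or Bin
    right: object  # int or Bin

def parse_right_recursive(tokens):
    """expr := operand (op expr)?  Precedence is ignored at this stage."""
    if len(tokens) == 1:
        return tokens[0]
    return Bin(tokens[1], tokens[0], parse_right_recursive(tokens[2:]))

def attach(op, left, right):
    """Combine `left op right`, where `right` is already correctly ordered."""
    if isinstance(right, Bin) and PREC[op] >= PREC[right.op]:
        # `op` binds at least as tightly as the root of `right`, so it may
        # only capture the leftmost operand of `right`: rotate to the left.
        return Bin(right.op, attach(op, left, right.left), right.right)
    return Bin(op, left, right)

def reorder(node):
    """Fix up a right-recursive tree from the innermost expression outward."""
    if not isinstance(node, Bin):
        return node
    return attach(node.op, node.left, reorder(node.right))

def evaluate(node):
    if not isinstance(node, Bin):
        return node
    return OPS[node.op](evaluate(node.left), evaluate(node.right))

tokens = [1, "-", 2, "*", 3, "-", 4]
naive = parse_right_recursive(tokens)   # 1 - (2 * (3 - 4))
fixed = reorder(naive)                  # (1 - (2 * 3)) - 4
print(evaluate(naive), evaluate(fixed))
```

The parser stays trivial because it never has to decide how tightly an operator binds; that decision is made entirely in the re-ordering step.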

Thank you for this. I like that it's small enough to be motivational instead of being complex and intimidating.

The world needs more of these for other complex topics.

That is really motivating, thank you for saying that!

Very cool. What resources did you use to learn how to write the compiler? The "turn code into ASM" step has always mystified me, and I'd like to learn more about how that part of the process works.

Those are some of my currently open tabs :) Lots of Google use on top of that. The parser is actually quite straightforward. The much harder part (for me) was the code generation afterwards (no experience with assembly so far).

- General compiler design - This was one of the main resources. https://www.tutorialspoint.com/compiler_design/compiler_desi...

- https://www.lua.org/manual/5.3/manual.html#9

- https://www.godbolt.org/

- Linux system call table: http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x...

- A bit on floating point: https://cs.fit.edu/~mmahoney/cse3101/float.html

- Assembly: https://www.cs.yale.edu/flint/cs421/papers/x86-asm/asm.html

- More assembly: https://www.complang.tuwien.ac.at/ubvl/amd64/amd64h.html

Very cool. I'm writing a compiler and language myself and I was on the fence about anonymous structs / tuples vs multiple returns, and seeing multiple returns in your examples nudged me that way.

I'm also influenced by Lua and I picked it up in your grammar right away : )

It's a bit of a pain to implement in some cases, but worth it :)

Let me know if you have questions or need hints on where to find relevant details in my code.