Nice! I think doing this is a great learning experience. I think I understand how to write a basic compiler but I am completely stumped by writing an optimizer. How do you model that?
You can experiment with how LLVM turns IR into optimized IR by using the `opt` tool.
Here's a basic example.
$ cat test.c
#include <stdio.h>
int main() {
    for (int i = 0; i < 3; i++) {
        printf("hello world");
    }
}
$ clang -c -S -emit-llvm test.c
$ cat test.ll
** too long, replacing with gist**
https://gist.github.com/cwgreene/b6f33d40ad735a448ed15057d91fdbdc
$ opt -S -print-before-all -O2 test.ll
** too long, replacing with gist**
https://gist.github.com/cwgreene/99b3075dbfffae35a5745934ded217fb
** final result**
@.str = private unnamed_addr constant [12 x i8] c"hello world\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
entry:
  %call = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([12 x i8], [12 x i8]* @.str, i64 0, i64 0)) #2
  %call.1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([12 x i8], [12 x i8]* @.str, i64 0, i64 0)) #2
  %call.2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([12 x i8], [12 x i8]* @.str, i64 0, i64 0)) #2
  ret i32 0
}
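To read that result in C terms: the loop counter, the compare, and the branch are all gone, and what remains is three calls back to back, which looks like full loop unrolling. Here is a rough sketch of the same before/after shape written by hand. The names `greet_loop` and `greet_unrolled` are made up for illustration, and the functions append to a buffer instead of calling printf only so the two versions can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MSG "hello world"
#define MSG_LEN (sizeof(MSG) - 1)   /* 11 characters, excluding the NUL */

/* Loop form, shaped like test.c. It appends to a buffer instead of calling
   printf so the result can be compared; buf needs 3 * MSG_LEN + 1 bytes. */
size_t greet_loop(char *buf) {
    assert(buf != NULL);
    size_t n = 0;
    for (int i = 0; i < 3; i++) {
        memcpy(buf + n, MSG, MSG_LEN);
        n += MSG_LEN;
    }
    buf[n] = '\0';
    return n;
}

/* Hand-unrolled form: no counter, no compare, no branch, just three copies
   back to back, analogous to the three printf calls in the optimized IR. */
size_t greet_unrolled(char *buf) {
    assert(buf != NULL);
    memcpy(buf, MSG, MSG_LEN);
    memcpy(buf + MSG_LEN, MSG, MSG_LEN);
    memcpy(buf + 2 * MSG_LEN, MSG, MSG_LEN);
    buf[3 * MSG_LEN] = '\0';
    return 3 * MSG_LEN;
}
```

Both functions leave the same bytes in the buffer. The optimizer can only make this rewrite because the trip count (3) is a compile-time constant.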
The command `opt -S -print-before-all -O2` can be modified to specify exactly which compiler passes you want to see. You can run `opt --help` to see a list of all compiler passes. (You can figure out which code performs each pass by looking at the `*** IR Dump Before <pass name> ***` banners in the output and finding the associated logic in LLVM's lib directory.)
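In LLVM's case, the idea is to break the code into 'basic blocks' (straight-line runs of instructions with a single entry and a single exit) and to keep it in Static Single Assignment (SSA) form, where every value is assigned exactly once. Together these make the code much easier to analyze, transform, and simplify.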
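To make 'basic blocks' a bit more concrete, here is a minimal sketch of the classic textbook way to find where blocks start (the "leaders") in a flat instruction list. This is not LLVM's code: the `Instr` and `OpKind` types and the `mark_leaders` function are hypothetical, and SSA construction is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds: only control flow matters for finding blocks. */
typedef enum { OP_OTHER, OP_JUMP, OP_BRANCH, OP_RET } OpKind;

typedef struct {
    OpKind kind;
    int target;   /* index of the jump/branch target, or -1 if none */
} Instr;

/* Marks leader[i] = true for every instruction that starts a basic block:
   the first instruction, every jump/branch target, and every instruction
   that follows a jump, branch, or return. Returns the number of blocks. */
size_t mark_leaders(const Instr *code, size_t n, bool *leader) {
    if (n == 0) {
        return 0;
    }
    assert(code != NULL && leader != NULL);

    for (size_t i = 0; i < n; i++) {
        leader[i] = false;
    }
    leader[0] = true;

    for (size_t i = 0; i < n; i++) {
        if (code[i].kind == OP_OTHER) {
            continue;   /* straight-line code never ends a block */
        }
        if (code[i].target >= 0 && (size_t)code[i].target < n) {
            leader[code[i].target] = true;   /* control can land here */
        }
        if (i + 1 < n) {
            leader[i + 1] = true;            /* fall-through starts a new block */
        }
    }

    size_t blocks = 0;
    for (size_t i = 0; i < n; i++) {
        if (leader[i]) {
            blocks++;
        }
    }
    return blocks;
}
```

If you model the loop from test.c as seven toy instructions (init, compare, branch-to-exit, call, increment, jump-back, return), this marks four leaders, giving the familiar entry / loop header / loop body / exit blocks. Each block can then be analyzed on its own, with the jumps between blocks forming the control flow graph.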
It's a bit idiosyncratic, but Steven Muchnick's "Advanced Compiler Design and Implementation" is something of a bible for optimising compiler design. Of course, other books are available.
I haven't looked around recently enough to name any off-hand, but there are some really good free textbooks and monographs on optimisation around. Search for "ssabook". There's also a really great data flow analysis book out there somewhere, which I will edit in if I find it.
https://github.com/llvm-mirror/llvm/tree/master/lib/Transfor...