by mdcox
3929 days ago
I'm really curious what exactly you mean by this. Does the same code through the same compiler not reliably produce the same binary? I know very little about actual compiler mechanics, but non-deterministic compilation seems really strange to me.
If you compile a "hello world" type C program, you should get the same binary when you compile it again, provided your toolchain (C library, C compiler, linker, etc.) is the same.
However, certain C macros like __DATE__ make the binary change (in this case based on the date of the compile). Additionally, details of the build environment such as your working directory and your username sometimes end up in the binary.
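To make the __DATE__ point concrete, here is a toy sketch in Python rather than C. Note that `toy_build` is a hypothetical stand-in I made up for illustration, not a real compiler: it only mimics the effect of a date string being baked into the output.

```python
import hashlib
from typing import Optional


def toy_build(source: str, build_date: Optional[str] = None) -> bytes:
    """Hypothetical stand-in for a compiler: returns the 'binary' as bytes.

    If build_date is given, it is embedded in the output, much as __DATE__
    expands to a string literal inside a C program.
    """
    binary = b"BIN:" + source.encode("utf-8")
    if build_date is not None:
        binary += b"|built:" + build_date.encode("utf-8")
    return binary


def digest(binary: bytes) -> str:
    """SHA-256 hex digest of a build output."""
    return hashlib.sha256(binary).hexdigest()


source = 'int main(void) { puts("hello world"); return 0; }'

# Same source, nothing environment-dependent embedded: identical output.
same_a = digest(toy_build(source))
same_b = digest(toy_build(source))

# Same source, but the build date is embedded: output differs per day.
day_one = digest(toy_build(source, "Jan  1 2015"))
day_two = digest(toy_build(source, "Jan  2 2015"))

print(same_a == same_b)    # True
print(day_one == day_two)  # False
```

The same thing happens with a working directory or username: anything that varies between machines and ends up in the output changes the hash.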
Why is this bad?
If the build server for Debian gets hacked or if a developer's machine gets hacked (for some projects), the hackers can modify the binaries. If the program is not reproducible then there is no way to tell that something has gone amiss. If the program can be built reproducibly, someone else can build the code, produce the same binary, and validate it.
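As a rough sketch of what "validate it" means here: an independent party rebuilds from source and compares hashes with the published binary. The function names and the byte strings below are made up for illustration; a real check would hash actual build artifacts.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """SHA-256 hex digest of a binary's contents."""
    return hashlib.sha256(data).hexdigest()


def verify_rebuild(published: bytes, rebuilt: bytes) -> bool:
    """True if an independent rebuild matches the published binary bit for bit."""
    return sha256_of(published) == sha256_of(rebuilt)


# Placeholder contents standing in for real binaries (assumption for the demo).
honest = b"\x7fELF program code"
tampered = honest + b" plus injected backdoor"

print(verify_rebuild(honest, honest))    # True: the rebuild matches
print(verify_rebuild(tampered, honest))  # False: the published binary was modified
```

This only works if the build is reproducible. If honest builds already differ from each other, a mismatch tells you nothing.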
This is even scarier in the case of a "Ken Thompson" style hack, where the C compiler binary is modified so that it compiles normally but inserts backdoors in certain libraries, and also re-inserts its modifications whenever it is building another C compiler.
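Very roughly, the logic of such a compromised compiler looks like the toy below. Everything here is invented for illustration (the function names, the marker strings, and the idea of "compiling" by wrapping text); a real attack would pattern-match on actual source and emit machine code.

```python
BACKDOOR = "/* accept magic password */"
SELF_INSERT = "/* re-insert trojan logic */"


def honest_compile(source: str) -> str:
    """Toy 'compiler': wraps the source so we can inspect what was compiled."""
    return "BIN[" + source + "]"


def trojaned_compile(source: str) -> str:
    """Toy compromised compiler: behaves normally except on two targets."""
    if "check_password" in source:
        # Target 1: code that checks passwords gets a backdoor.
        source += BACKDOOR
    if "compile(" in source:
        # Target 2: when building a compiler, the trojan copies itself in,
        # so the compiler's own source stays clean.
        source += SELF_INSERT
    return honest_compile(source)


hello_src = 'int main(void) { puts("hello"); }'
login_src = "int check_password(const char *pw);"
compiler_src = "int compile(const char *src);"

print(trojaned_compile(hello_src) == honest_compile(hello_src))  # True
print(BACKDOOR in trojaned_compile(login_src))                   # True
print(SELF_INSERT in trojaned_compile(compiler_src))             # True
```

Ordinary programs come out identical, which is why reading the source of either the login program or the compiler would not reveal anything.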
If a "Ken Thompson" style hack is ever pulled off on a Linux distro, there would be no real way to tell without analysing the binaries.
Provided your initial C compiler is good, having a chain of reproducible builds where each build produces the same binary would protect against this. Currently we are just producing effectively random binaries that nobody can independently check, and relying on trust, which is horrible.