Hacker News
by vortico 4159 days ago
I wonder if one could write a header file containing definitions that would make the following possible.

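  char main[] = {
  	movl(1, eax),        /* write */
  	movl(1, edi),        /* fd 1; the x86-64 syscall ABI takes arg 1 in edi, not ebx */
  	movl(message, esi),
  	movl(13, edx),
  	syscall(),
  	movl(60, eax),       /* exit */
  	xorl(edi, edi),      /* status 0 */
  	syscall(),
  };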
Obviously there are some technical difficulties like handling literal values and code sections, but it could be a fun hack, and I'd love to see what someone could come up with.
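For what it's worth, here is a minimal sketch of how such a header might start, assuming x86-64 Linux and only the immediate-to-register and register-to-register forms. The `IMM32` helper and the array name `program` are made up for illustration. It only encodes the exit sequence, because `movl(message, esi)` runs straight into the literal-value problem: an address is not an integer constant that the preprocessor can split into bytes.

```c
#include <stdint.h>

/* Register numbers as they appear in x86 instruction encodings. */
enum { eax = 0, ecx = 1, edx = 2, ebx = 3, esp = 4, ebp = 5, esi = 6, edi = 7 };

/* Hypothetical helper: spell out a 32-bit immediate as four little-endian bytes. */
#define IMM32(v) \
    (uint8_t)((uint32_t)(v) & 0xFF), \
    (uint8_t)(((uint32_t)(v) >> 8) & 0xFF), \
    (uint8_t)(((uint32_t)(v) >> 16) & 0xFF), \
    (uint8_t)(((uint32_t)(v) >> 24) & 0xFF)

/* mov $imm32, %reg : opcode B8+reg followed by the immediate. */
#define movl(imm, reg)  (uint8_t)(0xB8 + (reg)), IMM32(imm)

/* xor %src, %dst : opcode 31, ModRM byte = 11 src dst (register-direct). */
#define xorl(src, dst)  0x31, (uint8_t)(0xC0 | ((src) << 3) | (dst))

/* syscall : 0F 05. Note this macro shadows libc's syscall() if <unistd.h> is used later. */
#define syscall()       0x0F, 0x05

/* Naming this array `main` would be the actual hack, and it would only run
   if the OS maps the array's section as executable (an assumption). */
static const uint8_t program[] = {
    movl(60, eax),   /* exit */
    xorl(edi, edi),  /* status 0 */
    syscall(),
};
```

Each macro expands to a comma-separated list of bytes, so the trailing commas in the initializer fall out naturally. Memory operands, 64-bit registers and labels would each need more encoding work than this sketch attempts.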
2 comments

With enough macros you can do anything. I'm not aware of anyone doing this in C, but this is a cool example of doing approximately the same thing in a more easily extensible language: http://wall.org/~lewis/2013/10/15/asm-monad.html
Being able to compile assembly directly into a C program is a very useful tool for allowing optimizations across embedded assembly. One way around it is to use intrinsics (http://en.wikipedia.org/wiki/Intrinsic_function), which is basically what we're talking about.
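To make the intrinsics idea concrete, here is a small sketch assuming an x86-64 target, where SSE2 is always available. The function name `add4` is made up for illustration. Each intrinsic typically maps to one instruction, but the compiler still picks the registers and can schedule or fold the operations together with the surrounding C code.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Add four 32-bit integers at once (hypothetical helper). */
static void add4(const int *a, const int *b, int *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* unaligned 128-bit load */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i sum = _mm_add_epi32(va, vb);                /* four packed 32-bit adds */
    _mm_storeu_si128((__m128i *)out, sum);              /* unaligned 128-bit store */
}
```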
It's possible, but once you're not obfuscating the code - and already depending on OS details when you assume that global data can be executable, whether const or not - I think you may as well just use the inline assembler feature of your compiler. Well, if there is one... MSVC bizarrely doesn't support on x86-64 what was a perfectly good feature on x86.
It might be an easier way to produce obfuscated code by reading the preprocessed output. But I see your point---it's reinventing the wheel in a different way. That's the fun of it of course.
It's not bizarre: inline asm hurts performance and requires more effort from compiler authors than is worth it in our modern age of intrinsics.
I admit I'm not too familiar with the old MSVC inline assembly system, but the way GCC/Clang do it certainly allows emitting identical code to what you can get with intrinsics, although using naive constraints might hurt you. However, the main reason I personally use inline asm is not for performance, but to access instructions which are not provided as intrinsics or require special register handling. For example, I recently wrote some code that did syscalls directly (because it was patching memory so the normal syscall functions might not be accessible); I could have linked a separate .S file, but inline assembly made the output look much nicer. Or, while I haven't written this myself, it can be used to write out nops which zero-cost tracing facilities will then patch, something which separate assembly files cannot do.
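As a rough sketch of the direct-syscall case, assuming x86-64 Linux and GCC or Clang extended asm (the name `raw_syscall3` is made up for illustration):

```c
/* Issue a Linux x86-64 system call with up to three arguments,
   bypassing libc. Hypothetical helper, not taken from any library. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                             /* result comes back in rax */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)    /* rax, rdi, rsi, rdx */
        : "rcx", "r11", "memory"                /* syscall clobbers rcx and r11 */
    );
    return ret;  /* on failure the kernel returns a negative errno value */
}
```

For example, `raw_syscall3(1, 1, (long)"Hello, world\n", 13)` would issue write(2) on stdout. The constraints let the compiler place the arguments in the right registers itself, so no explicit mov instructions are needed in the asm string.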

As for effort from compiler authors, I'd like to hear more about why it is supposedly so hard. (MSVC is also the compiler which has to prioritize their choice of C++11-17 features to implement over the years, while the competitors have complete C++11 and 14 implementations. I guess that's somewhat ad hominem on the team, but ...)

Theoretically they lead to identical code, but I believe a block of inline ASM results in constraints on surrounding code and prevents some types of compiler optimisation.