by pitherpather 874 days ago
I don't know how clean or simple or portable you wish your build environment to be, but would it be worth embedding at compile-time?

Thinking of the general need in these situations to produce a custom-named interpreter/executable, could it be worth using the program name itself to find a paired source file? E.g., invoking ./foo2.o would make it look for ./foo2.code -- a two-file distribution that still lets you double-click the executable?
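A minimal sketch of that paired-file lookup, assuming the interpreter simply swaps whatever extension argv[0] carries for ".code". The function name paired_source_path is made up for illustration, and the whole idea only works when argv[0] really is a usable path to the executable.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: derive "<stem>.code" from the program name.
 * "./foo2.o" becomes "./foo2.code"; a name with no extension just
 * gets ".code" appended. Returns 0 on success, -1 if out is too small.
 */
int paired_source_path(const char *argv0, char *out, size_t outlen)
{
    /* Only look for a dot inside the final path component, so the
     * leading "." of "./foo2" is not mistaken for an extension. */
    const char *slash = strrchr(argv0, '/');
    const char *base = slash ? slash + 1 : argv0;
    const char *dot = strrchr(base, '.');

    /* Ignore a dot at the very start of the name (hidden files). */
    size_t stem_len = (dot && dot != base) ? (size_t)(dot - argv0)
                                           : strlen(argv0);

    int n = snprintf(out, outlen, "%.*s.code", (int)stem_len, argv0);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

From main() you would call paired_source_path(argv[0], buf, sizeof buf) and then fopen(buf, "r") to load the script.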

Could there be a non-Unicode flag at the end of a special ELF file which allows arbitrary Unicode data to be concatenated after it? I.e., an agreed loader-ignore-hereafter convention or similar? (Asking with no knowledge of ELF internals, besides hints given in the OP.)

Using two files would definitely work and, honestly, be a lot simpler. But it's a neat trick to make it a single file. For my toy language, it mostly serves to hide the fact that I'm not really compiling to native. People won't ask questions if it's a single executable that file(1) says is a statically linked binary.

Appending to the end of the ELF file does work. It won't mess anything up because your bytes will be outside of any ELF section. You can insert a known sentinel string and then search for it at runtime. The main problem is that you have to open your own executable file up for reading so you can locate the data at the end, and on Linux that requires having /proc, AFAIK. The nice thing about these other techniques is we're not assuming anything about the filesystem we're in. In a chroot environment you might not have /proc.
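Roughly what the sentinel approach could look like, as a sketch rather than anything from the OP. The marker string and both function names are assumptions. One detail worth noting: the marker literal is itself stored inside the executable, so this version searches backwards and takes the last occurrence.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical marker; any byte string unlikely to occur by accident works. */
static const char SENTINEL[] = "\n--PAYLOAD--\n";

/* Return the offset of the first byte after the LAST occurrence of
 * mark in buf, or -1 if mark does not occur at all. */
long find_payload(const unsigned char *buf, size_t len,
                  const char *mark, size_t mlen)
{
    if (mlen == 0 || len < mlen)
        return -1;
    /* Walk backwards from the last position where mark could fit. */
    for (size_t i = len - mlen + 1; i-- > 0; )
        if (memcmp(buf + i, mark, mlen) == 0)
            return (long)(i + mlen);
    return -1;
}

/* Read the file at path and return a NUL-terminated, malloc'd copy of
 * whatever follows the sentinel, or NULL on any failure. If nothing
 * was appended, the only match is the literal in the binary's own
 * data, so a real tool would need an extra check (e.g. a length field). */
unsigned char *load_payload(const char *path, size_t *out_len)
{
    unsigned char *buf = NULL, *payload = NULL;
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size > 0 && fseek(f, 0, SEEK_SET) == 0 &&
            (buf = malloc((size_t)size)) != NULL &&
            fread(buf, 1, (size_t)size, f) == (size_t)size) {
            long off = find_payload(buf, (size_t)size,
                                    SENTINEL, sizeof SENTINEL - 1);
            if (off >= 0) {
                *out_len = (size_t)(size - off);
                payload = malloc(*out_len + 1);
                if (payload) {
                    memcpy(payload, buf + off, *out_len);
                    payload[*out_len] = '\0';
                }
            }
        }
    }
    free(buf);
    fclose(f);
    return payload;
}
```

On Linux the interpreter would call load_payload("/proc/self/exe", &n), which is exactly where the /proc dependency comes in. Building the combined file is then just a matter of concatenating the interpreter, the marker, and the source.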

Given your pursuit of elegance, I imagine you could ultimately have a --clone or --cloner command-line switch which would allow any executable instance based upon your interpreter to create a new executable instance, but encapsulating newly-supplied source code. In this sense your interpreter could go viral. (In a Tcl/Tk context, freeWrap might be an example for study.)

Relatedly, I don't know whether, given argv[0] and your targets, one can at least copy the named file, even if one cannot open it directly for reading.

The first argument to your program (argv[0]) is a path to the binary that's being executed. No /proc required.

It's usually the path to the binary being executed, but you can pass anything you want when you exec, e.g. execl("/bin/ls", "definitely not /bin/ls", (char *)NULL);
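For contrast, here is a sketch of the /proc route, which asks the kernel where the running binary lives instead of trusting argv[0]. It assumes Linux with /proc mounted; the helper name self_exe_path is made up.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve the running executable via the
 * /proc/self/exe symlink. Unlike argv[0], the caller of exec cannot
 * set this to an arbitrary string, but it fails where /proc is absent
 * (for example in a bare chroot).
 * Returns 0 on success, -1 on error or if the result may be truncated.
 */
int self_exe_path(char *out, size_t outlen)
{
    if (outlen < 2)
        return -1;

    /* readlink() does not NUL-terminate and truncates silently. */
    ssize_t n = readlink("/proc/self/exe", out, outlen - 1);
    if (n < 0 || (size_t)n >= outlen - 1)
        return -1;

    out[n] = '\0';
    return 0;
}
```

A program could try this first and fall back to argv[0] when it returns -1, accepting that the fallback is only as trustworthy as whoever launched the process.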