by mforney 2185 days ago
You bring up some good points here. Here are some of my experiences with these problems when working on oasis (my static Linux distro).

> 1. symbol collisions -> accidental interposition (and crashes);

I've encountered symbol collisions only twice, but both resulted in linker errors due to multiple function definitions. I'm not sure how this could happen accidentally. Maybe you are referring to variables in the common section getting merged into a single symbol? Recent gcc enables -fno-common by default, so those will be caught by the linker as well.

> 2. you have to flatten the dependency tree into a topological sort at the final link-edit.

Yes, this is pretty annoying. pkg-config can solve this to some degree with its --static option, but that only works if your libraries supply a .pc file (this is often the case, though).

I think libtool also can handle transitive dependencies of static libraries, but it tries hard to intercept the -static option before it reaches the compiler so it links everything but libc statically. You can trick it by passing `-static --static`.

For oasis, I use a separate approach to linking involving RSP files (i.e. linking with @libfoo.rsp), which really are just lists of other libraries they depend on.
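To make the flattening step concrete, here is a minimal sketch in C of turning per-library dependency lists into a single link order. The library names (`app`, `foo`, `bar`, `base`) and the `link_order` helper are hypothetical and only for illustration; this is not how oasis, pkg-config, or libtool actually implement it.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPS 4

/* One library and its direct dependencies (all names are hypothetical). */
struct lib {
    const char *name;
    const char *deps[MAX_DEPS]; /* NULL-terminated list */
};

static const struct lib libs[] = {
    {"app",  {"foo", "bar", NULL}},
    {"foo",  {"base", NULL}},
    {"bar",  {"base", NULL}},
    {"base", {NULL}},
};

#define NLIBS (sizeof libs / sizeof libs[0])

/* Return the index of the named library, or -1 if it is unknown. */
static int find_lib(const char *name) {
    for (size_t i = 0; i < NLIBS; i++)
        if (strcmp(libs[i].name, name) == 0)
            return (int)i;
    return -1;
}

/* Depth-first post-order: a library is recorded after all of its dependencies. */
static void visit(int i, int *seen, int *post, size_t *n) {
    if (seen[i])
        return;
    seen[i] = 1;
    for (size_t d = 0; d < MAX_DEPS && libs[i].deps[d]; d++) {
        int j = find_lib(libs[i].deps[d]);
        if (j >= 0)
            visit(j, seen, post, n);
    }
    post[(*n)++] = i;
}

/* Fill `out` with names in link order (each library before the libraries it
   depends on) and return how many names were written. */
size_t link_order(const char *root, const char *out[], size_t cap) {
    int seen[NLIBS] = {0};
    int post[NLIBS];
    size_t n = 0;
    int r = find_lib(root);
    if (r < 0)
        return 0;
    visit(r, seen, post, &n);
    /* Reverse the post-order so dependents come first on the link line. */
    size_t count = n < cap ? n : cap;
    for (size_t k = 0; k < count; k++)
        out[k] = libs[post[n - 1 - k]].name;
    return count;
}
```

Calling `link_order("app", order, 8)` on this graph yields `app bar foo base`, which is the order a traditional left-to-right static linker wants. The sketch does not detect dependency cycles, and the flat list no longer records which library asked for `base`.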

> Besides fixing these issues, the C dynamic linking universe also enables things like:
> - run-time code injection via LD_PRELOAD and intended interposition

Yes, this can be a problem. I wanted to do this recently to test out the new malloc being developed for musl libc, but ended up having to manually integrate it into the musl sources instead of just using LD_PRELOAD.

> - run-time code loading/injection via dlopen(3)

In particular, this is a big problem for scripting languages that want to use modules written in compiled languages, as well as OpenGL which uses dlopen to load a vendor-specific driver.
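For reference, this is roughly the run-time lookup that such module loaders depend on. It is a minimal sketch, assuming a dynamically linked executable on a system where `dlopen` and `dlsym` are available; the helper names `lookup_symbol` and `parse_int_dynamically` are hypothetical, and `atoi` is just a convenient symbol to look up.

```c
#include <dlfcn.h>
#include <stdio.h>

typedef int (*atoi_fn)(const char *);

/* Look up `symbol` in `library` at run time. A NULL `library` means the
   running program together with the libraries already loaded into it. */
void *lookup_symbol(const char *library, const char *symbol) {
    void *handle = dlopen(library, RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }
    void *addr = dlsym(handle, symbol);
    if (!addr)
        fprintf(stderr, "dlsym: %s\n", dlerror());
    /* The handle is deliberately left open: closing it could unmap the
       code that `addr` points to. */
    return addr;
}

/* Call atoi through a pointer obtained at run time; returns -1 if the
   lookup fails. */
int parse_int_dynamically(const char *text) {
    atoi_fn fn;
    /* POSIX idiom for storing a dlsym result in a function pointer. */
    *(void **)(&fn) = lookup_symbol(NULL, "atoi");
    return fn ? fn(text) : -1;
}
```

A scripting-language runtime would pass the path of an extension module instead of NULL. Older glibc versions need `-ldl` at link time, and in a fully static executable this kind of lookup is typically unavailable or severely limited.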

> Dynamically-linked programs will load faster when their dependencies are already loaded in memory, and slower otherwise. The biggest win here is the C library.

But doesn't the dynamic linker still have to do extra work to resolve the relocations in the executable, even when the dependency libraries are already loaded?

2 comments

> > 1. symbol collisions -> accidental interposition (and crashes);

> I've encountered symbol collisions only twice, but both resulted in linker errors due to multiple function definitions. I'm not sure how this could happen accidentally. Maybe you are referring to variables in the common section getting merged into a single symbol? Recent gcc enables -fno-common by default, so those will be caught by the linker as well.

No, this comes up all the time. Try building an all-in-one busybox-style program, and you'll quickly run into conflicts.

If static link archives had all the metadata that ELF files have, then the link-editor could resolve conflicts correctly. That is the correct fix, but no one is putting effort into it. The static linkers haven't changed much since symbol length limits were raised from 14 bytes!

> > 2. you have to flatten the dependency tree into a topological sort at the final link-edit.

> Yes, this is pretty annoying. pkg-config can solve this to some degree with its --static option, but that only works if your libraries supply a .pc file (this is often the case, though).

pkg-config alleviates the problem, but it's not enough. Among other things, creating a build system that supports both static and dynamic linking is a real pain. But more importantly, this flattening of dependency trees loses information and makes it difficult for link-editors to resolve symbol conflicts correctly (see above).

> > Dynamically-linked programs will load faster when their dependencies are already loaded in memory, and slower otherwise. The biggest win here is the C library.

> But doesn't the dynamic linker still have to do extra work to resolve the relocations in the executable, even when the dependency libraries are already loaded?

It's still faster than I/O. (Or at least it was back in the days of hard drives. But I think it's still true even in the days of SSDs.)

> But doesn't the dynamic linker still have to do extra work to resolve the relocations in the executable, even when the dependency libraries are already loaded?

For dynamic linking there's usually only a single reference that needs to be fixed up in the PLT or GOT for each referenced symbol and the fix-ups are all localized, so that part is typically a very small cost. But this means every call to a dynamically linked library goes through an extra stub function, which adds I$ and BTB pressure compared to static linking.

(If you do things manually with dlsym there's also an extra indirection but it's a little different mechanically since you do an indirect call of a function pointer rather than doing a direct call to a forwarding stub. The advantage of the forwarding stub is that even if there's a BTB miss for either of the two direct calls, you don't suffer a branch mispredict but just a few front-end cycles waiting for the instruction decode, which VTune labels as a "branch resteer". The direct call option also leads to smaller code size for each call site which helps with I$ pressure. Basically it's the difference between CALL rel32 and MOV tmp, [RIP + disp32]; CALL tmp.)
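At the C level, the two call shapes contrasted above look roughly like this. It is only a sketch: `add_one` is a hypothetical stand-in for a library function, and the instructions the compiler actually emits depend on the compiler, flags, and whether the callee really lives in a shared library.

```c
/* Hypothetical stand-in for a function provided by some library. */
static int add_one(int x) { return x + 1; }

/* Direct call: the call site names the function, so the compiler can emit
   a relative call (to the function itself, or to its PLT stub when the
   function is in a shared library). An optimizer may also inline it here. */
int call_direct(int x) { return add_one(x); }

/* Indirect call: the target address is loaded from memory first, as with a
   pointer returned by dlsym. `volatile` forces the load on every call. */
static int (*volatile add_one_ptr)(int) = add_one;

int call_indirect(int x) { return add_one_ptr(x); }
```

Both functions return the same result; the difference is only in how the call target is reached, which is what affects code size and branch prediction.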