by aidanhs 1673 days ago
This is missing a lot of detail around libc (can't comment on the others), but was published ~1.5 years after musl was first released - I can imagine the subtleties being missed in a glibc monoculture.

To expand a bit:

1. dlopen has interactions with static linking that you should understand if you use both at the same time. Different libcs expose different symbols for libc functions, so building your dynamic library against one libc may make it incompatible with another. And two libcs in the same program (one from the dylib, one from the binary) is a recipe for a very bad time - imagine freeing a pointer allocated with a different malloc implementation, or a different layout of libc structs

2. the complexity with glibc is it invokes dlopen as part of NSS, a feature that gets used as part of looking up users, among other things (it does this to allow integration with LDAP etc). You can actually see warnings at link time if you statically link glibc but pull in symbols that use NSS. You can disable NSS if you like at build time of glibc itself [0] (i.e. not your binary)

3. musl doesn't have the complexities of NSS and can be statically linked happily by default (I suspect that's probably what musl is most used for)

[0] https://sourceware.org/glibc/wiki/FAQ#Even_statically_linked...
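To make point 2 a bit more concrete, here is a minimal sketch of a user lookup that goes through NSS on glibc. `lookup_uid` is a made-up helper name; `getpwnam` is the standard POSIX call. If you link something like this with `-static` against glibc, you should typically see a linker warning saying that `getpwnam` still needs the shared libraries from the glibc version used for linking at runtime.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: resolve a user name to a uid.
 * Returns 0 and stores the uid in *out on success,
 * -1 if no matching entry was found (*out is left untouched). */
int lookup_uid(const char *name, uid_t *out)
{
    /* On glibc this call is routed through NSS, which may load backend
     * libraries at runtime depending on the system's nsswitch.conf. */
    struct passwd *pw = getpwnam(name);
    if (pw == NULL)
        return -1;
    *out = pw->pw_uid;
    return 0;
}
```

With musl the same function links statically without that warning, since musl does not implement glibc-style NSS modules.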

3 comments

> And two libcs in the same program (one from the dylib, one from the binary) is a recipe for a very bad time - imagine freeing a pointer allocated with a different malloc implementation, or a different layout of libc structs

Depends on the exact architecture of the program, y'know. I remember a Windows app written in Delphi (so no libc whatsoever) that supported loading plugins written in whatever language; I've seen it simultaneously load plugins that linked against msvcrt90.dll and msvcrt100.dll with no problem. But that worked because plugins were required to export a FreeObject function that was supposed to be used to free whatever structures a plugin returned from its other functions. The main program tracked which piece of data came from where and called the proper deallocation functions.

But that requires acknowledgement of a fact that there is no "single, global C runtime".
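I don't know how that app was actually implemented, but the host-side bookkeeping could look roughly like the C sketch below. All names (`plugin`, `tracked_object`, `acquire`, `release`, and the two stand-in plugins) are invented for illustration; in the real scenario each plugin would live in its own DLL with its own C runtime, and the counters here only stand in for the two separate heaps.

```c
#include <stddef.h>
#include <stdlib.h>

/* Each plugin exports a pair of functions; the host never calls free()
 * on plugin memory itself. */
struct plugin {
    void *(*make_object)(void);
    void  (*free_object)(void *obj);
};

/* The host remembers which plugin produced each object. */
struct tracked_object {
    void                *ptr;
    const struct plugin *owner;
};

struct tracked_object acquire(const struct plugin *p)
{
    struct tracked_object t = { p->make_object(), p };
    return t;
}

void release(struct tracked_object *t)
{
    if (t->ptr != NULL) {
        t->owner->free_object(t->ptr);  /* routed back to the producer */
        t->ptr = NULL;                  /* makes a second release a no-op */
    }
}

/* Two stand-in "plugins"; live_a / live_b count outstanding objects. */
int live_a, live_b;
static void *make_a(void)    { live_a++; return malloc(16); }
static void  free_a(void *p) { live_a--; free(p); }
static void *make_b(void)    { live_b++; return malloc(32); }
static void  free_b(void *p) { live_b--; free(p); }

const struct plugin plugin_a = { make_a, free_a };
const struct plugin plugin_b = { make_b, free_b };
```

The point is only that `release` never guesses: it calls whatever deallocator came bundled with the object.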

> But that requires acknowledgement of a fact that there is no "single, global C runtime".

On Windows, yes, giving rise to situations where you have a dynamic library and the program using it linked against two different C runtimes. On a Unix-like OS you usually have a single, global libc. You would have to deliberately link it statically into a shared library, or into a program that uses a shared library, to create a similar situation.

For the standard C use case, if the library malloc()s something and passes the resulting pointer to the program, the program itself has no way to free it. The two have their own, independent copies of the allocation code that don't share the allocation arena. If the program calls free(), the best case scenario is that the libc of the program realizes it doesn't recognize the pointer, yells "double free" and abort()s.

That's a bit of a portability pit-fall and definitely something one has to consider when designing a library ABI. Pointers returned from the library have to be passed back into the library to free the underlying memory. A solution like the one used by Delphi is required. (For polymorphic objects being passed around, you would typically have something like a destroy() hook anyway).
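As a rough illustration of that rule, a library can pair every allocating function with its own destroy function. This is only a sketch; `mylib_buffer` and the `mylib_buffer_*` functions are hypothetical names, not an existing API.

```c
#include <stdlib.h>
#include <string.h>

/* In a real library this layout would stay private to the implementation;
 * callers would only ever hold pointers to it. */
struct mylib_buffer {
    char  *data;
    size_t len;
};

/* Allocates with the malloc the library itself was linked against. */
struct mylib_buffer *mylib_buffer_create(const char *text)
{
    struct mylib_buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->len = strlen(text);
    b->data = malloc(b->len + 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    memcpy(b->data, text, b->len + 1);  /* copy including the NUL */
    return b;
}

const char *mylib_buffer_text(const struct mylib_buffer *b)
{
    return b->data;
}

/* Frees with the library's own free, so it always matches the malloc
 * above, whichever libc the caller uses. Accepts NULL, like free(). */
void mylib_buffer_destroy(struct mylib_buffer *b)
{
    if (b == NULL)
        return;
    free(b->data);
    free(b);
}
```

The caller never passes the pointer to its own free(); it hands it back to `mylib_buffer_destroy`.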

I'm a bit hesitant to make a guess on what happens in the case of a C++ smart pointer. Technically they can have an allocator, but does that cross the ABI boundary? Is there a template specialization for optimizing it away if you are using the default delete operator? I guess the compiler of the program could also end up instantiating the program's allocator when inserting the template parameters?

As info for others, Windows, some UNIX clones like AIX, and I guess mainframes/micros, do use namespaces for the dynamic libraries.

So funcA() from a.dll and funcB() from b.dll are actually resolved as a!funcA and b!funcB, which is why linking against msvcrt90.dll and msvcrt100.dll works without major issues.

macOS does this too.

Yes. In TruffleRuby we have to ensure that we free memory returned by FFI calls using the correct free implementation, because that’s not necessarily the one being used internally by our runtime.

It’s annoying but it’s something you have to handle.

As for supporting multiple DLLs using different run time libs, I’d guess that only works because they are slightly more complex than simple shared libs and can sort out thread local storage and the like during thread attach. Without that I can imagine things going badly wrong.

> imagine freeing a pointer allocated with a different malloc implementation, or a different layout of libc structs

Should libraries be allocating memory to begin with? They should simply provide the data structures. The main program should decide whether to allocate on the stack, statically or dynamically.
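For what it's worth, a caller-provides-the-buffer API could look like the sketch below. `greet_into` is a made-up function; it follows the snprintf convention of returning the length the full output needs, so the caller can size the buffer however it likes.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical library function: writes "Hello, <name>!" into a buffer
 * the caller owns. Returns the number of characters the full string
 * needs (excluding the terminating NUL), or -1 on error. A return value
 * >= buflen means the output was truncated. */
int greet_into(char *buf, size_t buflen, const char *name)
{
    int n = snprintf(buf, buflen, "Hello, %s!", name);
    return n < 0 ? -1 : n;
}
```

The caller can pass a stack array, a static buffer, or something it malloc()ed itself. Calling `greet_into(NULL, 0, name)` first returns the required length without writing anything, which covers the case where the size isn't known up front.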

Wait until you learn about (dynamically-loaded) libraries that start up worker threads and stuff without any notification. You do your stuff with them, then call FreeLibrary and suddenly some random thread crashes because its text is no longer mapped and brings your whole program down. Fun stuff!

Jesus. I wonder how much time it took to debug that...

Libraries shouldn't do anything unless called explicitly. Even when called, they should cede as much control as possible. They should probably be exposing the concurrent functions so the caller can decide how to thread them... Every time a library decides to make things easy for the user it leads to insane problems like these.

Nah, it's a rather basic problem and the solution is well known: do another LoadLibrary on the already loaded library. Of course, then you need a FreeLibraryAndExitThread... [0] And the posts following [0] (you can see them at the bottom of the page in the "Read Next" section) explain more of the context.

All this stuff is there because of the built-in threading support in COM which, arguably, was pretty convoluted for some rather strange reasons ― what's the point of such precise RAII-ing of the DLLs? I guess the address space was really precious back then or something.

[0] https://devblogs.microsoft.com/oldnewthing/20131105-00/?p=27...

Mix in some signals and you are on the fast track to debugging purgatory.

I guess strdup() kind of answers that.

> 1. dlopen has interactions with static linking that you should understand if you use both at the same time.

Using dlopen when static linking is used is deprecated in glibc.