by Falell 1656 days ago
Sounds complicated, but the goal of a fuzzable/sanitizable/etc libc sounds nice.

Lack of ABI stability sounds terrifying as an application developer. My other immediate thought was "how will this interact with systems where the OS-provided libc is the only stable way to e.g. make syscalls", and "Layering Over Another libc" addresses this. I guess the idea is you'd link an application against llvm-libc and the system libc, and ship llvm-libc with your application?

3 comments

If your software is in the distro repos, the maintainer of said package queues it to be rebuilt when the distro's llvm-libc package is updated.

If you're providing the packages yourself, it's up to you to do that yourself.

Or yes, you can vendor libc in your package. Not something everyone will like, but it depends on who your users are.

It's not like this is unusual. Binaries compiled against today's glibc can fail to run on a machine that hasn't been updated since last week because they rely on a new / different symbol. Rebuilding the distro's packages when their deps are updated is standard fare.

> Binaries compiled against today's glibc can fail to run on a machine that hasn't been updated since last week because they rely on a new / different symbol.

Note, however, that it is a Glibc bug (modulo Drepper’s temper) if the reverse happens: Glibc symbol versioning ensures that binaries depending on an old Glibc (only) will run on a new one. So the proper way to build a maximally-compatible Linux executable would be to build a cross toolchain targeting an old Glibc and compile your code with it. Unfortunately, the build system is hell and old Glibcs don’t compile without backported patches, so while I did try to follow in the footsteps of a couple of people[1–5], I did not succeed.
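To make the "build against an old Glibc" argument concrete, here is a rough sketch of the rule of thumb: a binary records the symbol versions it references, and a system can run it only if its Glibc is at least as new as the newest of them. The function names and the sample version lists are made up for illustration, and the plain version comparison is a simplification; the real check is done by the dynamic loader against the version definitions in the library.

```python
import re

def parse_glibc_version(tag):
    """Turn a version tag such as 'GLIBC_2.29' into a comparable tuple (2, 29)."""
    match = re.fullmatch(r"GLIBC_(\d+(?:\.\d+)*)", tag)
    if match is None:
        raise ValueError(f"not a GLIBC version tag: {tag!r}")
    return tuple(int(part) for part in match.group(1).split("."))

def newest_required(tags):
    """Newest GLIBC version among the tags a binary references."""
    return max(parse_glibc_version(tag) for tag in tags)

def can_run_on(tags, system_version):
    """Simplified rule (an assumption): the system Glibc must be at least as
    new as every version tag the binary references."""
    return newest_required(tags) <= system_version

# Hypothetical version references, e.g. as collected from `objdump -T` output.
built_on_old = ["GLIBC_2.2.5", "GLIBC_2.3.4", "GLIBC_2.14"]
built_on_new = ["GLIBC_2.2.5", "GLIBC_2.34"]
```

With these sample lists, `can_run_on(built_on_old, (2, 31))` is true while `can_run_on(built_on_new, (2, 31))` is false, which is the asymmetry described above: old requirements run on a new Glibc, not the other way around.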

Mass rebuilds still happen with other ecosystems, though. GHC-compiled Haskell libraries are fine-grained and not ABI-stable across compiler versions, so my Arch box regularly gets hit with a deluge of teensy Haskell library updates, and Arch is currently undergoing a massive Python rebuild (blocking all other Python package updates) behind the scenes as well.

[1]: https://github.com/wheybags/glibc_version_header (hack but easy and will probably work most of the time)

[2]: https://www.lordaro.co.uk/posts/2018-08-26-compiling-glibc.h... (someone’s mostly-nonhackish effort)

[3]: https://github.com/pypa/manylinux (what Python manylinux wheels use, more modern than absolutely necessary)

[4]: https://github.com/FooBarWidget/holy-build-box (ditto and is also a complete opaque cross-toolchain build recipe, but apparently people use that)

[5]: https://casualhacking.io/blog/2018/12/25/create-highly-porta... (missed it last time, so can’t say much)

> Note, however, that it is a Glibc bug (modulo Drepper’s temper) if the reverse happens: Glibc symbol versioning ensures that binaries depending on an old Glibc (only) will run on a new one.

But only up to a certain point.

Just the other day I wanted to run the old Ballistics game with its 2007 binary on a modern Ubuntu. All I got was

    ballistics/lib/lib1/libm.so.6: version `GLIBC_2.29' not found (required by /usr/lib/i386-linux-gnu/libasound.so.2)

This sounds like the opposite, actually (in part, and in part like an instance of my “only” caveat above: many things you can’t do with Glibc alone, and other people are much worse at versioning): it’s bundling its own old libm (part of Glibc) instead of using the system one, but at the same time is trying to link to the system libasound, which expects a new libm and predictably fails (note that only one libm can exist in a given process, though different modules can refer to different symbol versions within).

The Ballistics packaging people got it exactly backwards, in other words: Glibc is the thing you least want to bundle unless you’re bringing the entirety of the environment with you (including things like libGL and libX11). Try just removing the offending libm, maybe? Then the loader should probably fall back to the system one, given that it’s finding a system libasound, and that’s what you want.
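A toy model of that failure, for anyone who wants to see the mechanics: the loader takes the first libm it finds on the search path, and every module in the process then has to be satisfied by that one copy. Everything here (the function names, the directory contents, the sets of defined versions) is a made-up stand-in, not what the real loader or the real libraries contain.

```python
def pick_library(name, search_path):
    """Return the first (directory, defined_versions) pair that provides `name`."""
    for directory, libraries in search_path:
        if name in libraries:
            return directory, libraries[name]
    raise FileNotFoundError(name)

def check_needs(needs, search_path):
    """List the (library, version) requirements the loaded copies do not define."""
    missing = []
    for lib, version in needs:
        _directory, defined = pick_library(lib, search_path)
        if version not in defined:
            missing.append((lib, version))
    return missing

# Hypothetical contents: the game ships an old libm, the system has a newer one.
bundled = ("ballistics/lib/lib1", {"libm.so.6": {"GLIBC_2.0", "GLIBC_2.1"}})
system = ("/usr/lib/i386-linux-gnu",
          {"libm.so.6": {"GLIBC_2.0", "GLIBC_2.1", "GLIBC_2.29"}})

# The game itself needs only old symbols; the system libasound needs GLIBC_2.29.
needs = [("libm.so.6", "GLIBC_2.0"), ("libm.so.6", "GLIBC_2.29")]

with_bundled = check_needs(needs, [bundled, system])   # bundled copy wins
without_bundled = check_needs(needs, [system])         # bundled copy removed
```

In this model `with_bundled` reports the missing `GLIBC_2.29` requirement and `without_bundled` is empty, which matches the suggestion to delete the bundled libm and let the loader fall back to the system one.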

> it’s bundling its own old libm (part of Glibc) instead of using the system one, but at the same time is trying to link to the system libasound, which expects a new libm and predictably fails (note that only one libm can exist in a given process, though different modules can refer to different symbol versions within).

It may have been newer than the system-provided one when it first shipped. Sadly, you can't tell the dynamic linker to just load the newest version of a library. It loads the first one it finds, and that breaks once the system-provided version is newer.

I agree that ballistics/lib/lib1/libm.so.6 should probably be removed, praying that devs didn't alter it.

Shared libraries are nice for forward compatibility: see libsdl1.2-compat, libaoss, etc.

Thank you, that indeed made it start!

Bundling your own glibc components will break eventually anyway, even without other system libraries: glibc does not guarantee compatibility between components across different versions, and (barring containers) you can't bundle all of glibc because the dynamic loader needs to be at a fixed absolute path (/lib64/ld-linux-x86-64.so.2 for glibc-based amd64 Linux distros).
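The fixed loader path is recorded in the executable itself, in the PT_INTERP program header. As a minimal sketch, assuming a 64-bit little-endian ELF image, the path can be read out like this; `elf_interp` is a hypothetical helper name, and the code does no bounds checking beyond what `struct` enforces.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader

def elf_interp(data):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF image")
    # ELF64 header: e_ident, e_type, e_machine, e_version, e_entry, e_phoff,
    # e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
    # e_shnum, e_shstrndx
    header = struct.unpack_from("<16sHHIQQQIHHHHHH", data, 0)
    phoff, phentsize, phnum = header[5], header[9], header[10]
    for index in range(phnum):
        # ELF64 program header: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        # p_filesz, p_memsz, p_align
        entry = struct.unpack_from("<IIQQQQQQ", data, phoff + index * phentsize)
        p_type, p_offset, p_filesz = entry[0], entry[2], entry[5]
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # statically linked, or not an executable with a loader
```

On a glibc-based amd64 system, passing the bytes of a dynamically linked executable (for example `open("/bin/ls", "rb").read()`) should return the loader path mentioned above, but that depends on the machine it is run on.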
I'm not seeing from the error message how that qualifies as a glibc symbol compat issue as such. On the face of it, it looks like the application is vendoring its own libm rather than using the system glibc's libm, and then it tries to load other system libraries which expect to load the newer system libm and instead find an older one.

If my interpretation is correct, then if it's going to load system libraries, those may require system glibc, and if it's going to use system glibc, it should use all system glibc rather than trying to mix and match.

>Note, however, that it is a Glibc bug (modulo Drepper’s temper) if the reverse happens

Right right. I gave that example not as something that people would expect to work, just as something that indicates that users and distros are used to the idea of binaries and libc being revved in sync.

Glibc symbol versioning has only prevented programs from running. I disable it on all of my systems.

Cosmopolitan Libc is sanitizable. I believe it's currently the only one where you can use Address Sanitizer at all layers of the software stack.

Most standard C library features don't contain implementation choices that have different ABIs. The types of the function arguments determine everything, so the only way to have instability is to tinker with the compiler or its options that influence the ABI at a low level.

They must be thinking of some very specific functions.

In <stdio.h>, functions that are implemented as macros can peek at the FILE * structure, so if that's not maintained in a backward-compatible way, that would be a problem. (In that case, if you #undef the macros to reveal the real functions, you're almost certainly OK. C programs do not declare or initialize FILE objects.)

struct tm could also cause issues if hidden fields are added to it that existing binary clients don't define.

Various things in POSIX can have a problem also; it has a lot of structures, the storage for many of which is defined by client programs, and in some cases so is the initialization.
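A small illustration of the client-allocated structure problem, using ctypes to compute layouts. The two structures are hypothetical stand-ins (not the real struct tm), and `hidden_state` is an invented field: the point is only that a client compiled against the old layout reserves less storage than a newer library would expect to write.

```python
import ctypes

# Public fields shared by both layouts (a cut-down, hypothetical struct tm).
PUBLIC_FIELDS = [
    ("tm_sec", ctypes.c_int32),
    ("tm_min", ctypes.c_int32),
    ("tm_hour", ctypes.c_int32),
]

class TmV1(ctypes.Structure):
    """Layout that old client binaries were compiled against."""
    _fields_ = PUBLIC_FIELDS

class TmV2(ctypes.Structure):
    """Same public fields plus a hidden field added by a later library release."""
    _fields_ = PUBLIC_FIELDS + [("hidden_state", ctypes.c_int64)]

def bytes_written_past_end(client_struct, library_struct):
    """Bytes the library would touch beyond the storage the client allocated."""
    return max(0, ctypes.sizeof(library_struct) - ctypes.sizeof(client_struct))

overflow = bytes_written_past_end(TmV1, TmV2)
```

The public fields keep their offsets, so source code still compiles unchanged, yet `overflow` is nonzero: an old binary that puts a `TmV1` on its stack and hands it to a library expecting `TmV2` gets memory next to it overwritten.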

Standard C has an ABI problem due to intmax_t. It is stuck at 64 bits in most implementations even though many offer an int128_t. There are standard C functions such as imaxabs, imaxdiv, strtoimax that are defined as taking or returning intmax_t, so changing intmax_t to 128 bits would break existing binaries. C functions are also purely defined by their name so you wouldn’t even get a dynamic linking error, you would just be calling the functions wrong.
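To show what "calling the functions wrong" would look like, here is a toy model in which the caller passes a 128-bit argument and an old build of imaxabs still reads only 64 bits. Both functions are made-up simulations over byte strings; a real ABI passes the value in registers or on the stack, so treat this as a picture of the truncation only, assuming the callee sees the low half.

```python
def pass_as_128_bit(value):
    """Caller side: encode a signed 128-bit argument as little-endian bytes."""
    return value.to_bytes(16, "little", signed=True)

def old_imaxabs(argument_bytes):
    """Callee side: an old build that still believes intmax_t is 64 bits wide,
    so it looks at the low 8 bytes only."""
    seen = int.from_bytes(argument_bytes[:8], "little", signed=True)
    return abs(seen)

small = old_imaxabs(pass_as_128_bit(-42))       # value fits in 64 bits
large = old_imaxabs(pass_as_128_bit(-(2**70)))  # value needs more than 64 bits
```

Here `small` comes out as 42, but `large` comes out as 0 because the low 64 bits of the argument are all zero. Nothing fails at link time in this scenario, since the symbol name is the same either way.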
That’s what happens when you change your hardware architecture but don’t want to change your ABI; you get crazy stuff like that.

This is true only if you ignore practical considerations like symbol versioning and migrating (e.g. the latest pthread glibc changes that aren’t backwards compatible).