by asveikau 1106 days ago
I think you totally missed what the issue is here. In that scenario, Apple could still patch a system lib and it can break your application.

It's not a question of library updates being untrustworthy, which code signing by the vendors would fix. It's the library updates themselves breaking shit, not intentionally.

Static linking prevents that, at the cost of disk space and memory and missing out on updates that might not (usually won't) break your app.

Otoh, if you told me apple is more careful about breaking ABIs with updates to shared libraries, that is believable.

It's not a matter of "could". Apple DID this in the past.

In the mid-2000s, GCC switched C++ ABIs. Apple pushed out an update to Mac OS X 10.3 which replaced the static libstdc++ using the old ABI with a dynamic libstdc++ using the new ABI. This broke C++ compilation under Mac OS X 10.3. They did not bother to update the compiler to use the new ABI; they had decided that if developers still wanted to compile for 10.3, their supported path -- compiling using the latest Xcode (available for 10.4 and up only) in its 10.3 backward compatibility mode -- was good enough. People still using 10.3 who wanted to build e.g. GNU software were just too niche a use case to warrant any kind of support.

Lesson learned: if you intend to build software on Apple systems, always have the latest shiny, or you risk being left in the lurch.

Not even the last time they did something like that. Just in the past six months, I noticed I was no longer able to link against certain C++ libraries being installed via Homebrew, because some change in the ObjC runtime meant that whatever I was running was unable to link libraries built with newer compiler versions (various ObjC symbols would be missing) - presumably the binaries installed by Homebrew had been built on a newer system - and the current Xcode tools wouldn't install on Monterey, you had to upgrade to Ventura. Meh.

g++ ABIs changing in the 2000s was pretty painful on linux too.

But I think there is a fair point that arch linux probably doesn't care much about what breaks on its rolling updates to shared libs. I would guess you'd get less of that on, say, Debian.

>arch linux probably doesn't care much about what breaks on its rolling updates to shared libs.

This is a misconception. I'm not saying it never happens, archlinux is run by volunteers, resources are limited, but contrary to popular belief it isn't always the place that gets updates the fastest and when things break it's because upstream really screwed up and let something bad in that went unnoticed. If anything is noticed at all, arch will play the conservative card.

Bash is still on 5.1.x on archlinux because Bash 5.2 (that was released, by the way, almost a year ago!) introduced compatibility issues with older scripts that abused some bashisms (this makes a very solid case for only writing shell scripts in pure POSIX. Use shellcheck and checkbashisms.) Meanwhile almost every other distro has jumped on 5.2 by now.

Arch will upgrade to the latest and greatest when changes are minor. We get the official upstream point releases that fix bugs and security issues fast. It does not mean Arch doesn't pay attention when it matters.

Most often when something breaks on archlinux it is because people aren't following the instructions from the arch-announce mailing list when there is a major transition, because arch is not a distro that automates processes. When grub broke for some users recently, it was because they didn't run grub-mkconfig, which should always be done when upgrading grub because grub has a very funky configuration file format (the config file you write is like a "source" for it to generate another, different config file that gives you NO guarantee about its format staying stable). Note that upgrading the grub package does not, in any case, ever upgrade grub itself, because archlinux is a distro that does not automate things for you. So people who had a broken grub did this to themselves: they ran grub-install but didn't run grub-mkconfig. Anyway, I would recommend systemd-boot to any user on modern EFI systems because it was written by much saner people.

Arch Linux's core philosophy isn't really about having the absolute latest packages. There were times when a Fedora came out with the latest Gnome before the latest version reached the stable repositories in Arch.

Arch's philosophy is to give you a Keep It Simple, Stupid system. There is no automation beyond what the packaged software provides. There is no splitting packages into many tinier packages like other distros that are very annoying with their -doc, -dev and so on packages (disk space is so cheap, why would you NOT want the documentation to be present??). You don't have to hunt for funky names of dev packages when you need to compile software: if you already have the dependencies installed, they also come with what you need to build against them. The arch packaging format is the simplest of all distros, with the exception of slackware which does not have dependency management. What few system tools exist to manage the distro are all written in shell scripts (mkinitcpio, arch-chroot, pacstrap, pacdiff, mkarchiso..) except for the package manager, pacman, being in C. You could say that being a rolling release is a side effect and not the main purpose. Since there is no automation and it is a simple system on an architectural level (not as in "user friendly"), adapting to gradual changes is much less painful than doing major releases every once in a while like debian stable, because with the KISS philosophy a major release would leave the user with a serious amount of work, handling that many transitions at once on their own.

Debian's care about whether things break or not, that you mention, depends on which edition of debian we're talking about. Debian stable cares a lot about breakages and upgrading from one stable release to another is always a smooth process. They go to painstaking lengths to make it work.

Debian unstable on the other hand.. I've had times when they introduced new library versions that broke many other packages and I found myself unable to install software I wanted to try because the repositories were broken in that way. Debian unstable is not the "rolling" alternative some people think it is. If you want a rolling distro, use a rolling distro. Ever since I tried arch many years ago, I left unstable and never looked back. A library that breaks many packages that are part of the arch repositories will not leave testing. They can however break things that are from the AUR, there is no official support for the AUR pkgbuilds.

The GP specifically talked about an inadvertent dylib hijacking, which is prevented by the mechanisms I described. You are talking about a platform ABI break, which while unfortunate does occasionally happen due to significant technical issues (or sometimes by accident).

Apple does spend a significant amount of effort to avoid breaking supported ABIs. There have definitely been issues though, especially early in Mac OS X while learning how to deal with upstream open source projects that don't care about ABI. In this specific case the result was Apple funding the development of libc++ and factoring it into libc++ and libc++abi specifically to prevent this sort of breakage in the future. Another example would be about a decade ago when Apple removed the ssl headers from the SDKs and told developers to either use SecureTransport or include their own SSL libraries, since depending on openssl's ABI was not feasible.

> The GP specifically talked about an inadvertent dylib hijacking

You're wrong.

> because sometimes package managers update dynamic library dependencies without actually checking if binary compatibility was preserved in the new version,

That is a very clear description of an ABI break.
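
For anyone following along, here is a minimal sketch of what such a break can look like at the byte level. It only simulates the situation with Python's `struct` module; the rectangle struct, its fields, and both function names are hypothetical and not taken from any real library. The "library" changes its struct layout in a new version, while the "client" still decodes with the layout it was compiled against.

```python
import struct

# Hypothetical v1 layout the client was built against:
#   struct rect { int32_t width; int32_t height; };
RECT_V1 = struct.Struct("<ii")

# Hypothetical v2 layout shipped by a library update, with a new leading field:
#   struct rect { int32_t flags; int32_t width; int32_t height; };
RECT_V2 = struct.Struct("<iii")


def library_v2_make_rect(width: int, height: int) -> bytes:
    """What the updated library hands back: bytes in the v2 layout."""
    flags = 1
    return RECT_V2.pack(flags, width, height)


def client_v1_read_rect(buf: bytes) -> tuple:
    """What the old client does: read the first 8 bytes as (width, height)."""
    return RECT_V1.unpack_from(buf, 0)


if __name__ == "__main__":
    # No crash and no error, just wrong values: flags is read as width,
    # and width is read as height.
    print(client_v1_read_rect(library_v2_make_rect(640, 480)))
```

Nothing here is malicious or unsigned. Both sides are legitimate code from their vendors, and the client still gets garbage, which is why code signing does not help with this class of problem.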

They explained an issue they had on Linux, which was that they installed a package with a library that broke ABI with a client. I explained how it is rarely an issue on macOS because the risks of it are limited by how the system is constructed (immutable base system, so no mix and match package issues) and how linking policy is configured (by default binaries can only link to the embedded dylibs referenced in their bundle's code signature or platform binaries). So yes, I phrased my response in the context of a specific subset of the issue (running code from the wrong library) because all the other ways it can happen are (modulo bugs) prevented by construction.

Package managers do not modify the base system on macOS, and the binaries installed by them will not have code signatures that are trusted by executables built with the default ecosystem's policy, so they cannot impact binaries outside of their control unless those binaries have been specifically opted into a reduced runtime mode, which makes the chance of an ABI break way lower (but not 0) than on Linux... I don't even understand what is controversial about that statement, the systems are built with different goals and engineering trade offs.

And before you tell me about how some system upgrade broke homebrew... of course it did. Many (most?) homebrew packages are opted into policies much closer to the behavior on Linux (flat namespaces, undefined dynamic_lookup, opted out of hardened runtime) because they depend on Linux-like semantics due to being multi-platform. That also means they get none of the protections afforded by the system to prevent these issues. If they adopted two level namespaces, @rpath, limiting usage to APIs available in public SDK headers, and using min deployment targets they would be far more resilient to system upgrades, but that would also entail an ongoing maintenance burden for packages primarily developed on and for Linux.

IOW, if your point is dynamic linking provides primitives to build fragile systems, then sure, I agree. If your point is all dynamically linked systems are significantly more fragile than statically linked systems I disagree (though I can actually point out cases where ABI mismatches have occurred in both static and dynamic binaries). If your point is that a system that allows fragile behaviors to be opted into by power users is inherently more fragile for normal users, then I also disagree (though I concede it may be more fragile for those power users).

A package update to an important shared library in a Linux distro is the equivalent of a system update on the Mac. If I read you right, you are correct in saying that, in a long-winded way.

I feel like you may have some Mac fetishism going on that is leading you to see those two as more distinct than they are. The topic at hand is an ABI break after a dynamic library gets updated legitimately by its vendor. Code signing and "injection" is tangential.

It is not fetishism to point out that something behaves differently than on Linux, especially in a story about a technology introduced on Apple platforms.

Technologies don't exist in a vacuum. This thread pointed out a problem with dynamic libraries that cannot occur on Apple platforms because dynamic linking does not exist in a void but is part of an ecosystem that exists beyond the dynamic linker. The fact that in that context you keep insisting on ignoring any technology not present on Linux is sort of baffling.

> Static linking prevents that, at the cost of disk space and memory [...]

Some deduplication work for disk and memory pages should be able to help here?

(It might need compiler and linker support to produce binaries that are easier to deduplicate. Eg you might want to disable unused-function elimination for your static libraries and restrict whole program 'Link Time Optimization'? Or you might want your deduplicator to be a bit smarter and store diffs internally, like git does? Or your build system can work this way in the first place and produce diffs against common bases.

I don't know what's optimal or even practical, but you can do static linking and still save memory and disk space.)
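
A rough sketch of the page-level idea, to make it concrete. This is a toy model, not how any real kernel or filesystem deduplicator is implemented: the binaries are just byte strings, the "library" is generated filler, and `page_hashes` and `shared_pages` are hypothetical helper names. It assumes 4096-byte pages and counts pages whose contents are byte-identical across two images.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this sketch


def page_hashes(image: bytes) -> list:
    """Hash each fixed-size page of a binary image."""
    return [
        hashlib.sha256(image[off:off + PAGE_SIZE]).digest()
        for off in range(0, len(image), PAGE_SIZE)
    ]


def shared_pages(a: bytes, b: bytes) -> int:
    """Count distinct page contents that appear in both images."""
    return len(set(page_hashes(a)) & set(page_hashes(b)))


# Stand-in for a statically linked library: 8 pages of distinct filler bytes.
LIB = b"".join(hashlib.sha256(i.to_bytes(4, "little")).digest() for i in range(1024))

# Two apps that embed the library at a page-aligned offset...
app_a = b"A" * PAGE_SIZE + LIB
app_b = b"B" * PAGE_SIZE + LIB
# ...and one where the library lands 100 bytes into the image.
app_c = b"C" * 100 + LIB

if __name__ == "__main__":
    print(shared_pages(app_a, app_b))  # all library pages line up
    print(shared_pages(app_a, app_c))  # shifted copy shares nothing
```

The misaligned case is the reason some toolchain cooperation would be needed: identical library code only dedupes at page granularity if it ends up byte-identical and at the same offset within a page in every binary, which inlining, LTO, and dead-code elimination all work against.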

In principle, yes? But it's a much more roundabout way of saving space. Reducing or avoiding optimizations like LTO or unused function elimination is at odds with minimizing binary sizes and maximizing performance. It's asking developers to prioritize the disk usage of the system as a whole over the performance of their own software.

You are right. But it's the same roundabout way that git is using.

Older version control systems like subversion used to store diffs.

Users of Git care a lot about diffs between versions. And typically treat a specific commit as if it was a diff. Commands like 'git cherry-pick' reinforce that notion.

However, internally each git commit is a complete snapshot of the state of the repository. Diffs are created on the fly as derived data.

Now even more internally, git does store snapshots mostly as deltas. But to close the circle: those deltas have nothing to do with the diffs that users see.
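
To illustrate the snapshot-plus-derived-diff model, here is a toy content-addressed store. It is a sketch only: `Store`, `snapshot`, and `changed_paths` are made-up names, trees are flattened into a plain dict, and packfile delta compression is not modeled at all. The one real detail is the blob id, which follows git's SHA-1 scheme of hashing a `blob <size>\0` header followed by the content.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Object id for file contents, using git's SHA-1 blob scheme."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class Store:
    """Toy object store: identical content is stored exactly once."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        oid = blob_id(data)
        self.objects[oid] = data
        return oid


def snapshot(store: Store, files: dict) -> dict:
    """A 'commit' here is a full mapping of path -> blob id."""
    return {path: store.put(data) for path, data in files.items()}


def changed_paths(old: dict, new: dict) -> list:
    """The diff is derived on demand by comparing two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


if __name__ == "__main__":
    store = Store()
    v1 = snapshot(store, {"a.txt": b"one\n", "b.txt": b"two\n", "c.txt": b"three\n"})
    v2 = snapshot(store, {"a.txt": b"one\n", "b.txt": b"TWO\n", "c.txt": b"three\n"})
    print(changed_paths(v1, v2), len(store.objects))
```

Both snapshots are complete, yet the unchanged files cost nothing extra because they hash to the same object. The user never had to think in terms of diffs for the storage to be compact.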

This sounds very roundabout, but results in a simple and performant system, that doesn't ask the user to make compromises.

My suggestion for static libraries was along the same lines, if a bit clumsier.

I think deduping distinct but equal memory pages and marking them as copy on write is a relatively new feature in kernels. Easily 20 years later than shared libraries becoming common.

In this case you wouldn't need copy-on-write, because executable pages aren't writeable these days. https://en.wikipedia.org/wiki/W%5EX

I'd say from a kernel perspective they should be copy-on-write.

Firstly, the generalized kernel mechanism which scans for equal pages and de-dupes them [which by the way is disabled by default on Linux] probably doesn't care whether it's working on data or code. It seems like the primary use case at its introduction was KVM, where a guest kernel probably loads code pages and hence writes to them at some point, such as when it reads them from disk.

Second, someone can use mprotect(2) to make them writeable.
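
A small sketch of that second point, assuming a Unix-like system where `ctypes.CDLL(None)` exposes libc's `mprotect`. To stay safe it uses an anonymous data page rather than an executable one, so it only shows that page protections can be changed after the fact; a W^X policy may still refuse a writable and executable combination. The helper names are hypothetical.

```python
import ctypes
import mmap
import os

PAGE = mmap.PAGESIZE

# Assumption: libc is reachable through the process's own symbol table.
libc = ctypes.CDLL(None, use_errno=True)
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.mprotect.restype = ctypes.c_int


def set_protection(addr: int, length: int, prot: int) -> None:
    """Thin wrapper over mprotect(2) that raises OSError on failure."""
    if libc.mprotect(addr, length, prot) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def flip_to_writable_and_write() -> int:
    """Make a page read-only, flip it back to writable, then write to it."""
    region = mmap.mmap(-1, PAGE, prot=mmap.PROT_READ | mmap.PROT_WRITE)
    view = ctypes.c_char.from_buffer(region)
    addr = ctypes.addressof(view)  # page-aligned start of the mapping

    set_protection(addr, PAGE, mmap.PROT_READ)  # writes would now fault
    set_protection(addr, PAGE, mmap.PROT_READ | mmap.PROT_WRITE)

    region[0] = 0x90  # succeeds only because the page is writable again
    value = region[0]

    del view  # drop the exported buffer so the mapping can be closed
    region.close()
    return value


if __name__ == "__main__":
    print(hex(flip_to_writable_and_write()))
```

So a deduplicator cannot assume a page that is read-only now will stay that way; merged pages still need copy-on-write semantics in case a later protection change is followed by a write.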