by codesniperjoe 1246 days ago
Finally!

At least someone finally understands that static, fully predictable, reproducible builds are only a convenience feature for the attacker side.


Please don't throw out the baby with the bathwater.

Fully reproducible builds provide great assurance against supply chain attacks. But 100% reproducibility is in some cases a bit too much. What matters is whether the artifact can be easily proven to be functionally identical to the canonical one.

So I am 100% for a fully predictable sshd random-relink kit, producing unpredictable sshd binaries, but only as long as there is an instruction how to check that the sshd binary that allegedly came from it indeed could have come from it, and was not quietly replaced by some malicious entity.

> So I am 100% for a fully predictable sshd random-relink kit, producing unpredictable sshd binaries, but only as long as there is an instruction how to check that the sshd binary that allegedly came from it indeed could have come from it, and was not quietly replaced by some malicious entity.

You can easily verify the integrity of the object files that are used in the random relinking - they are included in the binary distribution, and are necessary to perform the relinking.

The debate of static vs dynamic linking is still going on, and a very strong argument against static linking has always been that upgrading vulnerable libraries is made difficult. But think of it: package managers already hold the meta-data of what links to what; object files can be distributed just as easily as shared objects; the last necessary step is to move the actual linking step from the kernel to the package manager.

On OpenBSD all static system binaries are compiled as static PIE, so they already benefit from ASLR. The issue, IIUC, is that ASLR only randomizes entire ELF sections relative to each other. In any executable or library, whether static or dynamic, the code is placed into one giant .text section, so the relative offsets remain static. In a dynamic executable all library dependencies are loaded separately, so at least each section of each library gets a unique base address. A leak that exposes the address of a function only leaks the address of other functions in that library, not every function in the process. But in a static executable all those libraries are also placed into the same .text section as the main program code, so a leak of any function address leaks all function addresses.
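To make the leak argument concrete, here is a minimal sketch in Python. It assumes a toy model in which the whole .text mapping gets one random base and every function sits at a fixed link-time offset from it. The function names and offsets are made up for illustration and do not come from any real binary.

```python
import random

PAGE = 0x1000

# Hypothetical link-time offsets inside one .text mapping (illustrative values).
TEXT_OFFSETS = {
    "main": 0x1000,
    "auth_check": 0x2340,
    "memcpy": 0x8F00,
    "system": 0x9A10,
}

def load_static_pie(rng):
    """Toy loader: pick one page-aligned base; offsets from it never change."""
    base = rng.randrange(1, 0x7FFFF) * PAGE
    return {name: base + off for name, off in TEXT_OFFSETS.items()}

def recover_all(leaked_name, leaked_addr):
    """Attacker's view: one leaked address plus known offsets gives everything."""
    base = leaked_addr - TEXT_OFFSETS[leaked_name]
    return {name: base + off for name, off in TEXT_OFFSETS.items()}

runtime = load_static_pie(random.Random())
guessed = recover_all("memcpy", runtime["memcpy"])
print(guessed == runtime)
```

If the offsets themselves differ per machine, as with random relinking, the attacker no longer has a usable TEXT_OFFSETS table to subtract from.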

In theory all functions, or more realistically groups of functions spanning page-size increments, could be dynamically located. The obvious way to achieve that would be to have multiple .text sections within a main executable or library. But off-hand I don't know if that's actually supported by ELF, or if so whether the standard tool chains and environments could easily support it.

The ELF spec certainly allows for multiple .text sections, and one can also use totally custom sections with the correct attribute too.

Any linker that could not handle multiple identically named sections is simply buggy. That said, it is normal for a linker to prefer to output only a single section of each name, but it is not difficult to get a linker to output multiple .text equivalent sections, especially if you make them have distinct names.

However, sections are not really what you want. PT_LOAD segments are, since those represent regions that get mmap'd contiguously. One can certainly put different executable sections into different segments.

I'm not 100% certain about how it works on OpenBSD, but on Linux, neither the kernel loader nor the loader embedded in the dynamic linker randomizes the segments independently. The problem is that for dynamically linked code, the .text needs to be able to reference the GOT and PLT via relative addresses, so those segments must be loaded at a known distance relative to the code. For simple static PIE executables this should not be needed [1], however if you start introducing multiple chunks of code loaded at random addresses, then you need to reintroduce similar concepts, as you cannot reference code in those other randomly placed chunks with a relative address.

Assuming things are at all similar in OpenBSD, to do what you are proposing you would need to mark groups of segments that must be loaded relative to each other, allowing other segment groups to be randomized with respect to each other. For code in one group to access globals or functions from the other, the linker would generate a GOT and PLT per group, similar to how dynamic linking works, but with simplifications since you know all the code that will be present, so you don't need to worry about interposing, etc. In theory each GOT could get away with having as few as one entry per other segment group. [2]

Of course you would need code to initialize these GOT values. Realistically the static ELF loader would need to be augmented to provide the program with information about where it placed each segment group. Then the static PIE libc could include code that reads these offsets and uses them to initialize the GOTs. If you use the one entry per segment group approach and place the GOTs at, say, the very start of each segment group, with the entries in segment group order, this would make for really simple initialization code. Of course, a more complicated relocation engine like a hyper stripped down dynamic linker would also be possible.
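A rough simulation of that scheme, in Python rather than real loader code, might look like the sketch below. Everything in it is hypothetical: the group names, symbols, displacements, and the idea that the loader hands back a dict of bases are stand-ins for illustration only. For simplicity each GOT here also holds an entry for its own group, which the real thing would not need.

```python
import random

GROUPS = ["app", "libc"]  # hypothetical segment groups, in a fixed order

# Hypothetical symbols: (owning group, link-time displacement inside the group).
SYMBOLS = {
    "main": ("app", 0x0400),
    "parse_packet": ("app", 0x1A80),
    "memcpy": ("libc", 0x0200),
    "malloc": ("libc", 0x3C00),
}

def load_groups(rng):
    """Stand-in for the augmented loader: an independent random base per group."""
    slots = rng.sample(range(1, 1 << 20), len(GROUPS))
    # 1 MiB slots keep groups from overlapping (displacements are far smaller).
    return {group: slot * 0x100000 for group, slot in zip(GROUPS, slots)}

def init_gots(bases):
    """Startup code: fill each group's GOT with one base per group, in order."""
    return {group: [bases[other] for other in GROUPS] for group in GROUPS}

def resolve(gots, caller_group, symbol):
    """What generated code would do: load the GOT entry, add the displacement."""
    target_group, displacement = SYMBOLS[symbol]
    return gots[caller_group][GROUPS.index(target_group)] + displacement

bases = load_groups(random.Random())
gots = init_gots(bases)
print(hex(resolve(gots, "app", "memcpy")))
```

The initialization loop is trivial precisely because the linker already knows every displacement; only the per-group bases are unknown until load time.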

Footnotes:

[1]: Apparently on Linux even static PIE executables have some amount of runtime relocation code that is needed (I'm not really sure what/why).

[2]: This is because the linker would know the exact offsets of functions and variables within each segment group, so the code can simply load the other segment's pointer into a register using relative addressing, and do the load/store/jump with that register plus the already known displacement into that segment group.

> You can easily verify the integrity of the object files that are used in the random relinking - they are included in the binary distribution, and are necessary to perform the relinking.

I don't understand the full logic here. Yes I can authenticate the object files. But how would you discern, after a possible intrusion, an "sshd" binary that is indeed a random combination of these objects, from a trojaned "sshd" binary?

A local package manager that performs the linking can save the hash of the result.
And what if the exploited program acquires root and changes the hash?
Limiting the scope of the damage that root can cause is an open problem, orthogonal to verifiable builds. OpenBSD has some basic checks in place (securelevel), but you should still assume that a compromised host is, well, compromised.

The weak link in reproducibility is that you currently have no trivial way of recreating the same random order of the linked object files.

Currently the random relinking is implemented literally through a call to "| sort -R" (-R for random order) on the list of object files, passed as arguments to the linker. I suppose if sort -R took a seed argument that was saved somewhere safe (chmod 400), the linking order can still be reproduced, and the resulting executable checksummed against the state of the system.
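As a sketch of the seeded idea (in Python, not the actual relink script), the order can be derived from a saved seed and recomputed later. The object names, the seed value, and the fake_link stand-in for the linker are all assumptions for illustration; a real check would rerun the real linker with the recovered order.

```python
import hashlib
import random

def link_order(objects, seed):
    """Deterministic shuffle: same seed and same object set give the same order."""
    order = sorted(objects)            # canonical starting point
    random.Random(seed).shuffle(order)
    return order

def fake_link(order, blobs):
    """Stand-in for the linker: concatenates object contents in link order."""
    return b"".join(blobs[name] for name in order)

def digest(data):
    return hashlib.sha256(data).hexdigest()

# Hypothetical object files and contents.
blobs = {"auth.o": b"AUTH", "kex.o": b"KEX", "sshd.o": b"SSHD", "packet.o": b"PKT"}
seed = b"per-host secret seed"         # would live somewhere root-only

installed = fake_link(link_order(blobs, seed), blobs)
recorded = digest(installed)

# Later: rebuild from the saved seed and compare against the recorded hash.
rebuilt = fake_link(link_order(blobs, seed), blobs)
print(digest(rebuilt) == recorded)
```

One caveat of this sketch: Python only guarantees the same shuffle for the same seed within a compatible interpreter version, so a real tool would want its own fixed shuffle algorithm.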

The hash can be uploaded somewhere. Alternatively a seed to randomize the executable can be derived from an immutable token like a serial number.

Yet another solution is to re-sort the executable into a stable order and compare the hash of that.
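A toy version of that comparison, in Python, is sketched below. It assumes the linked image can be split back into named chunks whose bytes do not depend on where they were placed, which is a big simplification: in a real executable, relocated addresses inside the code change with the layout and would have to be normalized first. The chunk names and contents are made up.

```python
import hashlib

def canonical_digest(chunks):
    """Hash (name, bytes) chunks in sorted-name order, ignoring link order."""
    h = hashlib.sha256()
    for name, blob in sorted(chunks):
        encoded = name.encode()
        # Length prefixes keep chunk boundaries unambiguous in the hash input.
        h.update(len(encoded).to_bytes(4, "big"))
        h.update(encoded)
        h.update(len(blob).to_bytes(8, "big"))
        h.update(blob)
    return h.hexdigest()

# Two hypothetical relinks of the same objects in different orders.
host_a = [("kex.o", b"KEX"), ("auth.o", b"AUTH"), ("sshd.o", b"SSHD")]
host_b = [("sshd.o", b"SSHD"), ("kex.o", b"KEX"), ("auth.o", b"AUTH")]
trojaned = [("sshd.o", b"SSHD"), ("kex.o", b"KEX"), ("auth.o", b"EVIL")]

print(canonical_digest(host_a) == canonical_digest(host_b))
print(canonical_digest(host_a) == canonical_digest(trojaned))
```

Under these assumptions every honest relink maps to the same digest, so it can be checked against a single published value.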

It’s not that much of a “finally”. OpenBSD got kernel boot time relinking in 2017 (https://marc.info/?l=openbsd-tech&m=149887978201230&w=2). This extends it to an outward-facing executable.

I guess the holy grail would be to combine this with hot patching (https://en.wikipedia.org/wiki/Patch_(computing)#HOT-PATCHING), and relink the kernel every now and then while it is running (currently, a system under attack would have to be rebooted every now and then, and that’s undesirable). That would face ‘a few’ technical hurdles, though.

Yeah I was just thinking this; I've got like years of uptime on my OpenBSD server--don't know how much boot time relinking is helping me. But for like, desktops and laptops, it's fine and a great feature IMO (you probably wade through a lot more muck on a personal machine)
If you have years of uptime on an OpenBSD machine you are not keeping it up to date.

I have to admit I am guilty of this as well, but any maintained OpenBSD setup should have an uptime of no more than 6 months, and a well maintained OpenBSD setup will have less than that as security patches are applied.

Having said that, one of the things I like about OpenBSD is that if you want to go dark and have an ultra stable system (no updates ever), all the pieces are there for you. (You will want to have the source; I would also make sure I have the ports tree for that release and a copy of the ports dist files.)

This is true; my VPS has some kind of problem updating a FDE machine and I've procrastinated doing something about it for years. The answer is probably putting everything on tarsnap and reinstalling.
It is possible to get your reproducible build, 100% identical, and then random-relink it.
I mean they're also great for being able to symbolicate crash logs and such.