Hacker News
Linux kernel heap quarantine versus use-after-free exploits (a13xp0p0v.github.io)
119 points by kmwyard 2027 days ago
6 comments

I'm quite impressed with Windows 10's userland heap randomization. I never do any work on Linux, and especially not in the kernel. But maybe the authors can take some inspiration from the work there?

https://www.blackhat.com/docs/us-16/materials/us-16-Yason-Wi...

https://www.blackhat.com/docs/us-16/materials/us-16-Yason-Wi...

It's still not perfect, given the predictable nature of block coalescence. But it's quite a departure from the old manager.

Wow, I didn't realize they've introduced a newer heap implementation (seems to be in Windows 10 version 2004).

For anyone else wondering, it seems to be possible to turn it on via a manifest:

https://chromium.googlesource.com/chromium/src/+/1a9f684dad2...

Now I'm curious if there's any way to turn it on dynamically from inside a traditional Win32 process.

Nope. It's been requested though https://github.com/microsoft/WinDev/issues/39

You can also just edit the embedded manifest of a traditional Win32 executable as a hack before runtime :/

Linux has had this a little longer, since 2005 or so:

https://en.wikipedia.org/wiki/Address_space_layout_randomiza...

However, even if the “heap is randomized”, that typically just means that the underlying blocks of memory (e.g. what mmap returns) have randomized virtual addresses. The allocator you use will still have to carve up those blocks from the OS—your kernel does not provide malloc(), after all.
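To make the point above concrete, here is a minimal toy sketch (not any real allocator). `random_base` and `BumpAllocator` are hypothetical names standing in for a randomized mmap() result and a naive user-space allocator. It only illustrates that a random base alone leaves the offsets between allocations unchanged.

```python
import random

PAGE = 4096

def random_base(rng):
    # Stand-in for an ASLR-randomized mmap() result: random but page-aligned.
    return rng.randrange(0x1000, 0x100000) * PAGE

class BumpAllocator:
    """Toy allocator that carves one OS-provided block sequentially."""

    def __init__(self, base):
        self.base = base
        self.next = base

    def malloc(self, size):
        addr = self.next
        self.next += (size + 15) & ~15  # round each request up to 16 bytes
        return addr

def run(seed):
    # Same allocation sequence, different "boot" (different random base).
    heap = BumpAllocator(random_base(random.Random(seed)))
    a = heap.malloc(24)
    b = heap.malloc(100)
    c = heap.malloc(8)
    return heap.base, (b - a, c - b)

base1, offsets1 = run(1)
base2, offsets2 = run(2)
print(hex(base1), offsets1)
print(hex(base2), offsets2)
```

In this model the two runs start from independently chosen bases, yet the distances between neighbouring allocations are identical, which is the property layout randomization inside the allocator is meant to remove.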

Userland heap randomization on Windows is different from ASLR. Windows 10 actually randomizes the heap layout, not just the base offset, by filling larger chunks in random order. So even the offsets between allocations don't stay constant.
Why doesn't Linux do that? It seems like an obvious next step.
Because the Linux of today is not the product of a company or person with a focused vision but a conglomerate of devs from various foundations or corporations, each with their own agenda. A lot of things may seem "obvious" but depending who's at the helm, other stuff gets prioritized.

I remember, way back in college, trying to listen to music while my machine was torrenting a movie on the same HDD. Playback would get totally borked because the queuing scheme of the Linux I/O scheduler at that time could not cope with this (it was tuned for server operations), even though both Windows and macOS had handled this scenario without breaking a sweat since, like, forever.

IIRC the BFQ scheduler finally addressed this later, but I'm not 100% sure.

What's the goal of ASLR when software can still work around it, e.g. game cheats that "just" calculate those addresses?
ASLR is part of a larger suite of bar-raising security mechanisms [1]

[1] https://en.wikipedia.org/wiki/Defense_in_depth_(computing)

Once you can execute arbitrary code it's easy to figure out. The point is that it makes it harder to get a vulnerability to the point where it executes arbitrary code in the first place.
We're talking about different security models. Whereas anti-cheat companies seek to prevent code execution or introspection in a certain context (the game) by a user that already has full privileges over the machine, other mitigations seek to prevent privilege escalation or initial access. For example (simplification), a remote exploit that relies on a specific address to work would now have to find an additional information leak. Similarly, code running in usermode cannot simply* know the addresses of objects in the kernel.

* In practice this isn't really true, and there are many ways to bypass KASLR
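As a rough illustration of why an information leak matters here: ASLR moves a whole image as a unit, so one leaked pointer plus offsets known from the binary is enough to recover everything else. The function name and all the numbers below are made up for the example.

```python
def rebase(leaked_ptr, known_offset, target_offset):
    """Recover a randomized base from one leaked pointer.

    leaked_ptr    : runtime address of some symbol (obtained via a leak)
    known_offset  : that symbol's fixed offset inside the image
    target_offset : fixed offset of whatever we want to locate
    """
    module_base = leaked_ptr - known_offset
    return module_base, module_base + target_offset

# Hypothetical values, purely for illustration.
base, target = rebase(0x7F3A12345678, 0x5678, 0x1000)
print(hex(base), hex(target))
```

Without the leak, `leaked_ptr` is unknown to a remote attacker, which is the extra step the mitigation forces. A cheat running with full privileges on the same machine can simply read it.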

Maybe it should.
I think the current design comes from the fact that calling the kernel is expensive so it's best not to do it too often.

Maybe with the new ways of interacting with the kernel, like io_uring it can be cheaper.

Perhaps it could be provided by a vDSO, that would then decide if and when to call into the kernel proper.
Well, io_uring is cheaper (significantly!) precisely because it's for async operations, which can be pipelined. Application logic almost never uses malloc() in an asynchronous, pipelineable fashion.
I’d argue that’s exactly what happens, especially during object initialisation in C++.
Editing page mappings is also expensive. I think regardless of the mechanism by which the kernel hands out pages, you will pay that cost.
Still, it would eliminate a whole class of bugs, even with insecure languages such as C.
It may eliminate a whole class of vulnerabilities, not bugs - the bugs would still be there (the program would not behave as expected), but they may not be usable anymore for arbitrary code execution or data smuggling.
The Windows heap seems to have fairly straightforward security features; there doesn't seem to be a lot of randomization other than the heap base. Did I miss something in those two documents?
After 18 allocations on the backend allocator, the LFH (Low Fragmentation Heap) will kick in. The LFH randomizes the returned chunks of the same size, regardless of whether they were freed in the meantime. That makes it hard to exploit: you need to groom the heap before the LFH kicks in, because afterwards the unpredictability becomes too large.

If you change the size of your request you will go either to another bucket or the backend allocator. So that makes type confusions slightly harder if you don't control variably sized buffers.
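A toy model of the two properties described so far (size buckets, and random slot selection within a bucket) might look like the sketch below. This is not how the Windows LFH is implemented. The class names, the 16-byte size classes, the 64 slots per bucket and the base addresses are all assumptions made for illustration, and the activation threshold is left out.

```python
import random

class ToyBucket:
    """One size class: equal-sized slots handed out in random order."""

    def __init__(self, base, slot_size, slots, rng):
        self.base, self.slot_size, self.rng = base, slot_size, rng
        self.free_slots = list(range(slots))

    def alloc(self):
        # Pick any free slot at random, not the most recently freed one.
        i = self.rng.randrange(len(self.free_slots))
        return self.base + self.free_slots.pop(i) * self.slot_size

    def free(self, addr):
        self.free_slots.append((addr - self.base) // self.slot_size)

class ToyHeap:
    def __init__(self, seed=0, slots=64):
        self.rng = random.Random(seed)
        self.slots = slots
        self.buckets = {}

    def bucket_for(self, size):
        size_class = (size + 15) // 16 * 16  # hypothetical 16-byte classes
        if size_class not in self.buckets:
            base = 0x100000 * (len(self.buckets) + 1)
            self.buckets[size_class] = ToyBucket(base, size_class, self.slots, self.rng)
        return self.buckets[size_class]

    def malloc(self, size):
        return self.bucket_for(size).alloc()

    def free(self, addr, size):
        self.bucket_for(size).free(addr)

def reuse_rate(heap, size, trials=2000):
    """How often does free-then-malloc hand back the very same address?"""
    hits = 0
    for _ in range(trials):
        a = heap.malloc(size)
        heap.free(a, size)
        b = heap.malloc(size)
        heap.free(b, size)
        hits += (a == b)
    return hits / trials

print(reuse_rate(ToyHeap(seed=0), 32))
```

With 64 slots per bucket, the chance of getting the same slot back right after freeing it is about 1 in 64 in this model. Changing the request size lands in a different bucket entirely.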

Even before those 18 allocations, the specific tricks that used to work on the old manager, whose behavior is apparently very similar to that of the Linux kernel, no longer work. It used to be that you could reliably get the same memory address back if you allocated a same-sized structure immediately after freeing another. That no longer works.
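The "same address right after a free" behavior is what you get from a plain last-in-first-out free list. Here is a minimal sketch of that old-style behavior. `LifoCache` is a hypothetical toy, not the Windows heap manager or the kernel slab code.

```python
class LifoCache:
    """Toy fixed-size object cache with a LIFO free list."""

    def __init__(self, base, obj_size, count):
        # Reversed so that pop() hands out the lowest address first.
        self.freelist = [base + i * obj_size for i in reversed(range(count))]

    def alloc(self):
        return self.freelist.pop()

    def free(self, addr):
        # The most recently freed object goes on top of the stack.
        self.freelist.append(addr)

cache = LifoCache(base=0x1000, obj_size=64, count=8)
victim = cache.alloc()
cache.free(victim)        # a dangling pointer to `victim` may still exist
attacker = cache.alloc()  # same size, requested immediately afterwards
print(hex(victim), hex(attacker))
```

In this model the second allocation lands exactly where the freed object was, which is what makes the classic use-after-free pattern reliable.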

The only technique that I currently know of is that the merging of free blocks on the backend allocator is still deterministic.

But to use that you must be sure your target has not already triggered LFH for the sizes that you're using while grooming.

Respect for publishing the results of something that didn't work. Now somebody will be able to either look for another type of solution or try to improve this one without wasting time on reimplementing the same thing again.

Good job.

If I'm not mistaken, science publishing doesn't work like this. Only good results are published - successful failures aren't.

This sounds a lot like what OpenBSD malloc has been doing, but via a different mechanism. OpenBSD malloc also tries to avoid handing out the same memory after a free.

https://ftp.openbsd.org/papers/eurobsdcon2009/otto-malloc.pd...
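The general idea of not handing freed memory straight back can be sketched as a delay queue in front of the free list. To be clear, this is a simplified FIFO toy with invented names and parameters. It is not OpenBSD malloc's code and not the kernel patch from the article.

```python
from collections import deque

class DelayedFreeCache:
    """Toy cache where freed objects wait in a FIFO before being reusable."""

    def __init__(self, base, obj_size, count, delay):
        self.freelist = [base + i * obj_size for i in reversed(range(count))]
        self.quarantine = deque()
        self.delay = delay  # hypothetical: how many frees an object waits out

    def alloc(self):
        return self.freelist.pop()

    def free(self, addr):
        self.quarantine.append(addr)
        if len(self.quarantine) > self.delay:
            # The oldest quarantined object finally becomes reusable.
            self.freelist.append(self.quarantine.popleft())

cache = DelayedFreeCache(base=0x1000, obj_size=64, count=16, delay=4)
a = cache.alloc()
cache.free(a)
b = cache.alloc()  # does not get `a` back immediately
print(hex(a), hex(b))
```

Note that in this simple FIFO version, anyone who can trigger enough additional frees pushes the object out of the queue again, so the delay raises the cost of reuse rather than preventing it.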

The kernel should adopt something like Chromium's PartitionAlloc. Jann's idea about type is basically the same idea.

https://chromium.googlesource.com/chromium/src/+/refs/heads/...

How is that different from the different caches already present in the Linux kernel? For example, `struct cred` is always allocated in the `cred_jar` cache and nothing else goes there.
Oh, I wasn't aware. Yes, it's the same thing.
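For readers unfamiliar with dedicated caches, the property being discussed can be modelled like this. It is a toy in Python, not the kernel's kmem_cache API. `TypedCaches` and the `msg_buffer` type name are made up, and only `cred` comes from the comment above.

```python
class TypedCaches:
    """Toy allocator with one separate free list per object type."""

    def __init__(self, obj_size=64, count=8):
        self.obj_size, self.count = obj_size, count
        self.freelists = {}

    def _freelist(self, type_name):
        if type_name not in self.freelists:
            # Each type gets its own, non-overlapping address range.
            base = 0x10000 * (len(self.freelists) + 1)
            self.freelists[type_name] = [
                base + i * self.obj_size for i in reversed(range(self.count))
            ]
        return self.freelists[type_name]

    def alloc(self, type_name):
        return self._freelist(type_name).pop()

    def free(self, type_name, addr):
        self._freelist(type_name).append(addr)

caches = TypedCaches()
cred = caches.alloc("cred")
caches.free("cred", cred)
other = caches.alloc("msg_buffer")  # different type, same size
again = caches.alloc("cred")
print(hex(cred), hex(other), hex(again))
```

A freed `cred` slot can only be reoccupied by another `cred` in this model, so a dangling pointer never ends up pointing at attacker-controlled data of a different type. It can still point at a different object of the same type.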
Jann Horn is a genius. The sheer amount of productivity he has is off the charts.
Sure. But this is Alex Popov's work
Yet the accolades are not misplaced, when calling out a particular contribution

  > And the main kudos go to Jann Horn, who reviewed the
  > security properties of my slab quarantine mitigation and
  > created a counter-attack that re-enabled UAF exploitation
  > in the Linux kernel.
Correct me if I'm wrong, but I believe Kees Cook has been livestreaming some of the patch reviews of this work on Twitch.

https://www.twitch.tv/keescook

Past recordings can be found on YouTube: https://www.youtube.com/channel/UC6zmTkbgwe2q6l6TNjABSCg/vid...

EDIT: I realized it's mentioned in the blog post! But more airtime doesn't hurt :)