Tale of two hypervisor bugs – Escaping from FreeBSD bhyve | HN Mirror

Y	Hacker News new \| ask \| show \| jobs

	Tale of two hypervisor bugs – Escaping from FreeBSD bhyve (phrack.org)
	102 points by 2510c39011c5 2264 days ago

5 comments

saagarjha 2264 days ago

> The below patch fixed the issue:

          struct {
                  uint8_t         dac_state;
  -               int             dac_rd_index;
  -               int             dac_rd_subindex;
  -               int             dac_wr_index;
  -               int             dac_wr_subindex;
  +               uint8_t         dac_rd_index;
  +               uint8_t         dac_rd_subindex;
  +               uint8_t         dac_wr_index;
  +               uint8_t         dac_wr_subindex;
                  uint8_t         dac_palette[3 * 256];
                  uint32_t        dac_palette_rgb[256];
          } vga_dac;

> The VGA device emulation in bhyve uses 32-bit signed integer as DAC Address Write Mode Register and DAC Address Read Mode Register. These registers are used to access the palette RAM, having 256 entries of intensities for each value of red, green and blue. Data in palette RAM can be read or written by accessing DAC Data Register.

> After three successful I/O access to red, green and blue intensity values, DAC Address Write Mode Register or DAC Address Read Mode Register is incremented automatically based on the operation performed. Here is the issue, the values of DAC Address Read Mode Register and DAC Address Write Mode Register does not wrap under index of 256 since the data type is not 'uint8_t', allowing an untrusted guest to read or write past the palette RAM into adjacent heap memory.

Ugh, this looks like an ugly patch :( How about not letting the index overflow in the first place?

> Though FreeBSD does not have ASLR

Why not?!

EvanAnderson 2264 days ago

re: "How about not letting the index overflow..."

I'm fairly certain a real hardware VGA implementation would maintain that index as an unsigned 8-bit value. There's no letting the index overflow, per se. It should overflow back to zero when incremented at 0xFF. The mistake was representing it as anything other than an 8-bit number because that's what the hardware really would have done.

If I had the patience (and a machine close at hand) I'd boot MS-DOS and run thru a little test in DEBUG to see how it acts on a real card to write beyond address 0xFF via the auto-increment feature.

Most of the palette-cycling / setting code I ever wrote just updated the entire palette at once and stopped. I can't think of any particular reason why somebody would want to overflow the DAC write index but it should work. (I suppose there's probably some demoscene code out there somewhere that repeatedly updates the entire palette, outputting to the address write register once then just banging on 0x3c9 repeatedly...)

userbinator 2264 days ago

I'm fairly certain a real hardware VGA implementation would maintain that index as an unsigned 8-bit value.

I've found an authoritative source - a datasheet for an IBM RAMDAC used for their VGA-compatible video cards:

ftp://retronn.de/docs/pc_hardware/DACs/IBM37RGB524.pdf

    An increment past 0xff will "wrap around" to 0x00.

I suppose there's probably some demoscene code out there somewhere that repeatedly updates the entire palette, outputting to the address write register once then just banging on 0x3c9 repeatedly...

I haven't looked, but definitely check the "tiny" (512b and below) category if you want to find tricks like these. I'm also reasonably certain there's going to be at least one which does that for a palette animation effect, taking advantage of the wraparound so it won't have to reset the write index again (and thus saving some precious bytes.)

Here's an interesting discussion I found about the behaviour of the DAC ports on actual hardware, also from an emulation perspective (DOSbox):

https://github.com/joncampbell123/dosbox-x/issues/502

msla 2264 days ago

> If I had the patience (and a machine close at hand) I'd boot MS-DOS and run thru a little test in DEBUG to see how it acts on a real card to write beyond address 0xFF via the auto-increment feature.

Good excuse to load up a bunch of old games and see how they work, or fail to. The old Raymond Chen test: Grab a bunch of old software and let the application programmers bang on the emulation layer. Abandonware and old shareware on archive.org makes this a lot cheaper than you might imagine.

https://archive.org/details/softwarelibrary_msdos

userbinator 2264 days ago

Don't forget all the demoscene productions --- they'll use all the tricks in book, and then some. They'll be the first to break if the emulation of the hardware isn't perfect.

This masterpiece is a great example (but targets hardware which is older than what the hypervisor in the article can emulate): https://news.ycombinator.com/item?id=16938029

saagarjha 2263 days ago

> The mistake was representing it as anything other than an 8-bit number because that's what the hardware really would have done.

I would be significantly more pleased if this was the documented, intended behavior that was being emulated intentionally, since the previous choice leaves me a bit worried that this is just an instance of “hey we can just use this type as a convenient index that doesn’t go out of bounds”.

EvanAnderson 2263 days ago

The sibling comment to yours has a link to such documentation, and a link to a really nice discussion re: reverse engineering this behavior on various clone VGA chipsets. The IBM RAMDAC documentation is pretty authoritative, IMO.

int_19h 2264 days ago

This doesn't let the index overflow with the minimum amount of hassle. The behavior of unsigned overflow is well-defined in C, so there's no point writing code with an explicit check that is tricky to do correctly.

saagarjha 2264 days ago

I don't know, I think it's kind of messy to rely on overflow behavior to ensure that your element is in bounds. What if the size of the array changes later? There's no clear linkage between the two in my mind.

tonyarkles 2264 days ago

Keep in mind that this is emulating old hardware. See: http://www.osdever.net/FreeVGA/vga/colorreg.htm If the size of the array changed, it would no longer be compatible with the hardware it's trying to emulate. I understand what you're trying to say and agree in general, but this is a good example of "the exception to the rule".

saagarjha 2263 days ago

I guess I’m unhappy with the linkage between “this is an 8-bit value” and “this thing just happens to wrap at 256”; I would be much more satisfied if there was a clear “byte” rationale for the latter as well, which there very well might be but it’s not clear contextually.

To put it differently, I’m fine with relying on the overflow behavior, but only when the semantics have an actual match. For example, I rely on it for the implementations 32-bit operations in a VM I’m writing, but I feel this is OK because the spec for the architecture clearly defines this as the intended behavior and it’s not just “this used to be an int but I realized the array was 256 big so I can abuse an unsigned byte to never have an invalid index”.

int_19h 2263 days ago

The array is 256 entries big, because it's a palette for an 8-bit VGA color mode, where each pixel is an index into it. The link is very obvious and natural here.

de_watcher 2263 days ago

No, it's natural. The conventions are different in different areas of software development so it may look weird to different groups.

saagarjha 2263 days ago

I’m not sure what group I’d need to be in to think of this patch as natural without qualification…

simcop2387 2264 days ago

Time it looks like. Future versions are getting it apparently.

https://wiki.freebsd.org/ASLR

bonzini 2264 days ago

What privileges does bhyve run under? Exploiting a properly sandboxed QEMU does give you access to some potentially interesting file descriptor but, unless you can use them to get kernel code execution, your process will not have access to any resources on the host that wouldn't be already accessible from inside the VM.

monocasa 2264 days ago

Almost all attacks these days are a chain of exploits that each focus on jumping code execution to a new context or gaining a new privilege, that only when combined give you the access you're looking for as an attacker.

yjftsjthsd-h 2264 days ago

Doesn't running it with KVM enabled put you back into kernel space? I'm sure doing everything in userspace is safer, but TCG is nowhere near as performant...

bonzini 2263 days ago

KVM is indeed one of the potentially interesting file descriptors, but there is relatively little code running in kernel space.

monocasa 2264 days ago

The vast majority of the device emulation is in user space, even with KVM.

saagarjha 2264 days ago

There's a section on sandbox escapes.

saagarjha 2264 days ago

But

> It is disabled by default.

Why?

yjftsjthsd-h 2264 days ago

FreeBSD has a very strong commitment to stability and making things work in new releases that working in old releases. This results in fairly conservative defaults.

moonchild 2264 days ago

"Regression in FreeBSD 12.1-RELEASE-p4: the exploit I was using in -p3 is no longer usable."

simcop2387 2264 days ago

Judging by the fact that at least ntpd gets broken by it I suspect it's again more a time issue. It'll take time to make sure that anything broken by it either gets fixed or properly documented as needing a work-around. Otherwise when it first ships it's going to break a lot of systems when people switch.

loeg 2260 days ago

It is trivially bypassed and has some negative performance impact. So why enable it by default? It is left as an option for the paranoid and for "checkbox compliance" type applications.

saagarjha 2260 days ago

Trivially bypassed if you have an address leak…

loeg 2260 days ago

No. Trivially bypassed without a leak. Two of many examples:

https://dl.acm.org/doi/10.5555/3195638.3195686

https://www.vusec.net/projects/anc/

New timing attacks that break ASLR come out ~annually. These are hardware mechanisms that cannot be mitigated by software. ASLR is broken. (Nevermind ROP gadget compilers, etc.)

landr0id 2264 days ago

HardenedBSD (FreeBSD fork) has had ASLR and other mitigations since forever. Shawn submitted a patch that was never merged because of mailing list politics or something of that sort + people afraid it was going to break the world.

https://reviews.freebsd.org/D473

cperciva 2264 days ago

> politics or something of that sort

I suppose you could classify "the patch doesn't work and breaks other things besides" that way.

landr0id 2264 days ago

HardenedBSD got the userland running fine. It was iirc mostly irrational fear by the FreeBSD team that someone's bad application would break.

loeg 2260 days ago

Nope. The code had major problems that were highlighted during code review, and Shawn never fixed them.

gonzo 2264 days ago

That’s not what happened.

bArray 2264 days ago

I really wish phrack.org had an RSS feed so I can keep on top of their latest papers.

heavyset_go 2264 days ago

So do I!

takeda 2263 days ago

Any reason why phrack talks about 2016 vulnerability 4 years later? I was thinking that once security advisory is published it is a fair game.

lloeki 2263 days ago

Wondering if it directly or indirectly affects xhyve and Docker’s HyperKit.

takeda 2263 days ago

If they are based on it, the chances are high that they also are affected, but the vulnerabilities are somewhat old so probably already patched.

monocasa 2264 days ago

Interestingly this is the kind of thing Rust is great at protecting against, and why Firecracker is such a neat project.

kccqzy 2264 days ago

How does Rust prevent you from choosing the wrong size for an integer? It's easy to imagine a bug in Rust code in which someone is supposed to write "u8" as in this case, but didn't think carefully and just used "usize" (the most typical integer type used for indices).

monocasa 2264 days ago

The palette memory would be bounds checked, which is the real bug.

saagarjha 2264 days ago

It would prevent the out-of-bounds read and write, presumably.

justincormack 2264 days ago

Firecracker also doesn't have this sort of complex device emulation such as VGA, only minimal virtio devices.

monocasa 2264 days ago

At the moment.

dathinab 2264 days ago

And in the future. The project explicitly does not want to provide anything beyond the things required for it's use case (mostly non computation intensive course servers like used for serverless computing platforms). Part of it's upperformance comes from the thinks they don't support.

(Edit: as far as I know)

monocasa 2264 days ago

They're in the process of pulling the core out to share with VMMs that don't take the same asceticism, but to still allow what they at Amazon ship to be a small attack surface. This work is in the rust-vmm project.