Hacker News new | ask | show | jobs
by saagarjha 2264 days ago
> The below patch fixed the issue:

          struct {
                  uint8_t         dac_state;
  -               int             dac_rd_index;
  -               int             dac_rd_subindex;
  -               int             dac_wr_index;
  -               int             dac_wr_subindex;
  +               uint8_t         dac_rd_index;
  +               uint8_t         dac_rd_subindex;
  +               uint8_t         dac_wr_index;
  +               uint8_t         dac_wr_subindex;
                  uint8_t         dac_palette[3 * 256];
                  uint32_t        dac_palette_rgb[256];
          } vga_dac;
> The VGA device emulation in bhyve uses 32-bit signed integer as DAC Address Write Mode Register and DAC Address Read Mode Register. These registers are used to access the palette RAM, having 256 entries of intensities for each value of red, green and blue. Data in palette RAM can be read or written by accessing DAC Data Register.

> After three successful I/O access to red, green and blue intensity values, DAC Address Write Mode Register or DAC Address Read Mode Register is incremented automatically based on the operation performed. Here is the issue, the values of DAC Address Read Mode Register and DAC Address Write Mode Register does not wrap under index of 256 since the data type is not 'uint8_t', allowing an untrusted guest to read or write past the palette RAM into adjacent heap memory.

Ugh, this looks like an ugly patch :( How about not letting the index overflow in the first place?

> Though FreeBSD does not have ASLR

Why not?!

3 comments

re: "How about not letting the index overflow..."

I'm fairly certain a real hardware VGA implementation would maintain that index as an unsigned 8-bit value. There's no letting the index overflow, per se. It should overflow back to zero when incremented at 0xFF. The mistake was representing it as anything other than an 8-bit number because that's what the hardware really would have done.

If I had the patience (and a machine close at hand) I'd boot MS-DOS and run thru a little test in DEBUG to see how it acts on a real card to write beyond address 0xFF via the auto-increment feature.

Most of the palette-cycling / setting code I ever wrote just updated the entire palette at once and stopped. I can't think of any particular reason why somebody would want to overflow the DAC write index but it should work. (I suppose there's probably some demoscene code out there somewhere that repeatedly updates the entire palette, outputting to the address write register once then just banging on 0x3c9 repeatedly...)

I'm fairly certain a real hardware VGA implementation would maintain that index as an unsigned 8-bit value.

I've found an authoritative source - a datasheet for an IBM RAMDAC used for their VGA-compatible video cards:

ftp://retronn.de/docs/pc_hardware/DACs/IBM37RGB524.pdf

    An increment past 0xff will "wrap around" to 0x00.
I suppose there's probably some demoscene code out there somewhere that repeatedly updates the entire palette, outputting to the address write register once then just banging on 0x3c9 repeatedly...

I haven't looked, but definitely check the "tiny" (512b and below) category if you want to find tricks like these. I'm also reasonably certain there's going to be at least one which does that for a palette animation effect, taking advantage of the wraparound so it won't have to reset the write index again (and thus saving some precious bytes.)

Here's an interesting discussion I found about the behaviour of the DAC ports on actual hardware, also from an emulation perspective (DOSbox):

https://github.com/joncampbell123/dosbox-x/issues/502

> If I had the patience (and a machine close at hand) I'd boot MS-DOS and run thru a little test in DEBUG to see how it acts on a real card to write beyond address 0xFF via the auto-increment feature.

Good excuse to load up a bunch of old games and see how they work, or fail to. The old Raymond Chen test: Grab a bunch of old software and let the application programmers bang on the emulation layer. Abandonware and old shareware on archive.org makes this a lot cheaper than you might imagine.

https://archive.org/details/softwarelibrary_msdos

Don't forget all the demoscene productions --- they'll use all the tricks in book, and then some. They'll be the first to break if the emulation of the hardware isn't perfect.

This masterpiece is a great example (but targets hardware which is older than what the hypervisor in the article can emulate): https://news.ycombinator.com/item?id=16938029

> The mistake was representing it as anything other than an 8-bit number because that's what the hardware really would have done.

I would be significantly more pleased if this was the documented, intended behavior that was being emulated intentionally, since the previous choice leaves me a bit worried that this is just an instance of “hey we can just use this type as a convenient index that doesn’t go out of bounds”.

The sibling comment to yours has a link to such documentation, and a link to a really nice discussion re: reverse engineering this behavior on various clone VGA chipsets. The IBM RAMDAC documentation is pretty authoritative, IMO.
This doesn't let the index overflow with the minimum amount of hassle. The behavior of unsigned overflow is well-defined in C, so there's no point writing code with an explicit check that is tricky to do correctly.
I don't know, I think it's kind of messy to rely on overflow behavior to ensure that your element is in bounds. What if the size of the array changes later? There's no clear linkage between the two in my mind.
Keep in mind that this is emulating old hardware. See: http://www.osdever.net/FreeVGA/vga/colorreg.htm If the size of the array changed, it would no longer be compatible with the hardware it's trying to emulate. I understand what you're trying to say and agree in general, but this is a good example of "the exception to the rule".
I guess I’m unhappy with the linkage between “this is an 8-bit value” and “this thing just happens to wrap at 256”; I would be much more satisfied if there was a clear “byte” rationale for the latter as well, which there very well might be but it’s not clear contextually.

To put it differently, I’m fine with relying on the overflow behavior, but only when the semantics have an actual match. For example, I rely on it for the implementations 32-bit operations in a VM I’m writing, but I feel this is OK because the spec for the architecture clearly defines this as the intended behavior and it’s not just “this used to be an int but I realized the array was 256 big so I can abuse an unsigned byte to never have an invalid index”.

The array is 256 entries big, because it's a palette for an 8-bit VGA color mode, where each pixel is an index into it. The link is very obvious and natural here.
No, it's natural. The conventions are different in different areas of software development so it may look weird to different groups.
I’m not sure what group I’d need to be in to think of this patch as natural without qualification…
Time it looks like. Future versions are getting it apparently.

https://wiki.freebsd.org/ASLR

What privileges does bhyve run under? Exploiting a properly sandboxed QEMU does give you access to some potentially interesting file descriptor but, unless you can use them to get kernel code execution, your process will not have access to any resources on the host that wouldn't be already accessible from inside the VM.
Almost all attacks these days are a chain of exploits that each focus on jumping code execution to a new context or gaining a new privilege, that only when combined give you the access you're looking for as an attacker.
Doesn't running it with KVM enabled put you back into kernel space? I'm sure doing everything in userspace is safer, but TCG is nowhere near as performant...
KVM is indeed one of the potentially interesting file descriptors, but there is relatively little code running in kernel space.
The vast majority of the device emulation is in user space, even with KVM.
There's a section on sandbox escapes.
But

> It is disabled by default.

Why?

FreeBSD has a very strong commitment to stability and making things work in new releases that working in old releases. This results in fairly conservative defaults.
"Regression in FreeBSD 12.1-RELEASE-p4: the exploit I was using in -p3 is no longer usable."
Judging by the fact that at least ntpd gets broken by it I suspect it's again more a time issue. It'll take time to make sure that anything broken by it either gets fixed or properly documented as needing a work-around. Otherwise when it first ships it's going to break a lot of systems when people switch.
It is trivially bypassed and has some negative performance impact. So why enable it by default? It is left as an option for the paranoid and for "checkbox compliance" type applications.
Trivially bypassed if you have an address leak…
No. Trivially bypassed without a leak. Two of many examples:

https://dl.acm.org/doi/10.5555/3195638.3195686

https://www.vusec.net/projects/anc/

New timing attacks that break ASLR come out ~annually. These are hardware mechanisms that cannot be mitigated by software. ASLR is broken. (Nevermind ROP gadget compilers, etc.)

HardenedBSD (FreeBSD fork) has had ASLR and other mitigations since forever. Shawn submitted a patch that was never merged because of mailing list politics or something of that sort + people afraid it was going to break the world.

https://reviews.freebsd.org/D473

> politics or something of that sort

I suppose you could classify "the patch doesn't work and breaks other things besides" that way.

HardenedBSD got the userland running fine. It was iirc mostly irrational fear by the FreeBSD team that someone's bad application would break.
Nope. The code had major problems that were highlighted during code review, and Shawn never fixed them.
That’s not what happened.