Hacker News new | ask | show | jobs
by nigeltao 999 days ago
Wuffs author here.

It's not full-blown formal verification but it doesn't have to be. Unlike most formal verification projects that I've seen, the proof part of Wuffs isn't about correctness (proving that "this implementation satisfies the PNG specification"), it's about safety (proving that "this implementation won't read/write out of bounds"), which is much more tractable. Especially as the PNG spec (or WebP spec) doesn't mandate how to treat malicious input (like unbalanced Huffman tables).

Wuffs doesn't have a WebP decoder yet but it's literally in the Roadmap and I've previously written the golang.org/x/image/webp package in the Go programming language.

Wuffs does have a PNG decoder, PNG uses Deflate compression and Deflate and WebP's "decode a Huffman tree" data tables are very similar. Wuffs also has an equivalent of zlib's enough.c program mentioned in the original blog post to calculate worst-case memory requirements. Wuffs' version is called script/print-deflate-huff-table-size.go and it says that, for a 9-bit primary table, we need 852 table entries. Round that up to 1024, for nearest power-of-two.

If you look for HUFFS_TABLE_SIZE = 1024 and HUFFS_TABLE_MASK = 1023 in Wuffs' std/deflate source code, you'll notice that Wuffs' Deflate decoder isn't susceptible to the same problem as the C/C++ WebP decoder the original blog post discussed. This is because, unless the Wuffs compiler can prove otherwise, the code doesn't just look up the table like `this.huffs[0][i]`, it's like `this.huffs[0][i&mask]` and `i&mask` is always in bounds. The mask is either HUFFS_TABLE_MASK or it's of a refinement type guaranteed <= 511. The array is always (statically) allocated with 1024 entries even though enough.c says that, with dynamic allocation, we could possibly get away with a smaller table.

As you already know, for Wuffs (and unlike C/C++), taking out the `&mask` will lead to a compile-time error. Wuffs code won't compile unless the compiler can prove that array indexes are always within bounds.

From the grandparent:

> this WebP bug occurred because the largest table size was formally proven but it didn't match what was fed to the source code

With WebP+enough.c this 'largest table size' was calculated by exhaustive, brute-force search (not really a formal proof) but it was based on assumptions (balanced codes) that didn't match actual (malicious) input. Or, in zlib-the-library, there's other C code (https://github.com/madler/zlib/blob/ac8f12c97d1afd9bafa9c710...) that rejects unbalanced Huffman codes, but IIUC similar enforcement was (until very recently) missing in libwebp.

With Wuffs, even if the worst-case calculation was based on an incorrect model (or forgetting to separately reject unbalanced codes), the end result (on malicious input) might be "the wrong pixels" but it shouldn't be "buffer overflow".

4 comments

Hi Nigel, Thanks for interjecting (and of course Thanks for WUFFS). I'm glad to hear that WebP is "on the Roadmap" but I think this sort of work ought to have been considered by vendors as a must-have, high priority project, rather than being sort of vaguely welcomed sideshow that's not on their critical path.

Yes, I actually think that the choice to unequivocally prove safety rather than to attempt to prove that you implemented something specific (but with the risk that our specification is wrong), is almost invariably the right choice for this problem space. It would not have been appropriate for the TLS 1.3 protocol (which has a machine proof that it satisfies our intended criteria stated in the RFC, modulo the Selfie attack and assuming our cryptographic primitives all do what they said they do), but it's exactly the right thing for say a DOCX parser, or ZIP parser, or a JPEG compressor or similar for which WUFFS is the right choice.

Slightly off-topic but I can't resist the urge to ask.

Why is the logo (https://raw.githubusercontent.com/google/wuffs/main/doc/logo...) no longer shown in the readme?

I was pretty sure I had heard of the project before but I had to find the logo to be absolutely certain, it's truly one of a kind.

The logo is no longer shown because I think the tweet-screenshot is a more compelling introduction for people new to Wuffs.
> With WebP+enough.c this 'largest table size' was calculated by exhaustive, brute-force search (not really a formal proof) but it was based on assumptions (balanced codes) that didn't match actual (malicious) input.

Maybe I'm stretching the definition, but the exhaustive search is also a proof, especially when it can be done quickly or an efficient proof certificate can be generated (enough.c is the former). I even think that enough.c can be modified to generate a reasonably sized trace to aid verification.

As an intellectual 'puzzle', I'd be curious to see what that sort of trace-guided verification could look like, if that's even feasible. My recollection of enough.c is that, algorithmically, it's sufficiently complicated that I need to really think hard when I'm reading the code. But also, it uses clever data structures specifically to reduce the memory requirements at runtime.

In comparison, "array size is always 1024, array index is always bitwise-anded with 1023, therefore always in-bounds" is undeniably simple and, per "The Fastest, Safest PNG Decoder in the World", practical and fast.

While I agree that you don't need the tight bound (especially because I think the tight bound for incomplete trees is not much higher than one for complete trees), you do need to be confident that 1024 is either a loose but correct bound or highly unlikely to be reached in practice. So you need to calculate some kind of bounds anyway... And this is another reason I thought something more than Wuffs is desirable [1]; a dynamic memory allocation, when allowed, makes many things easier to implement and prove.

[1] https://news.ycombinator.com/item?id=37480019

1024 is a loose but correct bound. That's what enough.c (or script/print-deflate-huff-table-size.go) says.

Stepping back, I'm not sure if I understand what you're saying. Perhaps we're just disagreeing on how "formal" (as in, part of a "formal verification" process) the enough.c program (or equivalent) needs to be?

---

As for dynamic memory allocation, Wuffs has a mechanism for the callee (Wuffs code) to tell the caller (C/C++ glue code) that it needs N bytes of "scratch space" or "work buffer", where N is dynamically dependent (e.g. dependent on the image dimensions in pixels).

It's not "dynamic memory allocation" in that Wuffs still doesn't malloc/new and free/delete per se. But it is dynamically (run time, not compile time) sized.

Grepping for "work buffer" or "workbuf" in Wuffs' example/ or std/ code (or SkWuffsCodec.cpp) should give some leads, if you want to study some code. The workbuf_len is often zero but, for Wuffs' std/jpeg and std/png, it is positive. Manage the work buffer in the same way as the pixel buffer: it's always caller-owned and callee-borrowed.

> Stepping back, I'm not sure if I understand what you're saying. Perhaps we're just disagreeing on how "formal" (as in, part of a "formal verification" process) the enough.c program (or equivalent) needs to be?

Ah, I think I misremembered that bit of Wuffs code, specifically around `HUFFS_TABLE_SIZE`. The comment does mention enough.c but I thought `HUFFS_TABLE_SIZE` was also set to the tight bound found by it, which was not the case, and wondered why you seemed to downplay the importance of enough.c verification. I agree that you don't need any additional verification in the current code.

> As for dynamic memory allocation, Wuffs has a mechanism for the callee (Wuffs code) to tell the caller (C/C++ glue code) that it needs N bytes of "scratch space" or "work buffer", where N is dynamically dependent (e.g. dependent on the image dimensions in pixels).

Isn't it fixed during the decoding process though? If my understanding is correct `decode_frame` is not supposed to change what `workbuf_len` will return later (or examples don't account for them yet). Technically speaking you can already use the suspend mechanism to request a memory allocation of any size, so it is more about a matter of public API, but such code would be very regular and repetitive. I thought a limited mechanism to enable an in-situ dynamic allocation---preferably desugared---might improve a usability a lot.

> Isn't it fixed during the decoding process though?

For Wuffs' image decoding, it's flexible up until decode_image_config returns. It's fixed afterwards.

In practice, that works fine for BMP, GIF, JPEG and PNG (and I'm confident will work for WebP too). No in-situ dynamic allocation (with or without sugar) needed. Even though, for the libjpeg-turbo C code, `grep mem..alloc_ jd*.c | wc -l` shows more than 50 "allocation" call sites, Wuffs' JPEG decoder does just fine with just the one work buffer.

That's a neat trick with the index masking. I will have to try using Wuffs again.