| HN Mirror

	by surfer7837 1692 days ago
	How can you protect yourself from file upload threats? It's basically the worst possible threat model -- executing complex user input that conforms to a spec that was written 20 years ago by some proprietary company with no security. Executing everything in an isolated container with no permissions? An audit trail, good logging, etc.? If someone comes up with an RCE you're basically done for; you can only mitigate it, not completely stop it.

7 comments

JoshTriplett 1692 days ago

If you have to process it at all, do it in a WebAssembly sandbox on the server. Or, alternatively, in a seccomp-secured sandbox that isn't allowed to make any system calls whatsoever, just read data from one file descriptor and write processed data to another.
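
The "bytes in on one descriptor, bytes out on another" shape from the comment above can be sketched in Python. This shows only the plumbing around an untrusted parser: it does not install a seccomp filter or use WebAssembly, so the child would still need one of those for real isolation. `run_isolated_filter` and its parameters are hypothetical names, and the CPU limit is an extra assumption, not something the comment asks for.

```python
import resource
import subprocess
import sys


def run_isolated_filter(argv, data, cpu_seconds=5, timeout=10):
    """Feed `data` to a child on stdin and return what it writes to stdout."""

    def limit_child():
        # Runs in the child just before exec: cap CPU time so a hostile
        # input cannot keep the parser spinning forever.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    result = subprocess.run(
        argv,
        input=data,                 # untrusted bytes go in on fd 0
        stdout=subprocess.PIPE,     # processed bytes come back on fd 1
        stderr=subprocess.DEVNULL,
        close_fds=True,             # nothing inherited beyond fds 0, 1, 2
        preexec_fn=limit_child,
        timeout=timeout,
        check=True,                 # a crashing parser raises instead of returning
    )
    return result.stdout
```

The parent never parses the upload itself; it only moves bytes, so a bug in the parser stays in the child process (POSIX only, since it relies on `resource` and `preexec_fn`).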

Someone1234 1692 days ago

I've seen companies use Headless Chrome and then WebAssembly to process files. You then lock down the Headless Chrome process. You're then "triple covered": WebAssembly's limited context, the JavaScript engine's limited context, and the Chrome process boundary itself.

This is obviously "expensive" though. Doesn't scale very well.

magicalhippo 1692 days ago

> This is obviously "expensive" though. Doesn't scale very well.

Unlike this issue then, going by the 1Tbps attack it's reportedly causing...

wheresmycraisin 1692 days ago

... why WebAssembly?

JeremyNT 1691 days ago

Yeah, I don't see the value here either. You don't need wasm or chrome or any of that stuff.

Linux itself has several features that can be used to isolate processes, and there are user-friendly tools like bwrap [0] that make configuration easy.

It should be entirely possible to sandbox something like ExifTool itself such that it has no network access and is limited to reading and writing files in a particular directory.

[0] https://wiki.archlinux.org/title/Bubblewrap
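
A rough sketch of what such a bwrap invocation might look like, built as an argument list in Python. The helper name `bwrap_argv`, the `/work` mount point, and the merged-`/usr` filesystem layout are all assumptions for illustration; a real setup would need to be adjusted to the host distribution.

```python
def bwrap_argv(work_dir, command):
    """Build a bwrap command line: no network, one writable directory."""
    return [
        "bwrap",
        "--unshare-all",                    # new namespaces, network included
        "--die-with-parent",                # kill the sandbox if we exit
        "--new-session",                    # detach from the controlling terminal
        "--ro-bind", "/usr", "/usr",        # read-only system files (assumed layout)
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--bind", work_dir, "/work",        # the only writable host directory
        "--chdir", "/work",
        *command,
    ]
```

For example, `subprocess.run(bwrap_argv("/srv/uploads/job1", ["exiftool", "-all=", "in.jpg"]), check=True)` would run ExifTool with only that one directory writable, assuming bwrap and ExifTool are installed under `/usr`.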

JoshTriplett 1690 days ago

Several reasons:

- It's a separate interface with a different attack surface than your system, so compared to a locked-down version of the normal syscall API, it provides better defense-in-depth.

- It's designed to be a fully self-contained sandbox, by default. If you're locking down everything but reading and writing previously opened file descriptors, you can build a secure sandbox atop syscalls fairly easily. If you need more nuance than that, WebAssembly seems more likely to remain secure, while syscall sandboxes seem more likely to fail-insecure if you get a detail wrong.

- It seems easier to sandbox otherwise-unmodified code that way. If you have code that needs some access to system resources, I think WebAssembly makes it easier to give it just what it needs and nothing else.

(Also, note that I'm not talking about running in a browser; I'm talking about standalone WebAssembly runtimes like wasmtime.)

stefan_ 1692 days ago

The first step is always "don't do it at all". Here is the original commit:

https://gitlab.com/gitlab-org/gitlab-workhorse/-/commit/8656...

It's hard to find a linked detailed requirement for this. I would certainly prefer if GitLab didn't silently mangle uploaded images (not least if I'm working on an EXIF library..).

Bonus points for a commit that includes the words "perl" and "exec" not also having a detailed security review attached.

armchairhacker 1692 days ago

This seems like a great use case for formal methods: for example, EXIF removers that are formally verified not to crash and to successfully remove the identifying data.

These types of programs are relatively simple, and this is a case where a formal proof is much better than reliability.

Is anyone aware of research on this?

SahAssar 1692 days ago

The most straightforward answer is to not process the upload at all and treat it as a binary blob. As for serving it as an image etc. on your site, have a strict CSP and turn off MIME sniffing (and don't allow SVG uploads as images).
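
As a minimal sketch of the serving side, the headers might be assembled like this. The function name, the allow-list, and the exact CSP string are assumptions; they show one possible strict policy, not a recommendation from the comment above.

```python
# Image types accepted for inline display; SVG is deliberately absent
# because it can carry script.
ALLOWED_IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def upload_response_headers(stored_content_type):
    """Headers for serving an uploaded blob back without reinterpreting it."""
    if stored_content_type not in ALLOWED_IMAGE_TYPES:
        raise ValueError(f"refusing to serve {stored_content_type!r} as an image")
    return {
        "Content-Type": stored_content_type,
        # Tell the browser not to guess a different type from the bytes.
        "X-Content-Type-Options": "nosniff",
        # If the blob is ever rendered as a document, it may load nothing.
        "Content-Security-Policy": "default-src 'none'; sandbox",
    }
```

The content type here is whatever the server decided at upload time, never a value taken from the file's own contents.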

marcosdumay 1692 days ago

You know, if you do it in a pure Haskell function, you can be assured that the worst it can do is use so many resources that its own process gets killed. If you do it in a Rust function, well, you have no formal guarantees, but you have to go really out of your way to put a vulnerability like that in the code.

What you don't do is pull in an ages-old Perl codebase to run over complex formats.

tialaramex 1692 days ago

If you must Wrangle Untrusted File Formats you should do so Safely:

https://github.com/google/wuffs

baggy_trough 1691 days ago

I do it inside a systemd nspawn container with a volatile file system, no network, minimal caps.
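
A setup along these lines could be sketched as the following argument list. `nspawn_argv`, the root filesystem path, and the particular capabilities dropped are assumptions for illustration; which capabilities count as "minimal" depends on what the tool inside actually needs.

```python
def nspawn_argv(rootfs, command, drop_caps=("CAP_SYS_ADMIN", "CAP_NET_RAW")):
    """Build a systemd-nspawn command line for a throwaway container."""
    return [
        "systemd-nspawn",
        "--quiet",
        f"--directory={rootfs}",
        "--volatile=yes",                   # root is a tmpfs; changes vanish on exit
        "--private-network",                # loopback only, no host network
        f"--drop-capability={','.join(drop_caps)}",
        *command,
    ]
```

For instance, `nspawn_argv("/var/lib/machines/exif", ["exiftool", "-all=", "/tmp/in.jpg"])` produces a command that would have to be run with enough privileges to start a container.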
