by protomikron 3501 days ago
This is a step in the right direction (in the sense that we should sandbox applications harder), but in my opinion we have to change fundamental aspects of our stack (e.g. Proprietary Firmware <=> Linux <=> GNU-System-Libs <=> X <=> GTK <=> Evince), to gain more security.

In particular I think it is harmful that all applications share the same view of the FS and have, in principle, the possibility to use e.g. the full set of Unix-ish capabilities. My bet is that the solution is via better type systems, e.g. an application that is a desktop game could have something like

  exec :: GameConfig -> WindowControl ()
where GameConfig is e.g. some CFG specific to the game and WindowControl is similar to IO () however limited to interacting with a drawing library (e.g. OpenGL) and input systems (keyboard and mouse local to the window).
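
For readers who prefer a more mainstream notation, here is a rough TypeScript sketch of the same shape. All names (`GameConfig`, `WindowControl`, `RecordingWindow`) are hypothetical and no real windowing API is used. Note that TypeScript, unlike the type system imagined above, cannot stop code from reaching for ambient globals, so this only illustrates the interface, not the enforcement.

```typescript
// Hypothetical types: nothing here is a real windowing or OpenGL API.
interface GameConfig {
  readonly title: string;
  readonly width: number;
  readonly height: number;
}

// The only authority the game receives: drawing and window-local input.
interface WindowControl {
  drawRect(x: number, y: number, w: number, h: number): void;
  pollKeys(): readonly string[];
}

// Rough counterpart of `exec :: GameConfig -> WindowControl ()`.
type Exec = (config: GameConfig, win: WindowControl) => void;

// A stand-in window that records draw calls instead of touching a GPU.
class RecordingWindow implements WindowControl {
  readonly calls: string[] = [];
  constructor(private readonly keys: readonly string[]) {}
  drawRect(x: number, y: number, w: number, h: number): void {
    this.calls.push(`rect ${x},${y} ${w}x${h}`);
  }
  pollKeys(): readonly string[] {
    return this.keys;
  }
}

// The game only uses what was passed in: no file system, no network.
const exec: Exec = (config, win) => {
  win.drawRect(0, 0, config.width, config.height); // background
  if (win.pollKeys().indexOf("Space") !== -1) {
    win.drawRect(10, 10, 5, 5); // player sprite
  }
};

const win = new RecordingWindow(["Space"]);
exec({ title: "demo", width: 320, height: 240 }, win);
```

After the call, `win.calls` holds the two recorded draw operations, which is everything the game was able to do.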

At the moment every application just implements `main()` and is good to go and we separate between kernel- and user-space (and a VM on top e.g. Android and Apple), and maybe this is too coarse.

I think pledge (http://man.openbsd.org/pledge) is also a step in the right direction; however, I would prefer it to be the other way around: an application goes through a setup process where it gains the capabilities it needs (with pledge you start with everything and ask to drop what you don't need).
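
A minimal sketch of the "gain during setup" direction, assuming a hypothetical `CapabilitySet` that starts empty and can only grow until it is sealed. The capability names are made up for illustration and are not pledge's promise strings.

```typescript
// Hypothetical capability names; pledge(2) uses its own promise strings.
type Capability = "stdio" | "read-files" | "network";

class CapabilitySet {
  private granted: Capability[] = [];
  private sealed = false;

  // Setup phase: the process starts with nothing and asks for what it needs.
  acquire(cap: Capability): void {
    if (this.sealed) {
      throw new Error(`setup is over, cannot acquire "${cap}"`);
    }
    if (this.granted.indexOf(cap) === -1) {
      this.granted.push(cap);
    }
  }

  // End of setup: from here on the set can no longer grow.
  seal(): void {
    this.sealed = true;
  }

  // Every privileged operation checks the set before running.
  require(cap: Capability): void {
    if (this.granted.indexOf(cap) === -1) {
      throw new Error(`capability "${cap}" was never acquired`);
    }
  }
}

const caps = new CapabilitySet();
caps.acquire("stdio");
caps.acquire("read-files");
caps.seal();

caps.require("read-files"); // fine: acquired during setup

let networkDenied = false;
try {
  caps.require("network"); // never acquired, so this throws
} catch {
  networkDenied = true;
}
```

The difference to pledge is only the default: anything the program forgets to ask for is denied, rather than anything it forgets to drop being allowed.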

6 comments

You might be interested in coeffects. Just like monads can be used in a language like Haskell to model effectful operations, you can use the dual of monads, comonads, to model the dual of effects, coeffects!

Coeffects can be used to represent the "context" of a program, which includes things like permissions or capabilities that the program may have access to. They provide a fascinating way of modeling all kinds of information that is traditionally not handled by even powerful type systems like OCaml's or Haskell's.

You can read a lot more about the topic on Tomas Petricek's website: http://tomasp.net/coeffects/

I especially recommend this short article from 2014: http://tomasp.net/blog/2014/why-coeffects-matter/

Thank you, that sounds very interesting.

> limited to interacting with a drawing library (e.g. OpenGL)

Isolation between applications running on the same graphics hardware is rather weak (GPUs don't have something like an MMU), so that exercise is left to the reader ^W driver getting a lot of stuff right. Most don't, or didn't. That's why e.g. Qubes doesn't allow sharing a graphics card among domains (well, and the fact that the drivers don't support that either), so an untrusted system can only get its own dedicated GPU, with no sensitive data ever going on the same hardware, and the DMA capabilities of the GPU are kept in check by the IOMMU of the CPU. The host only gets involved in blitting the framebuffer somewhere else for display.

I think you're looking for capability-based security. It can be done at OS and language levels to allow enforcing POLA pretty easily. Here's a page with intros plus deployment in web browser and GUI prototypes:

http://www.combex.com/tech/index.html

The most prominent language is E:

http://erights.org/index.html

Any solution which requires rewriting existing software is impractical. There's nothing wrong with an application having the illusion of full control of the system that sandboxing provides. I'd like to see an OS where all applications are run in a sandbox (e.g. LXC containers). Each application should have metadata which describes what special access it needs (e.g. Android's permissions) but the user is free to enable or disable these permissions at will. Applications can work as they were originally written because any access they expect to have that was denied by the user can be faked. For example, say an application requires access to a database containing your personal contacts. Instead of blocking the application's access completely and requiring the application to correctly handle the case when access was denied, the OS can provide a dummy contacts DB instead. The application then proceeds as normal without knowing access was denied. Firewalling in both directions, as SubgraphOS is doing, is also essential.
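
The dummy contacts DB idea can be sketched roughly like this. Every name here (`ContactsDb`, `RealContacts`, `DummyContacts`, `contactsFor`) is hypothetical; no real OS exposes exactly this interface.

```typescript
// Hypothetical names throughout; this is not a real OS or Android API.
interface Contact {
  readonly fullName: string;
  readonly phone: string;
}

interface ContactsDb {
  list(): Contact[];
}

// Backed by the user's real data.
class RealContacts implements ContactsDb {
  constructor(private readonly rows: Contact[]) {}
  list(): Contact[] {
    return this.rows.slice();
  }
}

// Same interface, but always empty: the app cannot tell it was denied.
class DummyContacts implements ContactsDb {
  list(): Contact[] {
    return [];
  }
}

// The sandbox, not the app, decides which implementation is handed over.
function contactsFor(userAllowed: boolean, real: ContactsDb): ContactsDb {
  return userAllowed ? real : new DummyContacts();
}

// Unmodified application code: it has no "access denied" branch at all.
function contactSummary(db: ContactsDb): string {
  return `You have ${db.list().length} contacts`;
}

const realDb = new RealContacts([{ fullName: "Ada", phone: "555-0100" }]);
const whenAllowed = contactSummary(contactsFor(true, realDb));
const whenDenied = contactSummary(contactsFor(false, realDb));
```

The application code path is identical in both cases; only the object behind the interface changes.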
> Any solution which requires rewriting existing software is impractical.

That is true, but there might be a point where we have languages, more secure OSes, frameworks or libraries with which writing new software is simpler than hardening existing software, and I think it will look similar to typed functional programming.

IMHO sandboxing is popular because OSes have failed - or maybe it was just out of scope, as we may have underestimated the big range of attack vectors - at isolating processes that may be "evil". Sandboxing at the moment is extremely low-level: if you are paranoid about security you probably sandbox a full Linux on top of some hardened kernel (seL4 and similar approaches - I am not even sure if seL4 is used in production), and that is far from trivial, as you have to bridge the "userspace-style" Linux to the HW, which Linux wants to talk to. If you want to secure Linux from the HW itself (or even provide a better alternative) you have to rewrite large parts of it (drivers), and they are mostly very specific to the Linux API.

Now less paranoid people can use something like Docker/Linux-Containers/... and maybe a combination of libraries and distributions (like Subgraph, etc.), but Docker's isolation security record is controversial. Sure, if set up correctly it is probably more secure than plain processes/JVMs (this is also controversial), but it just feels hacky and feels like an afterthought that might not be able to guarantee the security it advocates (I hope I am wrong here).

As a programmer you often know many constraints about your software that are (currently) extremely hard to communicate, so you skip stating these constraints and your software ends up with attack vectors that could have been avoided in the first place.

The well-developed, well-security-researched, well-deployed application platform you're looking for is the web. You get exactly this sort of setup if you use WebGL: you interact with an API that expects to be called by unprivileged hostile applications, instead of with a library that helps you access the graphics card driver directly. Every individual application lives in a separate protection domain (an HTTP origin), and communication between them is limited to message passing with the consent of both sites. The language itself avoids all assumptions of direct access to system resources.

Running everything in a web app is, admittedly, a fundamental change in the stack. But it's fortunately one where a lot of people have independently put work into making this happen. I do my most security-sensitive work on a Chromebook (using the SSH and mosh apps from the Chrome app store) for precisely this reason: it's the right security model, and it's available in my local computer store and works.

> You get exactly this sort of setup if you use WebGL

> I do my most security-sensitive work on a Chromebook

I would highly recommend you use a WebGL whitelist then. WebGL might have been designed with security in mind, but it is nevertheless a very thin wrapper around OpenGL drivers which were, I can assure you, not written with security in mind. WebGL allows some surprisingly direct ways of manipulating hardware and there are a million attack vectors lurking in every WebGL implementation/OpenGL driver combination.

That's a good point. What else should I whitelist other than WebGL? (Is there a general hardening guide for an off-the-shelf, un-jailbroken Chromebook?)

Video, audio. Complex binary formats that require high performance programming where often security has taken a back seat.

You are right, that is the most secure platform at the moment to distribute graphical user interface programs, but I think it should go further.

E.g. I would go so far as to say that it shouldn't be possible by default for the server to send me a huge HTML/CSS/JS blob that does all kinds of weird stuff (e.g. reporting to the host, mouse movement analysis, etc.).

I am probably in a minority with the following opinion, but I think a page shouldn't even have the ability to enforce a layout which in the end draws pixels on your screen. The web is a step forward and HTML is a good idea, but it is not used anymore in its intended form - it works very well for text distribution, but richer applications have to resort to JS.

Now if you disable JS you could in theory render it as you like, but this is far from trivial.

//edit:

Let's consider a bus company offering a search to find offers that get you from A to B (i.e. a route planner, trip finder, ...).

This app shouldn't ship you random HTML/JS, but just the information you need to query its database, which is simply some GETing and POSTing of specified requests. When connecting to the app (going to https://trip-search.example.com) the host could disclose itself as an application having type `(From, Date, To, Date) -> Maybe TripList` or something like that (I think one gets the idea).
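
To make the idea a bit more concrete, here is a sketch of what such a declared type could look like, with the rendering done entirely on the client. The field names and the reading of the two dates as an earliest/latest travel window are my assumptions; the in-memory `timetable` stands in for the company's server.

```typescript
// Hypothetical types for the bus company example; not a real API.
// Assumption: the two Date fields bound the travel window.
interface TripQuery {
  readonly from: string;
  readonly notBefore: string; // ISO date, e.g. "2016-09-01"
  readonly to: string;
  readonly notAfter: string; // ISO date
}

interface Trip {
  readonly from: string;
  readonly to: string;
  readonly date: string; // ISO date
  readonly priceCents: number;
}

// `(From, Date, To, Date) -> Maybe TripList`: a list of trips or nothing.
type TripSearch = (query: TripQuery) => Trip[] | null;

// Stand-in for the server's database.
const timetable: Trip[] = [
  { from: "A", to: "B", date: "2016-09-01", priceCents: 1200 },
  { from: "A", to: "B", date: "2016-09-03", priceCents: 900 },
  { from: "B", to: "C", date: "2016-09-02", priceCents: 1500 },
];

// ISO dates compare correctly as plain strings.
const tripSearch: TripSearch = (query) => {
  const hits = timetable.filter(
    (t) =>
      t.from === query.from &&
      t.to === query.to &&
      t.date >= query.notBefore &&
      t.date <= query.notAfter
  );
  return hits.length > 0 ? hits : null;
};

// Rendering is the client's business; the server only supplies data.
function renderTrips(result: Trip[] | null): string {
  if (result === null) {
    return "no trips found";
  }
  return result
    .map((t) => `${t.date} ${t.from} -> ${t.to} (${(t.priceCents / 100).toFixed(2)})`)
    .join("\n");
}

const found = renderTrips(
  tripSearch({ from: "A", notBefore: "2016-09-01", to: "B", notAfter: "2016-09-02" })
);
const missing = renderTrips(
  tripSearch({ from: "C", notBefore: "2016-09-01", to: "A", notAfter: "2016-09-30" })
);
```

Nothing the server sends is executed; the client decides how a `Trip[]` ends up as pixels.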

The web is great, but I think security should and must go further; I do not want to run random Turing machines.

> The web is great, but I think security should and must go further, I do not want run random Turing machines.

Exactly. I want a document to read, not an application to execute. Sadly that battle is feeling more and more lost as time goes by.

I'm not sure I get why enforcing a layout is a problem from the point of view of application distribution - if nothing else, an app should be able to embed a text renderer and draw onto a <canvas> itself. (It's probably a terrible idea, but it should be able to, because a text renderer is just a program that takes in data and outputs some pixels, and that class of programs is useful.)

I do certainly agree that we need a way of distributing hyper-text content efficiently and in a standard way. Unfortunately the web seems to be moving away from that goal, and AMP isn't quite right and has its own problems.

I'm not sure how I feel about permissions by default. I think permission fatigue is definitely a thing, and for most apps I don't actually care about them exfiltrating mouse movements to the host, as long as they can only exfiltrate it to the one host. On the other hand, I'm a little weirded out that if I plug my piano into my Chromebook, JavaScript can receive and send MIDI events without any permission prompt.

EDIT to your edit: I'm totally okay with running random Turing machines, if their execution environment is constrained (which it is). The only resources that an arbitrary Turing-complete programming language can access are any external resources that it's specifically given an interface to, and time/memory/power consumption. The web platform is pretty good (though, yes, not perfect) at locking down the interfaces given to JS. So it's just a matter of resource limits, which is fairly easy; I'm not always thrilled with how much CPU and battery life Twitter takes, for instance, but it's always killable. (Again, in theory.)

You can construct something that's capable of using plenty of memory or power out of any sufficiently powerful Turing-incomplete language. See, for instance, CSS. (I bet with the mechanism you're proposing, you can end up chaining server-side APIs in ways that let you DoS the client, because the server is always more powerful.) And given how easy it is to achieve Turing-completeness by mistake, it doesn't seem like a productive constraint.

> I'm not sure I get why enforcing a layout is a problem from the point of view of application distribution - if nothing else, an app should be able to embed a text renderer and draw onto a <canvas> itself.

Yeah, but in my opinion that is already a specific type of application, like e.g. a computer game, PDF viewer, plotting application.

It is totally different from e.g. an application like Wikipedia or a news page, that provides mostly text and images.

In the end there should just be more of the functionality on the client side (rules for how to render news pages, how to render Wikipedia, etc.).

Serious question - what's the difference between that, and running all apps in their own chroot jails?

It seems like the goal of this app is to isolate things from the network, and from each other. A web app or Chromebook method isolates from other apps, ok, fine, but not from the web. Seems more like a jail in that sense.

Maybe I'm just misunderstanding.

That's a good question! The simple answer is that the web is about whitelisting, whereas a chroot jail is about blacklisting, and blacklisting never works. (Whitelisting, to be clear, also has no guarantee of working, but at least it's possible for it to work.)

When you jail a UNIX process, you start from a model that gives you full access to everything, and gradually revoke access until you're convinced it's secure. There are all sorts of things you might overlook. For instance, if it's just a chroot, there's no network isolation; an app can connect to a server listening on localhost, and it looks like it's coming from localhost. It can connect to a server on the local network, and it looks like it's coming from the host (which is bad if you have, e.g., a corporate network that lets you access interesting data without logging in, or a home router with a default admin password, or many similar cases).

And if you introduce a new mechanism, the chroot probably gives you access to it. For instance, if the chrooted app is able to access my X11 session, it has a ton of powers; it can keylog everything I do, for instance. Even if I mark it "untrusted" a la ssh -X, it has complete powers over everything else that's "untrusted". You could imagine an X11 designed differently, but X11 was designed for trusted apps. Another important case is system calls; a chrooted process has access to every system call, including every vulnerability that might be present. (On some OSes you can restrict what system calls the process can run, but it's still pretty coarse-grained.)
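
A toy illustration of why the default matters when a new mechanism shows up. The operation names are invented for the example; the point is only what each policy answers for an operation nobody considered when the policy was written.

```typescript
type Operation = string;

// Blacklist (jail) model: everything is allowed unless someone thought to deny it.
function denylistAllows(denied: Operation[], op: Operation): boolean {
  return denied.indexOf(op) === -1;
}

// Whitelist (web) model: nothing is allowed unless someone explicitly added it.
function allowlistAllows(allowed: Operation[], op: Operation): boolean {
  return allowed.indexOf(op) !== -1;
}

// Hypothetical policies, written before the new mechanism existed.
const jailDenied: Operation[] = ["write-outside-root", "mount"];
const webAllowed: Operation[] = ["render-text", "show-image", "play-audio"];

// A mechanism introduced after both policies were written.
const newMechanism: Operation = "read-keystrokes-via-display-server";

const jailResult = denylistAllows(jailDenied, newMechanism); // true: slips through
const webResult = allowlistAllows(webAllowed, newMechanism); // false: denied by default
```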

The web starts from the ability to render formatted text with links, which is very close to zero. Everything else is—at least in theory—added from there when safe. Images are safe. Playing audio is pretty safe. Recording audio is probably not safe without permission. (A typical desktop API won't have an easy way to allow one but not the other, and certainly won't have a permission prompt.) Rendering graphics is fine. Rendering 3D graphics is potentially fine, hence WebGL. Rendering graphics on top of someone else's tab is a definite no. Moving your window around or removing its borders is also a definite no. Becoming full-screen requires notifying the user of what just happened. (Again, a typical desktop API won't distinguish these cases and won't give you an easy way to exit full-screen.)

In particular, the web does restrict an app's ability to access the web. An app can freely access its own origin, but it cannot freely access other sites. If http://wiki.internal/ has sensitive data that doesn't require login, a site on the public web cannot retrieve data from there, without the consent of that site. (And the web has already implemented a pretty robust and involved way of handling cross-origin resource sharing.)
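
As a heavily simplified model of that rule (this is not the actual CORS algorithm, and `ConsentTable` and `mayRead` are hypothetical names): a page may read a response if it shares the target's origin, or if the target has opted in to the page's origin.

```typescript
// Simplified model only; real browsers implement far more than this.
function originOf(url: string): string {
  // URL.origin is scheme + host + port, e.g. "http://wiki.internal".
  return new URL(url).origin;
}

// Hypothetical record: target origin -> origins it has agreed to share with.
type ConsentTable = Record<string, string[]>;

function mayRead(pageUrl: string, targetUrl: string, consent: ConsentTable): boolean {
  const page = originOf(pageUrl);
  const target = originOf(targetUrl);
  if (page === target) {
    return true; // same origin: always readable
  }
  const allowed = consent[target] || [];
  return allowed.indexOf(page) !== -1; // cross-origin: only with the target's consent
}

const consent: ConsentTable = {
  "http://wiki.internal": ["https://intranet.example"],
};

const publicSiteCanRead = mayRead(
  "https://evil.example/page",
  "http://wiki.internal/secrets",
  consent
); // false
const trustedSiteCanRead = mayRead(
  "https://intranet.example/dash",
  "http://wiki.internal/secrets",
  consent
); // true
const samePageCanRead = mayRead(
  "http://wiki.internal/home",
  "http://wiki.internal/secrets",
  consent
); // true
```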

If you stick all these things into a desktop API, fantastic! But the web platform is already there, with a number of competing implementations that are all pretty good.

You might be interested in object-capability model[0] systems. It comes from the idea that in most memory-safe languages, before you can call a function or a method on an object, you first need to have a reference to it passed to you. You can easily determine what code operates on an object by looking at where the object is passed. Now imagine if all types of IO interactions followed a similar system.

Right now, most languages have "ambient authorities", references with imbued authority (IO capabilities, etc) that can be obtained by any code anywhere in the program. In nodejs, any code can use the globally-available `require('fs')` call to get a reference to the filesystem module and then use it to make changes to the filesystem freely; the filesystem module is an ambient authority.

In a hypothetical object-capability version of nodejs, `require('fs')` would be invalid, and instead the application could have a single entry-point main function which receives the filesystem module as one of its parameters. In order to use functions that need to use the filesystem module, the main function would have to pass a reference to the filesystem module, or even a different object that follows the same interface. If it's known the function-to-be-called should only need to read files, then the function could be passed a wrapped version of the filesystem module that has all of its writing methods stubbed out for ones that throw errors instead. You can easily sandbox applications on a very granular level by passing them the minimum number of IO authority-imbued objects, and it's easy to review the security of code by looking at where IO objects are passed around.
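
Here is a small sketch of that hypothetical style. To keep it self-contained it uses an invented in-memory `FileStore` interface rather than Node's real `fs` module; `makeMemoryStore`, `readOnly` and `countLines` are likewise made-up names.

```typescript
// Hypothetical in-memory stand-in for a file system module (not Node's `fs`).
interface FileStore {
  read(path: string): string;
  write(path: string, data: string): void;
}

function makeMemoryStore(initial: Record<string, string>): FileStore {
  const files: Record<string, string> = { ...initial };
  return {
    read(path) {
      const data = Object.prototype.hasOwnProperty.call(files, path)
        ? files[path]
        : undefined;
      if (data === undefined) {
        throw new Error(`no such file: ${path}`);
      }
      return data;
    },
    write(path, data) {
      files[path] = data;
    },
  };
}

// Same interface, but every write throws: a read-only capability.
function readOnly(store: FileStore): FileStore {
  return {
    read: (path) => store.read(path),
    write: () => {
      throw new Error("write denied: read-only capability");
    },
  };
}

// Only needs to read, and can use nothing but the `store` it is handed.
function countLines(store: FileStore, path: string): number {
  return store.read(path).split("\n").length;
}

// Single entry point: all authority arrives as parameters.
function main(store: FileStore): number {
  return countLines(readOnly(store), "notes.txt");
}

const lineCount = main(makeMemoryStore({ "notes.txt": "a\nb\nc" }));
```

Reviewing what `countLines` can do comes down to reading its parameter list and checking what `main` passes in.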

Currently I think Haskell (with unsafe code disabled) is the closest thing to an object-capability language that's popular right now. Some of the terminology doesn't match up -- code doesn't get a reference to an IO monad to do IO, instead it must return an IO monad which gets mixed into the IO monad returned by the main function to take effect -- but I think many benefits come out about the same. There are no ambient authorities. You can follow the control flow to isolate the parts of the code that do IO. I'm not sure if it's possible in Haskell to do the equivalent of passing a restricted capability so easily; can you call an IO-monad-returning function (that was written without any sandboxing in mind) in a way that it's not allowed to write files?

There are existing popular capability systems, but they're not as full as object-capability systems. They have object-capability qualities, but only at the edges. A Linux process can ask the OS to open a file and get a file handle, it can start a child process as another user or sandbox it in other ways so that the child is restricted from opening files itself, the parent can pass an individual file handle to the restricted child, etc. But outside of that specific file handling, the code of those processes isn't necessarily written in a very capability-style way. The child process may be written in C, it could put the file handle into a global variable, and any function inside its code could refer to that global variable. In an object-capability language, everything about the program's code follows the authority-comes-from-given-references object-capability style.

[0] https://en.wikipedia.org/wiki/Object-capability_model

We posted the same solution at about the same time! Haha. Yeah, this stuff is pretty easy with capabilities. They can be extended further with a high-level systems language. I'd like to see something like SPARK or Idris with capability-security built in, along the lines of the E language. Just something that isn't built on Java.

This is also interesting. I expect there is some connection between this and the mentioned effect systems, at least their goals seem to overlap.