by wicket 700 days ago
> Syd is a user space wrapper

This sentence does not give me confidence. A general purpose sandbox should be implemented in the kernel. The kernel syscall interface will always remain available regardless of any sandbox that has been implemented in user space.

EDIT (to answer a few of the replies here):

The description lacks details, so it's not entirely clear how Syd is intended to be used or who its target audience is. If this is intended for a security-conscious user to wrap their own executables, similar to Firejail, I guess this serves a purpose, but I would certainly hesitate to suggest that it is the "most sophisticated sandbox for Linux". We already have kernel-based SELinux, AppArmor, other LSMs and grsecurity, which can ensure that every single executable will run in a sandbox regardless of the experience of the end user, something a user space wrapper cannot achieve. It's not about whether or not it uses kernel facilities but about ensuring that absolutely everything will be sandboxed.

3 comments

With seccomp, you have syscall filtering. The bits that are left exposed are largely around mm and, depending on how the sandbox works, non-syscall APIs like io_uring. It essentially actuates kernel APIs to sandbox, and then redirects a number of APIs to use userspace reimplementations.

I'm not sure this is correct. I suspect this works on a similar principle to something like gVisor where, as I understand it, syscalls are redirected to another userspace program. In gVisor's case the kernel basically gets re-implemented in user space to provide the secure container-like implementation.

Also, as I recall, some of the kernel-based syscall sandboxing approaches have had a number of issues dealing with certain syscalls.

It's a bit non-obvious from that description, but Syd does in fact use kernel facilities to do the sandboxing. A sibling comment links to some better documentation (the syd man page) that explains what it uses: https://man.exherbolinux.org/syd.1.html

1) Seccomp, a BPF-based kernel filter for syscalls

2) Bind mounts inside a filesystem namespace to control what files are visible

3) Landlock - more path restriction type stuff and permission changes to paths that are local to the application being wrapped

4) seccomp-notify (used with ptrace to inspect the kinds of syscall arguments that BPF isn't allowed to access for security reasons, i.e. pointers)
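
To make the split between 1) and 4) a bit more concrete, here is a toy model in Python. It is only a sketch of the idea and not Syd's code: it never talks to the kernel, the syscall numbers are made up, the sandboxed process's memory is faked with a dict, and the names (`bpf_filter`, `supervisor`, `decide`) and the `/tmp/` policy are hypothetical. The point it tries to show is that the BPF side only ever sees integers, so anything behind a pointer (like a path) has to be handed off to a user-space supervisor that can actually look at the memory.

```python
"""Toy model: why a seccomp-BPF filter alone cannot vet path arguments.

This simulates the idea only; it does not call the kernel.
"""
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical syscall numbers for the model (not tied to any architecture).
SYS_GETPID = 1
SYS_OPEN = 2

ALLOW, DENY, NOTIFY = "allow", "deny", "notify"


@dataclass(frozen=True)
class Syscall:
    nr: int                 # syscall number
    args: Tuple[int, ...]   # raw argument values; a pointer is just an integer


def bpf_filter(call: Syscall) -> str:
    """Stand-in for a seccomp-BPF program: sees only numbers, never memory."""
    if call.nr == SYS_GETPID:
        return ALLOW
    if call.nr == SYS_OPEN:
        # args[0] is an address. The filter cannot read the string behind it,
        # so it defers the decision to a user-space supervisor.
        return NOTIFY
    return DENY


def supervisor(call: Syscall, memory: Dict[int, str], allowed_prefix: str) -> str:
    """Stand-in for the seccomp-notify side: looks up the path in the
    sandboxed process's memory (modelled as a dict) and applies a policy."""
    path = memory.get(call.args[0], "")
    return ALLOW if path.startswith(allowed_prefix) else DENY


def decide(call: Syscall, memory: Dict[int, str], allowed_prefix: str = "/tmp/") -> str:
    """Run the filter first; only consult the supervisor when asked to."""
    verdict = bpf_filter(call)
    if verdict == NOTIFY:
        verdict = supervisor(call, memory, allowed_prefix)
    return verdict


if __name__ == "__main__":
    memory = {0x1000: "/tmp/scratch.txt", 0x2000: "/etc/shadow"}
    print(decide(Syscall(SYS_GETPID, ()), memory))        # allow
    print(decide(Syscall(SYS_OPEN, (0x1000,)), memory))   # allow
    print(decide(Syscall(SYS_OPEN, (0x2000,)), memory))   # deny
    print(decide(Syscall(99, ()), memory))                # deny
```

In the model the cheap decisions (allow `getpid`, deny anything unknown) never leave the filter, and only the pointer-carrying call pays for the round trip to the supervisor. That is roughly the trade-off the list above describes, with the bind mounts and Landlock from 2) and 3) acting as separate layers that this sketch does not cover.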