| Sorry, just an additional question as this project is so interesting: Does Hermit support running different processes in parallel in certain situations? It seems like this should be possible to do as long as they are appropriately isolated from each other. Specifically, if two processes don't share any writable memory mappings then I'm thinking it might be possible for Hermit to exploit the underlying existing isolation between them so that they could run code in-between system calls in parallel (while still making sure that the actual system calls happen in a deterministic order). Perhaps it would even be possible for the two processes to execute system calls in parallel if they are unrelated and they do not affect deterministic execution (such as if they are doing I/O to different files or directories, or one process doing I/O while another is doing something completely unrelated like getting the time, etc). Although I guess this latter point (of executing syscalls in parallel) would require doing rewind/replay because it wouldn't be possible to know in advance whether the system calls are going to be related or not, as the two processes might not do the syscalls at exactly the same time. Is Hermit doing this (executing code in-between syscalls in parallel, at least)? Or do you think it would be possible to do it? This could significantly increase the usefulness of Hermit for distros that want to do deterministic builds of packages, as most compilation could happen in parallel / i.e. use several cores at the same time! I'm thinking about huge package builds such as Chromium, Firefox, the Linux kernel, etc. which would perhaps take days to build if they were completely serialized into a single CPU core. |
Our earlier prototype (dettrace) allowed syscall-free regions in separate processes to run physically in parallel. Hermit hasn't added any process parallelism yet, but it's designed to go further than dettrace in this respect.
Specifically, Hermit is architected so that the thread-local syscall handler "checks out" resources from the central scheduler. Resources include things like paths on the file system, contents of files, shared memory, and permission to perform external side effects. Right now, every request waits for the scheduler to background all other threads and make the current thread the only runnable one. But the idea is: in the future we will keep the semantically identical log of linear "commits" (the linearization), but simply background the current thread while it uses the resources it checked out and move forward to the next scheduler iteration. The next runnable thread then blocks only until its requested resources are freed, not until ALL other threads have finished their timeslice and gone back to waiting on the scheduler.
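To make the difference concrete, here is a toy model of the two policies in logical time. This is only a sketch under my own assumptions, not Hermit's code or API: `Request`, `serialized_schedule`, `checkout_schedule`, the resource names, and the durations are all hypothetical, and the model assumes a request never starts before the request committed just ahead of it. Both functions consume the same linear commit order; only the blocking rule differs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One check-out request, listed in linear commit order (hypothetical type)."""
    thread: int
    resources: frozenset  # e.g. file paths or "external_side_effect"
    duration: int         # logical time the resources are held


def serialized_schedule(requests):
    """Current policy as described above: one request runs at a time."""
    clock = 0
    spans = []
    for req in requests:
        spans.append((clock, clock + req.duration))
        clock += req.duration
    return spans


def checkout_schedule(requests):
    """Future policy as described above: block only on the requested resources."""
    free_at = {}         # resource -> logical time at which it is released
    thread_free_at = {}  # thread -> logical time at which its last request ends
    commit_clock = 0     # assumption: never start before the previous commit
    spans = []
    for req in requests:
        waits = [commit_clock, thread_free_at.get(req.thread, 0)]
        waits += [free_at.get(r, 0) for r in req.resources]
        start = max(waits)
        end = start + req.duration
        for r in req.resources:
            free_at[r] = end
        thread_free_at[req.thread] = end
        commit_clock = start
        spans.append((start, end))
    return spans


# Hypothetical workload: threads 1 and 3 contend for the same file,
# thread 2 touches unrelated resources.
REQUESTS = [
    Request(1, frozenset({"/src/a.c"}), 5),
    Request(2, frozenset({"/src/b.c"}), 3),
    Request(3, frozenset({"/src/a.c"}), 2),
    Request(2, frozenset({"external_side_effect"}), 1),
]

if __name__ == "__main__":
    print(serialized_schedule(REQUESTS))  # [(0, 5), (5, 8), (8, 10), (10, 11)]
    print(checkout_schedule(REQUESTS))    # [(0, 5), (0, 3), (5, 7), (5, 6)]
```

In this toy workload the commit order is identical under both policies, but the last request finishes at logical time 7 instead of 11, because only the third request (which needs `/src/a.c`) actually has to wait for the first one.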