Hacker News new | ask | show | jobs
by comex 1299 days ago
Nice. Some questions:

- How does this compare with rr?

- Out of curiosity, have you ever looked at libTAS? (I realize it has a very different intended use case.)

- Have you had an issues with behavior differences between CPUs? I know there is architecture-level undefined behavior that can differ between CPUs; on the other hand, it sounds like you’re primarily interested in running well-behaved executables that wouldn’t intentionally invoke such behavior.

2 comments

- We've taken a lot of inspiration from RR and it is very good at what it does (record/replay + randomizing thread schedules). This project diverges a bit from RR in that we focus more on the concurrency testing application of determinism. I wrote the toy record/replay system that sits on top of the determinism layer (so that we don't have to record as much stuff), but it's a long way from being complete.

- I haven't seen libTAS until now, so I can't comment on it much. At first glance, it does look similar!

- See @rrnewton's reply on this one.

(1) rr [formerly Mozilla rr]

We're big fans of rr!

Hermit is different in that creating a deterministic OS semantics is different than recording whatever nondeterministic behavior occurs under normal Linux. BUT, there's a lot of overlap. And indeed `hermit record` is straight up RnR (record & replay).

But hermit for RnR but is not nearly as developed as rr. We integrate with gdb/lldb as an (RSP) debugger backend, just like rr. Any failing execution you can create with hermit, you can attach a debugger. But our support is very preliminary, and you'll probably find rough edges. Also, we don't support backwards stepping yet (except by running again).

If we invest more in using Hermit as a debugger (rather than for finding and analyzing concurrency bugs), then there should be some advantages over traditional RnR. These would relate to the fact that deterministically executing is different than recording. For example, process and thread IDs, and memory addresses all stay the same across multiple runs of the program, even as you begin adding printfs and modifying the program to fix the bug. With traditional RnR, you can play the same recording as many times as you like, but as soon as you take a second recording all bets are off wrt what is the same or different compared to the prior recording. (That includes losing the "mental state" of things like tids & memory addresses, which is a good point Robert O Callahan makes about the benefits of RnR when accessing the same recording multiple times.)

(2) libTAS - no we haven't! Checking it out now.

(3) Yes, definitely issues with CPU portability.

In general, we are interested in not just determinism on the same machine, but portability between machines in our fleet. As with any tech company that uses the cloud, at Meta people are usually trying to debug an issue on a different machine than where the problem occurred. I.e. taking a crash from a production or CI machine to a local dev machine.

The way we do this is that we mostly report a fairly old CPU to the guest, which disables certain features IF the guest is well behaved.

With the current processor tech, I don't think there's any way we can stop an adversarial program, which, for example, would execute CPUID, find that RDRAND is not supported on the processor, but then execute RDRAND anyway. We could build a much more invasive binary-instrumentation based emulator that would be able to enforce these kinds of rules at the instruction granularity, but it would have higher overhead, especially startup overhead. The nice thing about Reverie though is that we (or others) can add different instrumentation backends while keeping the same programming instrumentation API. So we could have a "hardened" backend that was more about sandboxing and reverse-engineering adversarial software, making a different tradeoff with respect to performance overhead.

Very cool to see the stuff you e been working on become public!