by compsciphd 793 days ago
We built it almost 2 decades ago now (including the ability to scrub to a point in the past and resume execution from there)

http://www.cs.columbia.edu/~orenl/papers/sosp07-dejaview.pdf

Abstract: As users interact with the world and their peers through their computers, it is becoming important to archive and later search the information that they have viewed. We present DejaView, a personal virtual computer recorder that provides a complete record of a desktop computing experience that a user can playback, browse, search, and revive seamlessly. DejaView records visual output, checkpoints corresponding application and file system state, and captures displayed text with contextual information to index the record. A user can then browse and search the record for any visual information that has been displayed on the desktop, and revive and interact with the desktop computing state corresponding to any point in the record. DejaView combines display, operating system, and file system virtualization to provide its functionality transparently without any modifications to applications, window systems, or operating system kernels. We have implemented DejaView and evaluated its performance on real-world desktop applications. Our results demonstrate that DejaView can provide continuous low-overhead recording without any user noticeable performance degradation, and allows browsing, search and playback of records fast enough for interactive use.


Did you actually build it or just write a paper? Where can I download it?
Previously commented with some other details.

TL;DR: It's an old PhD project that requires custom patches to an ancient kernel version. So I guess you can't download it, and even if you could it wouldn't work on any system you'd want to use today:

> compsciphd on Oct 11, 2022 | on: Linux NILFS file system: automatic continuous snap...

> we used NILFS 15 years ago in dejaview - https://www.cs.columbia.edu/~nieh/pubs/sosp2007_dejaview.pdf

> We combined nilfs + our process snapshotting tech (we tried to mainline it, but it didn't go anywhere, though many of the concepts ended up in CRIU) + our remote display + screen reading tech (i.e. normal APIs) to create an environment that could record everything you ever saw visually and textually, enable you to search it, and enable you to recreate the state as it was at that time with no noticeable interruption to the user (process downtime was like 0.02s).

https://news.ycombinator.com/item?id=33165519

> compsciphd on Oct 12, 2022

> sadly (as with much work from PhD students like I was), the closest one could get to it today is trying to duplicate it, i.e. combining criu with nilfs (but a lot of the work that we did to get process downtime to minimal numbers requires being in kernel, as described in the paper, and I'm unsure criu can do it).

> In addition our screenrecording mechanism was our own "proprietary" (not really proprietary as fully described in research papers, but also not a standard) and something that was built as an X display driver 15 years ago (so not directly usable today even if code is available). Could probably duplicate it with vnc based screencasting. vnc didn't work for us as we needed better performance (i.e. it was built to demonstrate remote display of video and games and there was no real remote audio setup back then so we had to create our own).

> the "text" search just used gnome's accessible API much like a screenreader would do (with a bit of per application optimizations as can filter out things like menus and the like, primarily was to dump text out of terminals, firefox and perhaps open office and maybe even a pdf reader if memory serves me correctly, but a long time ago).

I've been looking into it myself though, mostly for forensic concerns (capturing state/evidence in a way that's harder to forge than screenshots). Hopefully, running the desktop environment through VNC and then running `criu dump --leave-stopped --prev-images-dir …` immediately followed by `mkcp --snapshot …` (or `btrfs subvolume snapshot …` would also work, I guess) would be enough for basic functionality.
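A minimal sketch of that checkpoint-then-snapshot sequence, written as a Python helper that only builds the two command lines and does not run them. The `criu dump` flags `--leave-stopped` and `--prev-images-dir` and `mkcp --snapshot` come from the comment above; `--tree` and `--images-dir` are added here because `criu dump` needs a target process and an output directory. The function name, PID, paths, and device are hypothetical, and whether this ordering is enough for a consistent capture is untested.

```python
import shlex


def checkpoint_commands(pid, images_dir, prev_images_dir=None, nilfs_device=None):
    """Build (but do not run) the commands for one checkpoint round.

    pid             -- root of the process tree to dump (hypothetical value)
    images_dir      -- directory where CRIU writes this dump's image files
    prev_images_dir -- earlier dump to build on; as far as I know CRIU
                       resolves this path relative to images_dir
    nilfs_device    -- NILFS device to snapshot; omitted if None
    """
    criu = [
        "criu", "dump",
        "--tree", str(pid),
        "--images-dir", images_dir,
        "--leave-stopped",  # keep the tasks frozen until the snapshot exists
    ]
    if prev_images_dir is not None:
        criu += ["--prev-images-dir", prev_images_dir]

    snapshot = ["mkcp", "--snapshot"]
    if nilfs_device is not None:
        snapshot.append(nilfs_device)

    # Order matters: dump the processes first, then snapshot the file system.
    return [criu, snapshot]


if __name__ == "__main__":
    for cmd in checkpoint_commands(1234, "/var/ckpt/2", "../1", "/dev/sdb1"):
        print(shlex.join(cmd))
```

Returning argument lists rather than shell strings keeps quoting out of the picture if the commands are later passed to `subprocess.run`. Resuming the stopped processes after the snapshot is left out of the sketch.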

> vnc didn't work for us as we needed better performance

15 years is a long time. Performance-wise, VNC would be adequate now, and we're at the point where OCRing a render is probably fast enough for most use cases (it's actually the most reasonable choice a lot of the time, e.g. PDF parsing).

(But how's VNC support looking on Wayland based window managers?)

VNC is available for pretty much any Wayland window manager, but your distro may opt to build the more modern RDP by default instead (e.g. Ubuntu Gnome).
> I've been looking into it myself though, mostly for forensic concerns (capturing state/evidence in a way that's harder to forge than screenshots).

For what kinds of surveillance? Why is the forgery-proofing important for the playbacks?

One thing I'd note: we didn't patch the kernel source. Everything we did was through the module interface, though we did abuse it a bit, and a lot of that abuse was to provide our home-grown cgroup/namespace-like functionality that wasn't around when our checkpoint/restart work started. But it is fair to say that, because of that abuse, it was fairly tied to a specific set of kernels.

Another project I created on the forensic side (Steve Bellovin asked the question and I was like, yeah, I know exactly how to build that) that might interest you was something we called ISE-T (I See Everything Twice - Catch 22).

https://academiccommons.columbia.edu/doi/10.7916/D8HQ45MK

Two-Person Control Administration: Preventing Administration Faults through Duplication

Modern computing systems are complex and difficult to administer, making them more prone to system administration faults. Faults can occur simply due to mistakes in the process of administering a complex system. These mistakes can make the system insecure or unavailable. Faults can also occur due to a malicious act of the system administrator. Systems provide little protection against system administrators who install a backdoor or otherwise hide their actions. To prevent these types of system administration faults, we created ISE-T (I See Everything Twice), a system that applies the two-person control model to system administration. ISE-T requires two separate system administrators to perform each administration task. ISE-T then compares the results of the two administrators’ actions for equivalence. ISE-T only applies the results of the actions to the real system if they are equivalent. This provides a higher level of assurance that administration tasks are completed in a manner that will not introduce faults into the system. While the two-person control model is expensive, it is a natural fit for many financial, government, and military systems that require higher levels of assurance. We implemented a prototype ISE-T system for Linux using virtual machines and a unioning file system. Using this system, we conducted a real user study to test its ability to capture changes performed by separate system administrators and compare them for equivalence. Our results show that ISE-T is effective at determining equivalence for many common administration tasks, even when administrators perform those tasks in different ways.

I should note that the paper also discusses that 2 people might be expensive, so the same mechanism can be used by a single admin but in a manner that maintains an audit trail.

The above project wouldn't require any kernel modifications, as the work was all about using unionfs (through the normal VFS loadable module interface hooks) to capture changes and user-space tools to log and compare them.
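The compare-then-apply rule from the ISE-T abstract above can be sketched roughly as follows. This is not the ISE-T implementation: it assumes each administrator's captured changes are available as a hypothetical mapping from path to new file content (`None` meaning the file was deleted), and it treats two change sets as equivalent only when they match byte for byte, which is stricter than the equivalence the paper describes.

```python
import hashlib


def digest_changes(changes):
    """Map each changed path to a SHA-256 digest of its new content.

    A value of None marks a deleted file and is kept as None.
    """
    return {
        path: None if data is None else hashlib.sha256(data).hexdigest()
        for path, data in changes.items()
    }


def equivalent(changes_a, changes_b):
    """True if both admins touched the same paths with the same results."""
    return digest_changes(changes_a) == digest_changes(changes_b)


def apply_if_equivalent(system, changes_a, changes_b):
    """Return (new_state, applied).

    The changes reach the "real" system (modelled here as a dict) only if
    the two change sets agree; otherwise the state is returned unchanged.
    """
    if not equivalent(changes_a, changes_b):
        return system, False
    updated = dict(system)
    for path, data in changes_a.items():
        if data is None:
            updated.pop(path, None)  # deletion
        else:
            updated[path] = data     # creation or modification
    return updated, True


if __name__ == "__main__":
    base = {"/etc/motd": b"hello\n"}
    admin_a = {"/etc/ssh/sshd_config": b"PermitRootLogin no\n", "/etc/motd": None}
    admin_b = {"/etc/motd": None, "/etc/ssh/sshd_config": b"PermitRootLogin no\n"}
    print(apply_if_equivalent(base, admin_a, admin_b))
```

Comparing end results rather than the commands each admin typed is what lets two people reach the same state by different routes and still be judged equivalent.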

All this work led to what can be viewed as a proto-docker - https://www.usenix.org/legacy/events/atc10/tech/full_papers/... and https://www.usenix.org/legacy/events/lisa11/tech/full_papers...

Is the url correct? I get a file not found when I try to open it.
Ah, gotta love link rot.

This one works : https://www.cs.columbia.edu/~nieh/pubs/sosp2007_dejaview.pdf

This one’s broken: https://www.cs.columbia.edu/~orenl/papers/sosp07-dejaview.pd...

When I search for “dejaview” in the main index, I get the same broken link in the search results[0]. At first I thought they’d changed the URL structure (/pubs/ to /papers/), but if you visit [1] it works, but [2] doesn’t work. I guess “orenl” isn’t a member of the faculty anymore, so they tore down their page and removed all the associated resources.

[0] https://www.cs.columbia.edu/g-search/?q=dejaview#gsc.tab=0&g...

[1] https://www.cs.columbia.edu/~nieh/

[2] https://www.cs.columbia.edu/~orenl/

Dr Oren Laadan's site hasn't been updated since 2011, but it's still up: it's just HTTP-only.
HTTPS isn't a drop-in replacement for HTTP. Try visiting the HTTP site.
Great project name!
thanks, I came up with it :) (The original title, which I also came up with but which had to be changed to keep double blindness due to a snafu on a previous submission, was ThincBack, because Thinc was our home-grown remote display protocol. Thinc became the basis for VESA's net2display standard, though I'm unsure anything really ever happened with that in practice after it was published.)
Back in the early '00s there was a Canadian cable channel called exactly the same. It still exists apparently - https://www.dejaviewtv.ca