by yid 4763 days ago
Not sure how this is different from inotify+md5+make?
3 comments

Watchman uses inotify under the covers (or kqueue or portfs, depending on the OS) and abstracts the differences away.

For the Facebook www build, hashing every file to see whether it changed is no longer practical: there are so many files that they have commonly fallen out of the buffer cache. Hashing them can therefore cause a significant amount of I/O, which translates directly into a longer wait for the user.

In addition, because of the volume of files, it is not feasible for us to statically declare the build dependencies using a traditional Makefile or similar tool; it is crazy to maintain manually and generating the mapping is itself an expensive operation.

We chose to implement this in C because it gave us tight and deliberate control of the resources and dependencies of the service.

I often think it would be useful for a filesystem to provide a digest of a file's content. It's the FS that knows when the file's content changes and the digest is out of date, and it also only needs to recalculate if anything asks. It wouldn't necessarily have to read all the bytes of the file to re-calculate; it may be a hierarchical digest where much of the existing, stored, labour can be re-used.
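To make the hierarchical idea concrete, here is a rough user-space sketch, not a filesystem feature. It keeps one SHA-256 per fixed-size chunk in a side file and derives a root digest from that list, so a write to one chunk only needs that chunk re-read. The function names, the `CHUNK` variable, and the two-level layout are all assumptions for illustration; it relies on GNU `dd`, `sha256sum`, and `sed -i`, and assumes the file size does not change between updates.

```shell
#!/bin/bash
CHUNK=${CHUNK:-4096}   # assumed chunk size in bytes

# leaf_digest FILE INDEX: SHA-256 of the INDEX-th chunk (0-based).
leaf_digest() {
  dd if="$1" bs="$CHUNK" skip="$2" count=1 2>/dev/null | sha256sum | cut -d' ' -f1
}

# build_leaves FILE LEAVES: hash every chunk once, one digest per line.
build_leaves() {
  local file=$1 leaves=$2 size n i=0
  size=$(( $(wc -c < "$file") ))
  n=$(( (size + CHUNK - 1) / CHUNK ))
  : > "$leaves"
  while [ "$i" -lt "$n" ]; do
    leaf_digest "$file" "$i" >> "$leaves"
    i=$((i + 1))
  done
}

# update_leaf FILE LEAVES INDEX: re-hash only the chunk that was written
# and overwrite its line in the stored list.
update_leaf() {
  local file=$1 leaves=$2 index=$3 new
  new=$(leaf_digest "$file" "$index")
  sed -i "$((index + 1))s/.*/$new/" "$leaves"
}

# root_digest LEAVES: digest of the stored leaf digests.
root_digest() {
  sha256sum < "$1" | cut -d' ' -f1
}
```

After a write that touches only chunk 1, `update_leaf file leaves 1` followed by `root_digest leaves` gives the same root as rebuilding every leaf, while reading just one chunk of the file. A real implementation would need the filesystem to track which chunks were dirtied, which is exactly the knowledge user space lacks.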
A good start would be portable, reliable snapshots with differencing managed by the kernel. Some filesystems offer this capability, but it is not available to us in this particular circumstance.
This is cross-platform, which inotify isn't. Also, isn't inotify pretty low-level? It looks like FB has consolidated common inotify helper code into a reusable daemon.
http://people.gnome.org/~veillard/gamin/ has been doing this for ages, with inotify on Linux and kqueue on BSD. It's already packaged and available - indeed, often required - on many distributions. Facebook has done some neat stuff here, but providing a portable file-change-notification library isn't novel.
Not sure how inotify is different from while+stat?
Your snark is unwarranted. While+stat is a poll loop in userspace; inotify is a push mechanism in kernel space.
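For contrast, a minimal sketch of the while+stat approach, assuming GNU `stat` (`-c %Y` prints the modification time in whole seconds). The helper name `poll_until_changed` and its parameters are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: block until FILE's mtime changes, by polling.
# Usage: poll_until_changed FILE [INTERVAL_SECONDS] [MAX_POLLS]
# Returns 0 on change, 1 if MAX_POLLS checks saw no change, 2 if stat fails.
poll_until_changed() {
  local file=$1 interval=${2:-1} max_polls=${3:-0}
  local last now polls=0
  last=$(stat -c %Y "$file") || return 2
  while :; do
    sleep "$interval"
    now=$(stat -c %Y "$file") || return 2
    if [ "$now" != "$last" ]; then
      return 0          # change detected
    fi
    polls=$((polls + 1))
    # Give up after MAX_POLLS checks (0 means poll forever).
    if [ "$max_polls" -gt 0 ] && [ "$polls" -ge "$max_polls" ]; then
      return 1
    fi
  done
}
```

Every iteration costs a `stat` call whether or not anything happened, and a change is noticed up to one interval late. Two writes within the same second can also be missed, because `%Y` has one-second resolution.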

My point was that the additional functionality of this significantly-sized package beyond running inotify+md5+make in a shell script was unclear.
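For reference, the kind of script I had in mind looks roughly like this. It is a sketch, not a claim about how Watchman works: `content_changed`, `watch_and_make`, and the `DIGEST_DIR` location are hypothetical names, and it assumes `inotifywait` from inotify-tools plus GNU `md5sum`.

```shell
#!/bin/bash
# Hypothetical sketch: rebuild only when a file's content really changed.
DIGEST_DIR=${DIGEST_DIR:-/tmp/watch-digests}   # assumed location for stored digests

# content_changed FILE: succeed if FILE's md5 differs from the stored one.
# Records the new digest as a side effect; a never-seen file counts as changed.
content_changed() {
  local file=$1 key new old=""
  mkdir -p "$DIGEST_DIR"
  key=$DIGEST_DIR/$(printf '%s' "$file" | md5sum | cut -d' ' -f1)
  new=$(md5sum < "$file" | cut -d' ' -f1)
  if [ -f "$key" ]; then
    old=$(cat "$key")
  fi
  printf '%s\n' "$new" > "$key"
  [ "$new" != "$old" ]
}

# watch_and_make DIR: block on inotify events under DIR and run make
# whenever an event refers to a file whose content actually differs.
watch_and_make() {
  inotifywait -m -r -q -e close_write,moved_to --format '%w%f' "$1" |
  while IFS= read -r path; do
    if [ -f "$path" ] && content_changed "$path"; then
      make
    fi
  done
}
```

Calling `watch_and_make src` from a directory with a Makefile runs until interrupted. Note that it still hashes each touched file in full and leaves all dependency tracking to make, which are the two costs described upthread.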

Trivial example use of inotify (or rather, inotifywait) to restart Apache after you edit the Apache conf file or add/edit/remove a site config:

    # Note: inotifywait merges all -e options into one event set that applies to every watched path.
    while inotifywait -e attrib,modify /etc/httpd/conf/httpd.conf -e attrib,modify,create,delete,move -r /etc/httpd/sites-enabled ; do
      /sbin/service httpd graceful && echo "$(date -u --rfc-3339=seconds) httpd graceful" >> /etc/httpd/conf/httpd-conf.log
    done
This shows how to monitor a single file and a directory, and run a sequence of commands when events occur. Unlike a while+stat loop, it doesn't keep polling the file system for changes; it waits until the kernel reports that a change happened.

To "daemonize" this, tack on an invocation test, perhaps:

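    #!/bin/bash
    # First invocation (no "--" argument): re-launch this script in the
    # background with output redirected to log files, then exit.
    if [ "x$1" != "x--" ]; then
      "$0" -- 1> /etc/httpd/conf/watchconf.log 2> /etc/httpd/conf/watchconf-err.log &
      exit 0
    fi

    # Background copy: block until the config changes, then reload Apache.
    while inotifywait -e attrib,modify /etc/httpd/conf/httpd.conf -e attrib,modify,create,delete,move -r /etc/httpd/sites-enabled ; do
      /sbin/service httpd graceful && echo "$(date -u --rfc-3339=seconds) httpd graceful" >> /etc/httpd/conf/httpd-conf.log
    done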