by adrianratnapala 3595 days ago
I think what josefbacik means is that there is no guarantee that the original symlink under its temp name has actually been written to disk. After the `rename()`, there is still no guarantee.

I think this is true in many filesystems, not just `xfs`.

If your atomicity requirement is only that the file never disappears from the POV of an external process, then the OP's method is sufficient. If you want crash-proofing as well, then you will need an fsync() -- preferably on the tempfile BEFORE the rename().

No, you want to fsync() on the directory, not the "tempfile", and after the operation, not before. Consider:

            d = open(".", O_RDONLY); unlink("t");
    /* 1 */ symlink("new","t");
    /* 2 */ rename("t", "link");
    /* 3 */ fsync(d);
    /* 4 */ close(d);

If we crash at (1), nothing has happened yet. At (2), we might have "t" or we might not. At (3), we might have "t" or not, and "link" might point to "new" or to "old", but it can't point to anything else (or be empty). Finally, once we reach (4), the change cannot be reverted.

You can insert a second fsync() where you suggest, at point (2), but all this will guarantee is that we will have "t" in the directory, because a symlink's contents are part of the directory it lives in. This might be useful for some applications, but the cost of two disk writes is high enough that it may be worth redesigning your application.
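
For anyone who wants to try this, here is a minimal sketch of the same four steps as compilable C with error checking. The helper name `replace_symlink` and its `tmp` parameter are made up for illustration, and I have swapped in the `*at()` variants so that everything is resolved relative to the directory descriptor that later gets fsync()ed. It assumes your platform lets you fsync() a directory opened read-only (Linux does); treat it as a sketch, not a vetted implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): make `name` inside `dir`
 * point at `target`, going through the temporary name `tmp`, then
 * flush the directory. Returns 0 on success, -1 with errno set. */
int replace_symlink(const char *dir, const char *target,
                    const char *name, const char *tmp)
{
    assert(dir && target && name && tmp);

    int rc = -1;
    int d = open(dir, O_RDONLY | O_DIRECTORY);
    if (d < 0)
        return -1;

    /* Remove a stale temp link left by an earlier crash; ENOENT is fine. */
    if (unlinkat(d, tmp, 0) < 0 && errno != ENOENT)
        goto out;

    /* 1 */ if (symlinkat(target, d, tmp) < 0)
        goto out;

    /* 2 */ if (renameat(d, tmp, d, name) < 0) {
        int saved = errno;
        unlinkat(d, tmp, 0);        /* best-effort cleanup */
        errno = saved;
        goto out;
    }

    /* 3 */ if (fsync(d) < 0)       /* flush the directory entries */
        goto out;

    rc = 0;
out:
    {
        /* 4 */ int saved = errno;  /* keep the first error, if any */
        close(d);
        errno = saved;
    }
    return rc;
}
```

Calling `replace_symlink(".", "new", "link", "t")` corresponds to the snippet above. The second fsync() discussed here would go between steps (1) and (2).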

If you crash at (3) you can -- at least in principle -- have "link" pointing to garbage (most likely an empty file). That is, the dirent points to the new inode, but the actual link-text got lost with the crash.

Now on modern filesystems, a non-huge symlink will be stored in the inode itself and presumably enjoys some sort of atomicity. But there is nothing in the standard about that.

> If you crash at (3) you can -- at least in principle -- have "link" pointing to garbage (most likely an empty file).

No, I don't think you can, bugs notwithstanding. A "link" (§3.130) is what POSIX calls a directory entry.

> But there is nothing in the standard about that.

The "standard" (POSIX) doesn't talk much about crashing; however, if mkdir("a") could destroy "b", even during a system crash (§3.387), then users would complain.