| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by laymil 4340 days ago

And it's interesting and useful for scientific computing where you already have an MPI environment and distributed/parallel filesystems. However, it's not really applicable to this workload, as the paper itself says.

There is a provision in most file systems to use links (symlinks, hardlinks, etc.). Links can cause cycles in the file tree, which would result in a traversal algorithm going into an infinite loop. To prevent this from happening, we ignore links in the file tree during traversal. We note that the algorithms we propose in the paper will duplicate effort proportional to the number of hardlinks. However, in real world production systems, such as in LANL (and others), for simplicity, the parallel filesystems are generally not POSIX compliant, that is, they do not use hard links, inodes, and symlinks. So, our assumption holds.

The reason this cp took such large amounts of time was the desire to preserve hardlinks and the resize of the hashtable used to track the device and inode of the source and destination files.

2 comments

encoderer 4340 days ago

Sure, but if you read that article you walk away with a sense of thats a lot of files to copy. And the GP built a tool for jobs 2-3 orders of magnitude larger?! Clearly there are tradeoffs forced on you at that size...

link

jlafon 4339 days ago

Author of the paper here. The file operations are distributed strictly without links, otherwise we could make no guarantees that work wouldn't be duplicated, or even that the algorithm would terminate. We were lucky in that because the parallel file system itself wasn't POSIX, so we didn't have to make our tools POSIX either.

link