When a part lands on a server, it initiates a periodic pulse -- a sequence of messages sent from that server back to the base server where the part originated. Each message traverses the path traveled so far in the backward direction and drops a small encrypted file on each server, containing a reference to the previous server. This way, when a user retrieves the file from the store, the servers can follow the tracks of each part.
When a part moves from one server to another, the current pulse stops and the target server initiates a new one.
Each pulse erases the previous track file on each server it travels through.
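The pulse mechanism can be illustrated with a minimal in-memory sketch. It is not Cryptomove's implementation: the names `send_pulse` and `follow_tracks`, the dictionary standing in for the encrypted track files, and the assumption that a path visits each server at most once are all hypothetical. The sketch also reads "reference to the previous server" as the server the pulse message just came from, that is, the next hop toward the part.

```python
from typing import Dict, List

# Hypothetical stand-in for the encrypted track files:
# tracks[server][part_id] -> next server on the way toward the part.
Tracks = Dict[str, Dict[str, str]]


def send_pulse(tracks: Tracks, part_id: str, path: List[str]) -> None:
    """Walk `path` (base server first, current holder last) backward.

    Assumes each server appears at most once in `path`.
    """
    holder = path[-1]
    # The holder itself needs no track file for this part.
    tracks.get(holder, {}).pop(part_id, None)
    came_from = holder
    for server in reversed(path[:-1]):
        # Overwriting models "erase the previous track file, drop a new one".
        tracks.setdefault(server, {})[part_id] = came_from
        came_from = server


def follow_tracks(tracks: Tracks, part_id: str, base: str) -> str:
    """Start at the base server and hop along track files to the holder."""
    server = base
    seen = {server}
    while part_id in tracks.get(server, {}):
        server = tracks[server][part_id]
        if server in seen:
            raise RuntimeError("track files form a loop")
        seen.add(server)
    return server
```

After the part moves on, the new holder's pulse overwrites the older track files along the shared portion of the path, so a lookup that starts at the base server ends at the new holder.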
Currently, no fail-over of any kind is implemented in Cryptomove.
If one or more servers go down, all parts that currently reside on those servers remain there. When a down server comes back up, it restarts the movement of all parts that resided on it before the failure.
This may hamper delivery of data parts upon a restore request if some parts reside on a down server. However, the parts of a saved file are always duplicated on the client before they are sent to the servers. Thus, if enough servers are still up, the restore request may still fetch copies that reside on up servers and whose path back to the base also goes only through up servers.
Again, copies of the same data part currently travel independently and randomly. In the worst case, all of them may end up on a down server, or the path back to the base server may include a down server for every copy. This, however, seems unlikely if there are enough copies and enough up servers.
Also, when a server decides to push a data part to another server, it only pushes to a server that is up. All servers maintain keep-alive heartbeats with the members of their clusters, so they know which cluster servers are up and which are down. Of course, a server may go down in the middle of transmitting a data part. In this case, if it is the source server, it will restart the transmission upon its own restart. If it is the target server, the source server will receive a timeout or an exception and will re-transmit the same part later to an online server (possibly the same target server, if it has come back up by then).
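The restore condition described above can be written as a small check. This is only a sketch under stated assumptions: the function names are hypothetical, and each copy's path is assumed to be known as a list running from the base server to the server currently holding the copy.

```python
from typing import Iterable, List, Set


def copy_is_retrievable(path: List[str], up: Set[str]) -> bool:
    """A copy can be fetched only if its holder and every server on the
    path back to the base server (all included in `path`) are up."""
    return all(server in up for server in path)


def part_is_restorable(copy_paths: Iterable[List[str]], up: Set[str]) -> bool:
    """A part can be restored if at least one of its copies is retrievable."""
    return any(copy_is_retrievable(path, up) for path in copy_paths)
```

For example, with copies on paths `["base", "s1", "s2"]` and `["base", "s3"]`, losing `s1` still leaves the second copy retrievable, while losing both `s1` and `s3` makes the part unavailable until one of them returns.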
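A rough sketch of the push-and-retry behavior described above follows. The names `push_part`, `is_up`, and `transmit`, and the `max_attempts` limit, are illustrative assumptions and not part of Cryptomove; `is_up` stands in for the heartbeat-derived view of the cluster, and `transmit` for whatever transport the servers actually use.

```python
import random
from typing import Callable, Optional, Sequence


def push_part(
    part_id: str,
    peers: Sequence[str],
    is_up: Callable[[str], bool],          # hypothetical heartbeat lookup
    transmit: Callable[[str, str], None],  # hypothetical transport call
    max_attempts: int = 5,                 # assumed retry limit
) -> Optional[str]:
    """Try to push a part to a randomly chosen server that is up.

    Returns the server that accepted the part, or None if every attempt
    failed or no peer was up.
    """
    for _ in range(max_attempts):
        candidates = [peer for peer in peers if is_up(peer)]
        if not candidates:
            return None  # nobody to push to; the part stays where it is
        target = random.choice(candidates)
        try:
            transmit(target, part_id)
            return target
        except (TimeoutError, ConnectionError):
            # The target went down mid-transfer: try again, possibly with
            # the same server if it is reported up by then.
            continue
    return None
```

A real server would presumably wait between attempts rather than retry immediately; the delay is omitted here to keep the sketch short.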