Backup and archive with rsync

Rsync is a powerful and fast utility. It lets you synchronize two directories and archive the differences. (Note that I consider a “backup” an identical copy and an “archive” a history of changes, whereas rsync calls the changed and deleted files a “backup.”) Following a discussion on the mailing list about adding an archiving solution to your backup system, I answered a question about rsync with the following examples.

Beginning with the basics…

rsync A B

…would ensure that directory B has copies of all of the files in A. This does not necessarily mean that the two will be identical, however (B could have more files than A). If you want B to be an exact backup of A, then you must purge the difference out of B:

rsync --delete A B

You will also want to add recursion so that it includes all subdirectories (without it, rsync skips directories altogether):

rsync -r --delete A B

If you are going to use the recursive flag, you might as well use the archive flag instead, since it is shorthand for “rlptgoD” (which is what you want and more; check the manual for what each one does):

rsync -a --delete A B

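And, while we’re at it, we might as well toss in the compression flag (which speeds up transfers over a network) and the extended attributes flag (realizing that Mac OS support isn’t 100%, but it’s better than not trying). Be aware that -E means extended attributes in the rsync 2.6.9 that shipped with Mac OS X; in upstream rsync 3.x, -E preserves executability and -X handles extended attributes.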

rsync -azE --delete A B

So, now that we have an identical backup (and hopefully two or three), we need to start archiving the changes instead of allowing them to be overwritten or deleted. This is done with the backup flag:

rsync -azEb --delete A B

Running that with default settings will save a copy of the old file next to the one that was just updated in the backup, with a tilde appended so that the filenames differ and the old copy doesn’t overwrite the new one. Saving this archived version into the backup is no good though since we are using the delete flag! So, we need to specify an alternate location for the archived files to be saved:

rsync -azEb --delete --backup-dir=/path/to/archive/ A B

Another benefit of saving the archived files in a separate location is that they do not need to be renamed to “file.txt~”.

We almost have it now, but there is one more important point. If we run the above command as is, we will have a source, a backup, and a one-step-back archive. Each time a file is changed (and the command is then run), the file will be synchronized to the backup, and the version that was overwritten or deleted will be copied to the archive. When it is copied to the archive, however, it will overwrite the previously archived version. That is better than no archive at all, but it’s not a good solution since you cannot reach very far back in time.

Thus, we take advantage of the suffix flag. With the suffix flag, we can append a string of our choice to the end of an archived file. (We could have used this earlier to replace the tilde with something else of our choosing.) Simply appending something to the end of the file (file.txt.old) isn’t sufficient however; the suffix must be unique each time the script runs (such as the date and time).

Thus, the command that I would recommend for a backup and archive solution using rsync is:

rsync -azEb --delete --backup-dir=/path/to/archive/ \
    --suffix=.2007-04-16_07-22-03 /path/to/source/ /path/to/backup/

If you are like me and like to see what it’s doing, I would also recommend throwing in the progress, verbose, and stats flags:

rsync -azEbv --progress --stats --delete \
    --backup-dir=/path/to/archive/ --suffix=.2007-04-16_07-22-03 \
    /path/to/source/ /path/to/backup/

And, if you do not want to have to manually update the date in the suffix flag each time you run the command (if you are using launchd or cron to run it for example), you will want to replace the suffix value with a variable that is set in a shell script, like this:

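#!/bin/sh

# %Y gives a four-digit year, matching the suffix format used above.
right_now=$(date +"%Y-%m-%d_%H-%M-%S")

# Note: a relative --backup-dir is resolved relative to the destination directory.
rsync -azEbv --progress --stats --delete --backup-dir=archive \
    --suffix=".$right_now" files backup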

