Many of us use rsync to shuffle data around, either to maintain off-site backups, or to perform random tasks (e.g. uploading a static copy of your generated blog).
I use rsync in many ways myself, but the main thing I use it for is to copy backups across a number of hosts. (Either actual backups, or stores of Maildirs, or similar.)
Imagine you back up your MySQL database to a local system, and you keep five days of history in case of accidental error and deletion. Chances are that you'll have something like this:
/var/backups/mysql/0/
/var/backups/mysql/1/
/var/backups/mysql/2/
/var/backups/mysql/3/
/var/backups/mysql/4/
(Here I guess it is obvious that you back up to /mysql/0, after rotating the contents of 0->1, 1->2, 2->3, & 3->4)
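For reference, the rotation described above can be sketched roughly as follows. This is only an illustration: the function name rotate_backups is made up for the example, and it assumes exactly five numbered directories (0 to 4) under the root you pass in.

```shell
# Hypothetical helper: rotate numbered backup directories under "$1".
# The oldest copy (4) is discarded, then 3->4, 2->3, 1->2 and 0->1,
# leaving an empty 0/ ready for the next backup.
rotate_backups() {
    root=$1
    rm -rf "$root/4"
    for n in 3 2 1 0; do
        if [ -d "$root/$n" ]; then
            mv "$root/$n" "$root/$((n + 1))"
        fi
    done
    mkdir -p "$root/0"
}

# Example (path taken from the listing above):
# rotate_backups /var/backups/mysql
```

Note that after every run each numbered path holds a different day's data than it did before.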
Now consider what happens when that rotation happens and you rsync to an off-site location afterward: you're copying way more data around than you need to. rsync matches files by path, and after a rotation every numbered directory holds a different day's content than it did the day before, so all five look changed.
To solve this I moved to storing my backups in directories such as this:
/var/backups/mysql/09-03-2009/
/var/backups/mysql/10-03-2009/
/var/backups/mysql/11-03-2009/
..
This probably simplifies the backup process a little too: just back up to $(date +%d-%m-%Y) after removing any directory older than four days.
Now imagine running rsync. The contents of previous days won't change at all, so you'll end up moving significantly less data around.
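A minimal sketch of that scheme, under a few assumptions of mine: the function names are invented for the example, "older than four days" is approximated with find's -mtime +4 (which looks at each directory's modification time, not its name), and -mindepth/-maxdepth need GNU or BSD find.

```shell
# Hypothetical helper: create today's dated directory under "$1"
# and print its path, e.g. /var/backups/mysql/11-03-2009
dated_backup_dir() {
    root=$1
    today=$(date +%d-%m-%Y)
    mkdir -p "$root/$today" && printf '%s\n' "$root/$today"
}

# Hypothetical helper: remove top-level directories under "$1" whose
# modification time is more than four days in the past.
prune_old_backups() {
    root=$1
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +4 -exec rm -rf {} +
}

# Example (the dump filename is an assumption):
# prune_old_backups /var/backups/mysql
# dir=$(dated_backup_dir /var/backups/mysql)
# mysqldump --all-databases > "$dir/all.sql"
```

Pruning by modification time rather than by parsing the directory name keeps the sketch short, but it will be fooled if something touches an old directory.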
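The off-site copy itself might look something like the sketch below. The wrapper name sync_offsite and the destination in the example are placeholders I made up; -a and --delete are standard rsync options. I'm assuming you want expired days removed from the remote side too; drop --delete if you'd rather keep a longer history off-site.

```shell
# Hypothetical wrapper: mirror a local backup tree to a destination.
# "$2" may be a local path or a remote spec such as host:/path.
sync_offsite() {
    src=$1
    dest=$2
    # -a preserves permissions, timestamps and directory structure.
    # --delete removes destination entries that no longer exist locally.
    # The trailing slashes copy the contents of src into dest.
    rsync -a --delete "$src/" "$dest/"
}

# Example (hostname and remote path are placeholders):
# sync_offsite /var/backups/mysql backup.example.com:/srv/backups/mysql
```

With date-named directories, only today's new directory is transferred and the oldest one is deleted remotely; the days in between are already identical on both sides.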
This is a deliberately contrived and simple example, but it also applies to common everyday logfiles such as /var/log/syslog, syslog.1, syslog.2.gz etc.
For example, on my systems qpsmtpd.log is huge, and my Apache access.log files are also very large.
Perhaps food for thought? One of those things that is obvious when you think about it, but doesn't jump out at you unless you schedule rsync to run very frequently and notice that it doesn't work as well as it "should".
ObFilm: Star Wars. The Family Guy version ;)
Tags: rsync, sysadmin, tips