I look after a lot of systems, and most of them want identical, simple backups taken of their filesystems. Currently I use backup2l, which works but suffers from a couple of minor issues.
In short I want to take a full filesystem backup (i.e. back up "/"). I wish to exclude only a few directories and mounted filesystems.
So my configuration looks like this:
    # List of directories to make backups of.
    # All paths MUST be absolute and start with a '/'!
    SRCLIST=( / /boot -xdev )

    # The following expression specifies the files not to be archived.
    SKIPCOND=( -path '/var/backups/localhost' -o -path '/var/run/' -o \
               -path '/tmp' -o -path '/var/tmp' \
               -o -path '/dev' -o -path '/spam' \
               -o -path '/var/spool/' )
The only surprising thing here is that I abuse the internals of backup2l because I know that it uses "find" to build up a list of files - so I sneakily add in "-xdev" as the last entry of SRCLIST. This means I don't accidentally back up any mounted gluster filesystem, mounted MySQL binary/log mounts, etc.
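To make the "-xdev" trick concrete, here is a minimal sketch of a find-based file listing with pruned paths. It is not backup2l's actual code: list_files is a hypothetical helper, and the scratch directory stands in for "/" so the example is safe to run.

```shell
#!/bin/sh
# Sketch only: approximates how a find-based file list might be built.
# list_files is a hypothetical helper, not part of backup2l.
list_files() {
    root="$1"; shift
    # -xdev: do not descend into directories on other filesystems.
    # The remaining arguments are -path expressions to prune, in the
    # same style as SKIPCOND.
    find "$root" -xdev \( "$@" \) -prune -o -type f -print
}

# Demonstrate against a scratch directory rather than "/".
demo=$(mktemp -d)
mkdir -p "$demo/etc" "$demo/tmp"
touch "$demo/etc/hosts" "$demo/tmp/scratch"

# Skip the scratch "tmp" directory, as the real config skips /tmp.
listing=$(list_files "$demo" -path "$demo/tmp")
echo "$listing"

rm -rf "$demo"
```

One thing worth checking in a config like mine: GNU find warns that a -path pattern ending in "/" will not match anything, so entries such as '/var/run/' may need the trailing slash dropped.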
backup2l then goes and does its jobs. It allows me to define things to run before and after the backup runs via code like this:
    # This user-defined bash function is executed before a backup is made
    PRE_BACKUP ()
    {
        if [ -d /etc/backup2l/pre.d/ ]; then
            run-parts /etc/backup2l/pre.d/
        fi
    }
So what is my gripe? Well I get a daily email, per-system, which shows lots of detail - but the key thing. The important thing. The thing I care about more than anything else, the actual "success" or "fail" result is only discoverable by reading the mail.
If the backup fails, due to out of disk, I won't know unless I read the middle of the mail.
If the pre/post-steps fail I won't know unless I examine the output.
As I said to a colleague today, in my view the success or failure of the backup is the combination of three distinct steps:
- pre-backup jobs.
- backup itself.
- post-backup jobs.
If any of the three fail I want to know. If they succeed then ideally I don't want a mail at all - but if I get one it should have:
Subject: Backup Success - $(hostname) - $(date)
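A rough sketch of the behaviour I'm after follows. It treats the backup as three steps, records whether each one succeeded, and puts the overall result in the subject line. run_step is a hypothetical helper, the step commands are placeholders (true/false) rather than real backup jobs, and the commented-out mail call assumes a mailx-style "mail" command is installed.

```shell
#!/bin/sh
# Sketch: run three steps, track the overall result, build a subject line.
# run_step and the step commands below are illustrative placeholders.
status="Success"
report=$(mktemp)

run_step() {
    step_name="$1"; shift
    # Capture the step's output in the report and note whether it worked.
    if "$@" >>"$report" 2>&1; then
        echo "[ok]   $step_name" >>"$report"
    else
        echo "[FAIL] $step_name (exit $?)" >>"$report"
        status="Failure"
    fi
}

run_step "pre-backup"  true    # e.g. run-parts /etc/backup2l/pre.d/
run_step "backup"      true    # e.g. the backup tool itself
run_step "post-backup" false   # simulated failure, e.g. a remote copy

subject="Backup $status - $(hostname) - $(date +%Y-%m-%d)"
echo "Subject: $subject"
cat "$report"

# Only send mail when something went wrong (assumes "mail" exists):
# if [ "$status" = "Failure" ]; then
#     mail -s "$subject" root <"$report"
# fi

rm -f "$report"
```

Any single failed step flips the overall status, so a broken post-backup job is visible in the subject without reading the body.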
So I've looked around at programs such as backup-ninja, backup-manager and they seem similar. It is a shame as I mostly like backup2l, but in short I want to do the same thing on about 50-250 hosts:
- Dump mysql, optionally.
- Dump postgresql, optionally.
- Dump the filesystem. Incrementals are great, but full copies are probably tolerable.
- Rsync those local filesystem backups to a remote location.
In my case it is usually the rsync step that fails, which is horrific if you don't notice (quota exceeded, connection reset by peer, etc.). The local backups are good enough for 95% of recoveries - but if the hardware is fried, having the remote backups available, albeit slowly, is required.
Using GNU Tar incrementally is trivial. If it weren't such a messy program I'd probably be inclined to hack on backup2l - but in 2012 I can't believe I need to.
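For reference, a minimal sketch of incremental backups with GNU tar's --listed-incremental option. The scratch directory and file names are made up for the demonstration, and it assumes GNU tar rather than another tar implementation.

```shell
#!/bin/sh
# Sketch: a full backup followed by an incremental one, using GNU tar's
# snapshot file. All paths here are scratch paths for illustration.
work=$(mktemp -d)
mkdir "$work/data"
echo one > "$work/data/a.txt"

# First run: the snapshot file does not exist yet, so this is a full backup.
tar --create --file="$work/full.tar" \
    --listed-incremental="$work/snapshot" -C "$work" data

# Wait so the next file is clearly newer than the snapshot timestamp.
sleep 1
echo two > "$work/data/b.txt"

# Second run with the same snapshot file: only changes are archived.
tar --create --file="$work/incr.tar" \
    --listed-incremental="$work/snapshot" -C "$work" data

incr_list=$(tar --list --file="$work/incr.tar")
echo "$incr_list"

rm -rf "$work"
```

The incremental archive should list the directory and the new file but not the unchanged one. Restoring means extracting the full archive and then each incremental in order.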
(Yes, backuppc rocks. So does duplicity. So does amanda. But they're not appropriate here. Sadly.)
ObQuote: "Oh, I get it. I see now. You've been training for two years to take me out, and now here I am. Whew! " - Blade II - An example of a rare breed, a sequel that doesn't suck. No pun intended.
Tags: backup-manager, backup-ninja, backup2l
http://pyro.eu.org/
Just wondered what makes AMANDA not appropriate?
Since it is based on gnutar incrementals it supports --one-file-system and --exclude patterns.
Before you run amdump to commence a backup, MySQL databases can be mysqlhotcopy'd to your main filesystem (for example /var/backups/mysql/). This ensures you're backing up a consistent snapshot.
Likewise pg_dump for all Postgres databases.
If the backup data must be stored locally and then rsync'd, it can be written to local 'virtual tape' directories first (don't forget to exclude them!).
The email report will have FAIL (or sometimes STRANGE) quite prominently in the subject line if something went wrong; or a prior run of 'amcheck -m -w' may even alert you to a problem before the backup run starts.
What I described here actually means configuring amanda-server on each host. More conventionally they would have each been set up as an amanda-client with a centralised amanda-server controlling them, but that is less flexible.