Most of my websites run on Amazon’s EC2 servers and in setting up a new one today I thought I had better come up with a better backup strategy. In the past I’ve just asked EC2 to create either AMI or EBS snapshots, but they aren’t done automatically and it’s not a very elegant solution.
Much better would be simply to backup to S3. This is easily done through a little script that sits in your /etc/cron.daily so that it is automatically called once every day. But I’d like to have more than just one backup - I’d like some kind of automatic rotation of backups. In the end I came up with the a reasonably cute idea, and that is to keep between 28 and 31 backups: ie, all the data that I’m backup up is pushed into a folder named after the day-of-the-month in an S3 bucket, so today (7th June 2013) all my backups are going into a folder named something like s3://mybackups/07
In a month’s time (7th July 2013) this backup will be overwritten by the July 7th backup. That’s not a bad solution really. If you want longer backups you can hack the below script and have two scripts - one for a one-backup-per-day strategy and also another copy of the script that stores by month name which essentially rotates by month. That would give you daily backups for the past month, and monthly backups for the past year. Useful.
In terms of your standard webserver you want to backup your /var/www (or wherever you keep your htdocs), along with any config info, so I also backup /etc/apache2 /etc/php5 /etc/mysql /etc/cron.daily. Of course on top of that you’ll need a backup of your database which you can get by calling mysqldump. Then compress the lot and chuck it up to s3. You’ll want to use s3cmd for this. Thankfully I found a script which I modified to do the rotations and use s3cmd to upload the resulting backups (original source: http://stnor.wordpress.com/2011/08/01/backing-up-ec2-mysql-and-configuration-files-to-s3/)
So first you’ll need a copy of s3cmd installing. On ubuntu/debian that is a fairly straight-forward:
|
|
This is important: Do NOT configure s3cmd with your root AWS credentials - yes it will work, but would you store your root server password in a plaintext file? No, and your AWS credentials give the holder access to unlimited resources, your billing details, your machine images, everything. Just watch this 2-minute you-tube video on creating AWS users & groups with restricted access, create a new user/group that only has access to S3 and use those credentials to configure s3. It’s not hard, it’ll take you just a few minutes to do. Then wait a couple more minutes for these new credentials to propagate through amazon’s systems and you’re ready to carry on. Ok, safety first rant over! Let’s continue.
Then modify the following script to suit your purposes:
<li>Add mysql login details for each DB in the format: databasename_USER and databasename_PW</li>
<li>Specify which directories to backup in DIRECTORIES - for me that is config stuff and my /var/www</li>
<li>Specify the name of the s3 bucket you're going to backup into in the S3_BUCKET_URL</li>
The script also assumes you have tar and gzip installed, but I’ll assume you can figure that bit out for yourself.
|
|
Then, assuming you’ve called it something like backupToS3.sh, make it executable and test it:
|
|
Once you’ve ironed out any issues simply copy it over to /etc/cron.daily so that it runs daily:
|
|
Now, the above script does daily backups, but if you want to do monthly backups you simply need to make a copy of the file (since you’ll likely want a daily and monthly backup rotation) and edit the DATE variable to use months rather than day-of-the-month. If you use the month number you’ll probably want to either prefix the month number with the word “month”, or pop them into a subdirectory called “monthly”, alternatively you could use the month name, for instance:
|
|
Then make it executable and test it as you did the previous script, and then copy it into cron.monthly:
|
|
Presumably this will then fire on the first of the month (I haven’t checked), but you could always put it in cron.daily so that monthly backup is from the last day of its month (for previous months, the present month would be up to date).