This blog post is about how to duplicate your Linux services and configuration from one server to another. We use simple and hackable SSH, rsync and shell scripting to copy the necessary files to make a cold spare from your existing server installation.
1. Preface
The method described here is quite crude and is most suitable for making a spare installation of your existing server. If you lose your production server, you can boot your cold spare, point your (fail-over) IP address to the spare server and keep business going – something you might want to do if you run a mom’n’pop web hosting business. Because the method is so simple, it works on default Linux installations, bare metal servers and old-fashioned file systems like Ext4.
The instructions here have been tested with Ubuntu Linux 12.04, but should work with other versions with minor modifications. I have used this method successfully with Hetzner hosting (highly recommended for all the cheapskates out there) by having one production machine and one spare machine. The spare is mirrored weekly. Daily Duplicity backups can be restored on the spare if a week is too long a data loss period. Though in my case the servers are bare metal, the method works for VPSes as well.
The duplication script might also be useful for setting up a staging server from your production server for software development and testing.
2. More powerful duplication tools
More fail-safe, more engineer-oriented duplication approaches exist, but they usually require additional preparation and tuning on top of the default Linux installation. Applying these to an existing, running Linux installation might be tricky.
- Use virtual machines like KVM and snapshot them, then migrate the snapshotted copy on another server.
- Use a file system supporting snapshots like Btrfs or ZFS.
- Replicate your databases using native replication (MySQL, PostgreSQL) – if you cannot afford to lose a few database rows (hot spare).
- Replicate your file system (hot spare)
- (If you have any more ideas how to make hot or cold spares please post in the blog comments)
3. Building and running the duplication script
This script is run on the target server (cold spare) and it will clone the content of the source server (actual production machine) on itself. It uses SSH keys and SSH agent to create the initial connection, so make sure you are familiar with them.
Assumptions
- The target server must be a clean Linux installation, the exact same version as on your source server.
- Your server has standard /etc/passwd user account management. This is copied first so that we correctly preserve file ownership (related ServerFault discussion).
- Services you run (PHP, Django, Plone, Node.js, you-name-it) are under /srv, as recommended by the Linux Filesystem Hierarchy Standard.
- Source and target MySQL servers must initially have the same root password. Set this when the script runs apt-get install mysql-server for the first time.
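The "same exact version" assumption can be checked before anything is copied. Below is a minimal sketch, assuming the comparison is done on `uname -r` output, as the TODO in the script suggests. The function name `same_version` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper, not part of mirror-ubuntu.bash.
# Compares two version strings and reports a mismatch on stderr.
same_version() {
    local source_version="$1"
    local target_version="$2"
    if [ "$source_version" = "$target_version" ]; then
        return 0
    fi
    echo "Version mismatch: source=$source_version target=$target_version" >&2
    return 1
}

# Intended use on the target server (SOURCE as defined in the script):
#   same_version "$(ssh "$SOURCE" uname -r)" "$(uname -r)" || exit 1
```

The helper only compares strings, so the same function could be fed `lsb_release -rs` output instead if you care about the distribution release rather than the kernel.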
Limitations
- The first run is interactive, as apt-get install asks a bunch of questions for packages like MySQL and Postfix.
- Dumping MySQL and loading the dump does not guarantee that all of your data survives intact, especially when the --skip-lock-tables option is used for performance reasons. For ordinary CMS operations this isn’t a big issue.
- If you have other services besides MySQL that need special flush or lock handling, follow the related instructions for those services.
For MySQL duplication make sure you have the following /root/.my.cnf file on the source server, as it allows you to interact with MySQL:
[mysqldump]
user=root
password=YOUR PASSWORD HERE

[client]
user=root
password=YOUR PASSWORD HERE
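Because this file stores the MySQL root password in plain text, it is sensible to make it readable by its owner only. A minimal sketch of creating it that way follows. The helper name `write_my_cnf` is hypothetical, and the password is passed as an argument purely for illustration.

```shell
#!/bin/bash
# Hypothetical helper: writes a MySQL client config readable only by its owner.
# Usage: write_my_cnf /root/.my.cnf 'YOUR PASSWORD HERE'
write_my_cnf() {
    local path="$1"
    local password="$2"
    (
        umask 077   # a newly created file gets mode 600
        cat > "$path" <<EOF
[mysqldump]
user=root
password=$password

[client]
user=root
password=$password
EOF
    )
    chmod 600 "$path"   # also tighten a file that already existed
}
```

The subshell keeps the changed umask from leaking into the rest of the calling script.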
Run the script inside screen, because the duplication process may take a while and you do not want to risk losing the process because of a connectivity issue on your local internet connection.
scp mirror-ubuntu.bash targetserver:/tmp
ssh targetserver
screen
cd /tmp
bash mirror-ubuntu.bash
4. Testing the spare server
After the duplication script has successfully finished mirroring the server, check that the services can be started and used on the cold spare.
Change internet-facing IP addresses in the related services to be the public IP address of the spare server. E.g. the following files might need changes:
- /etc/default/varnish
- /etc/apache2/ports.conf
- /etc/apache2/sites-available/*
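To find which of these files actually mention the production IP address, a recursive grep is usually enough. A minimal sketch follows. The helper name `find_ip_references` and the address `203.0.113.10` are placeholders, not values from the original setup.

```shell
#!/bin/bash
# Hypothetical helper: list files that mention a given IP address.
# Usage: find_ip_references 203.0.113.10 /etc/default/varnish /etc/apache2
find_ip_references() {
    local ip="$1"
    shift
    # -r: recurse into directories
    # -l: print only the names of matching files
    # -w: match whole words, so 1.2.3.4 does not match inside 11.2.3.45
    # -F: treat the address as a fixed string, so dots are literal
    grep -rlwF -- "$ip" "$@" 2>/dev/null
}
```

Each file it prints is a candidate for replacing the production address with the public address of the spare.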
Spoof your local DNS so that the tested sites resolve to the spare server. Edit your local /etc/hosts and add spoofed IP addresses like:
1.2.3.4 www.service1.example.com www.service2.example.com opensourcehacker.com
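If you repeat the test often, the hosts entry can be added with a small helper that does not duplicate the line on a second run. This is only a sketch. The function name `add_hosts_entry` is hypothetical, and it takes the hosts file as an argument so it can be tried on a scratch file before touching /etc/hosts.

```shell
#!/bin/bash
# Hypothetical helper: append "IP host1 host2 ..." to a hosts file
# unless exactly that line is already there.
# Usage: add_hosts_entry /etc/hosts 1.2.3.4 www.service1.example.com
add_hosts_entry() {
    local hosts_file="$1"
    local ip="$2"
    shift 2
    local line="$ip $*"
    # -q: quiet, -x: match the whole line, -F: fixed string
    if ! grep -qxF -- "$line" "$hosts_file" 2>/dev/null; then
        echo "$line" >> "$hosts_file"
    fi
}
```

Remember to remove the spoofed entries after testing, otherwise your browser keeps talking to the spare.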
Access the sites from your web browser, see that database interaction works (login) and file interaction works (upload and download files).
5. mirror-ubuntu.bash
#!/bin/bash
#
# Linux server ghetto duplication script
# Copyright 2014 Mikko Ohtamaa http://opensourcehacker.com
# Licensed under MIT
#

# Everything is copied from this server
SOURCE=root@myserv.example.com

# Our network-traffic and speed optimized rsync command
RSYNC="rsync -a --inplace --delete --compress-level=9"

# Which marker string we use to detect custom init.d scripts
INIT_SCRIPT_MARKER="### BEGIN INIT INFO"

# As we will run in screen we need to detach
# from the forwarded SSH agent session and we use a local
# SSH key to perform the operations.
# Also overriding /root destroys our key.
# Create a key we use for the remaining operations.
TEMP_SSH_KEY=/tmp/mirror_rsa

# The software stack might have touched the following places.
# This list is compiled through trial-and-error,
# sweat and cursing.
# We cannot take /etc as a whole, because it contains
# some computer specific stuff (hard disk ids, etc.)
# and copying it as a whole leads to an unbootable system.
CHERRYPICKED_ETC_TARGETS=(\
    "/etc/default" \
    "/etc/varnish" \
    "/etc/apache2" \
    "/etc/ssl" \
    "/etc/nginx" \
    "/etc/postfix" \
    "/etc/php5" \
    "/etc/cron.d" \
    "/etc/cron.daily" \
    "/etc/cron.monthly" \
    "/etc/cron.weekly" \
    "/etc/init.d")

# Create a key without a passphrase
# and put the public key on the source server.
# rm -f stays silent if there is no key left over from a previous run.
rm -f "$TEMP_SSH_KEY" "$TEMP_SSH_KEY.pub"
ssh-keygen -N '' -f "$TEMP_SSH_KEY"
ssh-copy-id -i "$TEMP_SSH_KEY" "$SOURCE"

# Detach from the currently used SSH agent
# by starting a session specific to this shell
eval `ssh-agent`
ssh-add "$TEMP_SSH_KEY"

# Assume the systems have the same Ubuntu base installation and no extra repositories configured.
# Bring target system up to date.
apt-get update -y
apt-get upgrade -y

# TODO: check that the kernel uname is same
# on the source and the target systems

# This is a somewhat crude method to try to install all the packages on the source server.
# If the package is missing or replaced this command will happily churn over it
# (apt-get may fail). This MAY cause user interaction with packages
# like Postfix and MySQL which prompt for initial password. Not sure
# what would be the best way to handle this?
ssh "$SOURCE" dpkg --get-selections|grep --invert-match deinstall|cut -f 1|while read -r pkg
do
    apt-get install -y "$pkg"
done

# As some packages might have changed due to version upgrades and
# deb renames the following step needs interaction:
# it prints the source package list for manual review
ssh "$SOURCE" dpkg --get-selections|grep --invert-match deinstall|cut -f 1

# Copy user credentials first to make sure we
# get the user permissions and ownership correctly.
# http://serverfault.com/a/583336/74975
echo "Copying users"
$RSYNC "$SOURCE":/etc/passwd /etc
$RSYNC "$SOURCE":/etc/shadow /etc
$RSYNC "$SOURCE":/etc/group /etc
$RSYNC "$SOURCE":/etc/gshadow /etc

# Copy home so we have user home folders available
# Skip duplicity backup signatures
echo "Copying root"
$RSYNC "$SOURCE":/root / --exclude=/root/.cache

# lost+found content is generated by fsck, uninteresting
echo "Copying home"
$RSYNC "$SOURCE":/home / --exclude=/home/lost+found

echo "Copying /etc targets"
for i in "${CHERRYPICKED_ETC_TARGETS[@]}"
do
    $RSYNC "$SOURCE":"$i" /etc
done

# Most of your service specific stuff should come here
echo "Copying /srv"
$RSYNC "$SOURCE":/srv /

# Make sure stuff which went to /etc/init.d gets correctly reflected across runlevels,
# as our /srv stuff has placed its own init scripts
for i in /etc/init.d/*
do
    service=`basename "$i"`
    # Separate from upstart etc. jobs
    if grep --quiet "$INIT_SCRIPT_MARKER" "$i" ; then
        update-rc.d "$service" defaults
    fi
done

# Copy MySQL databases.
# Assume source and target root can connect to MySQL without a password.
# You need to set up /root/.my.cnf file on the source server first
# for the passwordless action.
# http://stackoverflow.com/a/9293090/315168
# --compact is listed first because it switches off --add-drop-table;
# the options given after it switch it back on.
echo "Copying MySQL databases"
ssh -C "$SOURCE" mysqldump \
    --compact \
    --skip-lock-tables \
    --add-drop-table \
    --add-drop-database \
    --all-databases \
    > /root/all-mysql.sql

# MySQL dump restore woes
# http://stackoverflow.com/a/21087946/315168
mysql -u root -e "SET FOREIGN_KEY_CHECKS = 0; source /root/all-mysql.sql ; SET FOREIGN_KEY_CHECKS = 1;"

# Remove the key we used for the duplication
rm -f "$TEMP_SSH_KEY" "$TEMP_SSH_KEY.pub"
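The script prints the source package list for manual review. A list of only those packages that are still missing on the target is easier to act on. Below is a minimal sketch, assuming both lists have been saved one package name per line. The helper name `missing_packages` and the file paths in the usage comment are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print package names that appear in the first
# list but not in the second. Both files hold one name per line.
missing_packages() {
    local source_list="$1"
    local target_list="$2"
    # -v: invert the match, -x: compare whole lines,
    # -F: fixed strings, -f: read the patterns from a file
    grep -vxFf "$target_list" "$source_list"
}

# Intended use on the target server (SOURCE as defined in the script):
#   ssh "$SOURCE" dpkg --get-selections | grep --invert-match deinstall | cut -f 1 > /tmp/source-packages.txt
#   dpkg --get-selections | grep --invert-match deinstall | cut -f 1 > /tmp/target-packages.txt
#   missing_packages /tmp/source-packages.txt /tmp/target-packages.txt
```

An empty result means every package selected on the source is also selected on the target.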
Sorry to say, when I read this, I got some shivery, so I must oppose!
There is no general recipe that can build you fault tolerance from scratch in 15 minutes with a simple script and that works for everybody, or even for a slight majority. You have to understand your application, what data you have and which services you need to run.
For example, your thing here only works if everybody who uses it runs either nginx or apache, and mysql. How can you know?
There are thousands of other variants, and then this thing will fail.
Then you only sync a minor part of /etc and say the systems will be the same – they won’t. They are the same only in very small areas, and as long as you don’t really know what’s important for your application, there’s a whole lot of room for unexpected things to happen.
This script will work for you, but it doesn’t work for me, and will only work for very few people in the same fashion.
What works?
To ensure that two systems are the same, you should learn a proper configuration management system and use it. You could also build something similar with a bunch of bash scripts, but believe me, you’re far better off learning ansible, salt, puppet or chef properly and building all your configs with one of those, instead of manually fiddling with configs in /etc and copying partial states with rsync.
Data replication and stand-by is a bit more complex, you need to know exactly what state data you have, and how to properly copy and continuously sync it.
Maybe you start with a proper backup procedure and a tested restore that reliably gets a second machine up and running in minutes.
You can use the database replication mechanisms that come with your DB product, and some kind of network filesystem, replicating block storage, or actually really rsync to sync larger file stores.
Please. Don’t tell your readers everything can be done with just a simple script in a few minutes without further thinking.
Everybody trying to do serious business has, sooner or later, to take the time to really understand the system their application is running on and all its dependencies, and figure out how they work together, where they keep their data, how multiple datastores are connected (e.g. database entries pointing to image files in the filesystem), and take some time and effort to make all these work together in a synchronized fashion – using standard tools available on Linux & Unix systems like shell scripting, rsync, and ssh, but also some more advanced tools and techniques for backup, data synchronisation and config management.
That’s effort. That’s nothing that just goes quick, it requires the will to dive deep in and do things right.
> Please. Don’t tell your readers everything can be done with just a simple script in a few minutes without further thinking.
I didn’t.
This script is more of a hack-it-together guideline, and I specifically mention some more professional tools.
I guess you missed the point of the blog post.
Thanks for finally, after weeks, having the balls to publish my comment.
First of all, in communication theory, there’s a notion that it’s the sender’s responsibility to write in a way that ensures people understand the message right…
If you can’t deal with criticism, you should think about stopping writing publicly.
As of your defense:
You don’t say that the solution you describe is highly error-prone and does not in any way replace people knowing the systems they run very well and being able to handle basic tools like rsync, knowledge of Linux/Unix configuration, dealing with backups, and if the need is there,
Copying a kvm dump, which you mention as a professional way, is also pretty error-prone, as you’ll have host/ip/name specific stuff all around the system – and if you don’t know exactly what you’re doing, that can and will lead to a lot of problems.
My point is, this guide doesn’t contain anything that anybody who knows what he’s doing didn’t know before, and it presents itself as a simple solution for those who don’t yet know a lot about systems administration. But exactly for them it’s dangerous to just copy your script and do anything with it, believing they are on the safe side.
They better invest their time in learning a config management automation system, backup/restore strategies, db replication, and general HA strategies – but that’s all nothing a novice will learn in just half a day, but it’s important to do it, properly, and take the time instead of playing around with a half-baked script.
Re-thinking and calming down, I’d remove the upper part of my last post, as the wording is inappropriate, but I keep my opinion that everything I say after “as of your defense” is true.