These instructions help you copy a production server’s ZODB database (Data.fs) to your local computer for development and testing. This allows you to test against a copy of the real data and the production server’s Plone instance setup.
See the original tip by cguardia.
Data.fs is the ZODB file storage for a transactional database. The journal history takes up quite a lot of disk space in it. Packing, i.e. removing the journal history, usually reduces the file size considerably, making the file lighter for wire transfer. Depending on the age of the database, the packed copy can be less than 10% of the original size.
These instructions apply to Ubuntu/Debian based Linux systems. Adapt them to your own system using your operating system’s best practices.
We need the ZODB Python package to work with the database. To use it, we’ll create a virtualenv Python installation in /tmp. In a virtualenv installation, installed Python packages do not pollute or break the system-wide setup. Note that you might need to use easy_install-2.4 depending on the OS. The latest stable ZODB can be picked from the PyPI listing. The Plone 3.x default is ZODB 3.7.x, which is not available as a Python egg, but you can use ZODB 3.8.x.
sudo easy_install virtualenv
cd /tmp
virtualenv packer
/tmp/packer/bin/easy_install ZODB=3.8.3
Data.fs cannot be modified in place; you must create a copy of it to work with. A copy of Data.fs can be made from a running system without fear of corrupting the database, since ZODB is an append-only database.
cp /yoursite/var/filestorage/Data.fs /tmp/Data.fs.copy
Then create the following script snippet as /tmp/pack.py using your favorite terminal editor.
import time
import ZODB.FileStorage
import ZODB.serialize

storage = ZODB.FileStorage.FileStorage('/tmp/Data.fs.copy')
storage.pack(time.time(), ZODB.serialize.referencesf)
And run it using the virtualenv’ed Python setup with ZODB installed:
/tmp/packer/bin/python /tmp/pack.py
Lots of patience here… packing may take a while, but it’s still definitely faster than your Internet connection transfer rate.
Verify that the file is successfully packed:
ls -lh Data.fs.copy
-rw-r--r-- 1 user user 30M 2009-09-01 13:24 Data.fs.copy
Woohoo, 1 GB was shrunk to 30 MB. Then copy the file to your local computer using scp and place it in your development buildout.
scp user@server:/tmp/Data.fs.copy ~/mybuildout/var/filestorage/Data.fs
You just saved about 30-90 minutes of waiting for the file transfer.
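With the copy in place you can point your local instance at the production data. A minimal sketch, assuming a standard Plone buildout that generates a bin/instance script:
cd ~/mybuildout
bin/instance fg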
I’m often bzipping the database before transfer. bzip2 Data.fs.copy produces Data.fs.copy.bz2, which should be much smaller than the original file. Even without packing, a bzipped database is much faster to transfer (and transfer-error safe).
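In shell terms that is roughly the following, using the same /tmp paths as above (adjust the paths to your setup):
bzip2 /tmp/Data.fs.copy                        # on the server; produces Data.fs.copy.bz2
scp user@server:/tmp/Data.fs.copy.bz2 /tmp/    # on your local machine
bunzip2 /tmp/Data.fs.copy.bz2                  # restores /tmp/Data.fs.copy locally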
There is also fsrecover.py, which has been around for ages and which can pack a Data.fs to a new file rather than in place (as well as check it for consistency):
parts/zope2/lib/python/ZODB/fsrecover.py -P 0 /path/to/existing/Data.fs /path/to/new/Data.fs
You might have to set your PYTHONPATH variable to point to the zope lib directory, e.g.:
export PYTHONPATH=/Development/demo3/demo3/parts/zope2/lib/python
IIRC this will work on a running live Data.fs, so you can create a packed copy in your home directory before copying down for local development.
-Matt
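Spelled out, that workflow would look roughly like this (the buildout paths are only examples):
$ export PYTHONPATH=/yoursite/parts/zope2/lib/python
$ python /yoursite/parts/zope2/lib/python/ZODB/fsrecover.py -P 0 /yoursite/var/filestorage/Data.fs ~/Data.fs.packed
and then, from your local machine:
$ scp user@server:Data.fs.packed ~/mybuildout/var/filestorage/Data.fs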
adding the following snippet to your `buildout.cfg` essentially wraps matt’s recipe into a convenience script (i hope the line breaks will survive :)):
[buildout]
…
parts += packer
[packer]
recipe = zc.recipe.egg
eggs =
entry-points = packer=ZODB.fsrecover:main
extra-paths = ${zope2:location}/lib/python
scripts = packer
initialization = sys.argv[1:1] = ['-P0', 'var/filestorage/Data.fs']
after running buildout you can now simply invoke:
$ bin/packer
which can then be downloaded as shown above…
heh, it seems anything that looks like html is filtered here; the command invocation was meant to have an extra parameter, like so:
$ bin/packer path/to/packed/Data.fs
this will save a packed version of your current `Data.fs` into the given file…
I use repozo.py to back up, so I have the full version and deltas shipped remotely each night. Then I can recover the Data.fs again using repozo and use that copy for testing (and verify my backups at the same time). In practice you need to either do a full repozo backup every few days or do a pack, which forces a full repozo backup the next time. I do the latter.
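For reference, that kind of cycle might look roughly like this, assuming your buildout generates a bin/repozo script (the repository paths are only examples):
$ bin/repozo -B -F -r /backups/filestorage -f var/filestorage/Data.fs
$ bin/repozo -B -r /backups/filestorage -f var/filestorage/Data.fs
$ bin/repozo -R -r /backups/filestorage -o /tmp/Data.fs.restored
The first command takes a full backup, the second an incremental one, and the third recovers the latest state into a file you can use for testing.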
How about this: doing the whole packing and copying using one or two SSH commands, without the need to log in to the server shell?
Andi,
Great! 🙂 The power of buildout and some knowledge 😉
-Matt
mikko,
yep, that was actually my “goal” when i put together that buildout snippet. unfortunately, though, `fsrecover` needs to be given a proper file name, i.e. writing things to stdout directly doesn’t work. at least not ootb.
otherwise you could of course simply use something like:
$ rsync -az plone@server:~/bin/packer > Data.fs
eek, that should have been:
$ ssh plone@server bin/packer > Data.fs
of course… /me too sleepy again 🙂
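Since the script wants a real file name, one possible workaround (an untested sketch, assuming the buildout lives in the remote user’s home directory as above) is to pack into a temporary file and cat it back over the same connection:
$ ssh plone@server 'bin/packer /tmp/Data.fs.packed && cat /tmp/Data.fs.packed' > Data.fs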
Once you pack your db, there is http://pypi.python.org/pypi/collective.recipe.rsync_datafs/0.1 which lets you do (on UNIX, with rsync in your PATH):
[buildout]
parts = database
[database]
recipe = collective.recipe.rsync_datafs
source = remotehost:/path/to/packed/Data.fs
target = ${buildout:directory}/var/filestorage
I find using this recipe makes it easy to grab a remote Data.fs to develop with locally.
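If the recipe runs rsync when its part is installed (I haven’t verified this), you should be able to refresh the local copy by re-running just that part, since zc.buildout can install a single part:
$ bin/buildout install database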
An even better challenge: get those ssh snippets into a fabfile and put it into a collective.hostout.filestorage recipe, and the following would work.
[host1]
recipe=collective.hostout
host=myhost.com
path=/remotepath
extends=hostfs
[hostfs]
recipe=collective.hostout.filestorage
Once added to your local buildout you can then run
$ bin/hostout pullfs hostfs
$ bin/hostout pushfs hostfs
The hostout plugins are new but there should be enough there to do this
http://pypi.python.org/pypi/collective.hostout
/tmp/packer/bin/easy_install ZODB=3.8.3
should be
/tmp/packer/bin/easy_install ZODB==3.8.3
/tmp/packer/bin/easy_install ZODB=3.8.3
should be
/tmp/packer/bin/easy_install ZODB3==3.8.3