sync2cd: an incremental archiving tool to CD/DVD


Current version: 0.7

Description

sync2cd is an incremental archiving tool. It allows backing up complete filesystem hierarchies to multiple backup media (e.g. CD-R). Files are archived incrementally, i.e. only new or changed files are stored during an archive operation.

Currently, only files, directories and symbolic links are stored.

Download

The current version can be downloaded here.

sync2cd is implemented in Python, so it doesn't need to be installed. Just copy it into a directory listed in your PATH, make sure it is executable, and you should be all set. If running sync2cd.py gives you the error "/usr/bin/env: python2: No such file or directory", replace python2 with python in the first line of the program.

sync2cd has been tested with Python 2.2.2 on Linux. It might work with other versions and/or platforms, YMMV. If you were able to make it work with another configuration, please drop me a line (and ideally, send me a patch ;-)

Usage

Basic usage information is output with sync2cd.py --help:

    [joe@hobbes sync2cd]$ ./sync2cd.py --help
    sync2cd.py 0.7   Synchronization to CD-R
    Copyright 2003 Remy Blank

    Usage: sync2cd.py [commands] [options] ConfigFile

    Commands: -c, --create         Create a new archive descriptor
              -g, --graft-list     Output a graft list for an archive
              -h, --help           Show this text
              -p, --print          Print archive information
              -s, --status         Print current synchronization status

    Options:  -a N, --archive N    Operate on archive number N
              -v, --verbose        Be more verbose

Commands define what will be done and what will be output to stdout. Several commands can be specified at the same time, and will be executed in a sensible order (e.g. --create before --graft-list). Options allow passing arguments to the selected commands.

-a N, --archive N
Specify that the selected command(s) should be executed on archive N. Note that this option will have no effect with --create and will be overridden by the newly created archive.

-c, --create
Create a new archive, containing the files with the oldest modification date in the inputs specified in the configuration file, up to the capacity of one archive medium. This will create a new archive descriptor, with the same base name as the configuration file and a numbered extension, starting at 1 if no descriptor exists yet.

-g, --graft-list
Output a list of graft-points to stdout, in the format expected by mkisofs with the -graft-points option.

-h, --help
Show some basic usage information and exit.

-p, --print
Print information about an archive to stdout. If --verbose is also specified, the list of files contained in the archive is also output.

-s, --status
Print the status for a backup set, i.e. the total size of all files that need to be archived. If --verbose is also specified, the list of files that need to be archived is also output.

-v, --verbose
Output more information to stdout for various commands.
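
The selection rule described for --create (oldest modification date first, up to the capacity of one medium) can be sketched as follows. This is a minimal illustration, not the code used by sync2cd: the function name select_oldest, the (path, mtime, size) tuples and the decision to stop at the first file that does not fit are all assumptions made for the example.

```python
def select_oldest(files, capacity):
    """Pick files in order of increasing modification time until full.

    files    -- iterable of (path, mtime, size) tuples (hypothetical layout)
    capacity -- maximum total size in bytes; 0 means no limit
    """
    selected = []
    used = 0
    for path, mtime, size in sorted(files, key=lambda f: f[1]):
        if capacity and used + size > capacity:
            break  # assumption: stop at the first file that does not fit
        selected.append(path)
        used += size
    return selected, used


# Made-up sample data: (relative path, modification time, size in bytes)
pending = [
    ("Music/b.mp3", 1050000000, 400),
    ("Music/a.mp3", 1040000000, 300),
    ("Docs/c.txt", 1060000000, 500),
]
print(select_oldest(pending, 800))
```

With a capacity of 800 bytes, the two oldest files are selected and the newest one is left for a later archive.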

Here are a few basic examples:

Configuration file format

The configuration file is actually Python code calling functions defined in sync2cd.py and passing configuration information. The functions available are described below.
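
To illustrate what "Python code calling functions" means in practice, here is a minimal sketch of how such a configuration could be evaluated. It is not taken from sync2cd.py: the settings dictionary and the stub implementations of BaseDir() and Input() are assumptions made for the example, and only the function names come from this page.

```python
# Text of a small configuration file, embedded here to keep the sketch
# self-contained.
config_text = '''
BaseDir("/home")
Input("Music")
Input("Documents")
'''

# Hypothetical container for the collected settings.
settings = {"base_dir": ".", "inputs": []}

def BaseDir(path):
    settings["base_dir"] = path

def Input(path):
    settings["inputs"].append(path)

# Run the configuration with only the configuration functions in scope.
namespace = {"BaseDir": BaseDir, "Input": Input}
exec(config_text, namespace)

print(settings)
```

One consequence of this design is that a configuration file can use ordinary Python constructs such as variables or loops, and that it should be treated as trusted code.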

ArchiveSize(Value)

Set the maximum size of an archive to Value. This is typically used to span a backup over multiple media.

Value is an integer giving the size in bytes, or a string containing a number optionally followed by the suffix k, M, G, T, P, E.

Default: 0 (no limit)
Example: ArchiveSize("690M")
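
The following sketch shows one way a size value of this form could be turned into a byte count. It assumes binary multiples (1k = 1024 bytes) and whole numbers only; this page does not say which base sync2cd uses, and parse_size is a hypothetical helper, not a function of sync2cd.

```python
import re

# Assumption: each suffix is a power of 1024. A decimal interpretation
# (powers of 1000) would be equally consistent with the description.
SUFFIXES = {"": 0, "k": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}

def parse_size(value):
    """Return a size in bytes from an integer or a string like "690M"."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"\s*(\d+)\s*([kMGTPE]?)\s*", value)
    if match is None:
        raise ValueError("invalid size: %r" % (value,))
    number, suffix = match.groups()
    return int(number) * 1024 ** SUFFIXES[suffix]

print(parse_size("690M"))
```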

BaseDir(Dir)

Set the current working directory to Dir before starting. All paths specified with Input() are relative to this directory. This option corresponds to the -C or --directory option of tar.

Default: . (current directory)
Example: BaseDir("/home")

ExcludeGlob(Pattern)

Exclude files and symlinks matching the shell pattern Pattern from the archive. Several exclude patterns can be specified. The pattern matching is done against the path relative to BaseDir().

It is not currently possible to exclude directories from the archive.

Example: ExcludeGlob("Music/Country/*.mp3")

This excludes all mp3 files in Music/Country, but files in subdirectories will still be included.
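
The behaviour described above, where * does not reach into subdirectories, can be reproduced by comparing the path and the pattern one component at a time. The sketch below is only an illustration of these semantics, not the matching code of sync2cd; glob_match is a hypothetical name. Note that Python's fnmatch module on its own would let * match a "/" as well, which is why the path is split first.

```python
from fnmatch import fnmatchcase

def glob_match(path, pattern):
    """Match path against a shell pattern, one path component at a time."""
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    # A pattern only matches paths with the same number of components,
    # so "*" can never span a directory separator.
    if len(path_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(p, q) for p, q in zip(path_parts, pattern_parts))

print(glob_match("Music/Country/song.mp3", "Music/Country/*.mp3"))
print(glob_match("Music/Country/Live/song.mp3", "Music/Country/*.mp3"))
```

The first call reports a match, the second does not, because the file sits one directory level deeper than the pattern.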

ExcludeRegEx(Pattern)

Exclude files and symlinks matching the regular expression Pattern from the archive. Several exclude patterns can be specified. The pattern matching is done against the path relative to BaseDir(). For more information about regular expression syntax in Python, see this page.

It is not currently possible to exclude directories from the archive.

Example: ExcludeRegEx("Music/Country/([^/]+/)*[^/]*\\.mp3")

This excludes all mp3 files in Music/Country and in all subdirectories.
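
An exclude pattern can be tried out with Python's re module before it goes into a configuration file. The sketch below assumes that the pattern is applied from the start of the relative path (re.match); how sync2cd applies it internally is not stated on this page. The pattern includes [^/]* so that the file name in front of the .mp3 extension is matched too, and the sample paths are made up.

```python
import re

# Raw string: r"\." here is the same pattern as "\\." in a normal string.
pattern = re.compile(r"Music/Country/([^/]+/)*[^/]*\.mp3")

paths = [
    "Music/Country/song.mp3",
    "Music/Country/Live/1999/song.mp3",
    "Music/Rock/song.mp3",
    "Music/Country/cover.jpg",
]

# Keep only the paths the pattern would exclude.
excluded = [p for p in paths if pattern.match(p)]
print(excluded)
```

Only the first two paths are reported: the mp3 file directly in Music/Country and the one two directory levels below it.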

HashFunction(Hash)

Specify the hash function to be used to check files for content modification. Currently supported: md5 (128 bits), sha1 (160 bits).

Default: md5
Example: HashFunction("sha1")
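
For readers who want to see what such a content hash looks like, here is a small sketch that computes the digest of a file with either function. It uses the hashlib module of current Python versions, which is an assumption for the sake of the example (hashlib did not exist yet in Python 2.2), and hash_file is a hypothetical helper, not part of sync2cd.

```python
import hashlib

def hash_file(path, hash_name="md5", chunk_size=65536):
    """Return the hex digest of a file's content, read in chunks."""
    digest = hashlib.new(hash_name)
    with open(path, "rb") as stream:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

An md5 digest is 32 hexadecimal characters long (128 bits) and a sha1 digest is 40 characters long (160 bits), so two files with different digests are known to differ in content.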

Input(Path)

Add a file or directory to be archived. Several inputs can be specified. The use of a directory name always implies that the subdirectories below should be included in the archive. Path must be a relative path specification, and is interpreted relative to BaseDir().

Example: Input("Music")

Here are a few examples of configuration files.
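
As an illustration, a configuration for backing up a home directory to CD-R might look like the sketch below. It uses only the functions described above; the directory names, the patterns and the size are made up for the example.

```python
# Illustrative sync2cd configuration file (all paths are hypothetical)

# Paths given to Input() are relative to this directory.
BaseDir("/home/joe")

# Directories are archived together with their subdirectories.
Input("Music")
Input("Documents")

# Skip temporary files directly in Documents, and all mp3 files
# in Music/Country and below.
ExcludeGlob("Documents/*.tmp")
ExcludeRegEx("Music/Country/([^/]+/)*[^/]*\\.mp3")

# Detect content changes with SHA-1 and limit each archive to 690M.
HashFunction("sha1")
ArchiveSize("690M")
```

Such a file is passed to sync2cd.py as the ConfigFile argument, for example together with --create to start a new archive.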

Frequently asked questions

(Not that so many people actually asked...)

To do

The following features will be added to sync2cd as time permits.

Feedback

If you are using or trying to use sync2cd, I would be happy to hear from you! I'm especially interested in the following:

In any case, just drop me an e-mail.

Links



Copyright 2004 Remy Blank