sync2cd: an incremental archiving tool to CD/DVD

Current version: 1.3 (2007.04.09)

Description

sync2cd is an incremental archiving tool. It allows backing up complete filesystem hierarchies to multiple backup media (e.g. CD-R). Files are archived incrementally, i.e. only new or changed files are stored during an archive operation.

All entity types are supported: directories, files, symlinks, named pipes, sockets, block and character devices.

What's new

This version includes the following improvements:

For information about previous versions, see the ChangeLog file included in the distribution.

Download

sync2cd is released under the GNU General Public License version 2. The current version can be downloaded here.

sync2cd requires at least Python 2.4, and provides an installer based on distutils. This means that installation to the default location (/usr) is as simple as:

   python setup.py install

To install sync2cd to a specific location, e.g. /usr/local, enter:

   python setup.py install --prefix=/usr/local

sync2cd has been tested with Python 2.4.3 on Linux. It might work with other version and platform combinations, but I'm pretty sure a POSIX platform is needed, so if you are trying to run sync2cd on Windows, you will probably need to install Cygwin. If you were able to run sync2cd with another configuration, please drop me a line (and, if necessary, send me a patch ;-).

Usage

Basic usage information is output with sync2cd --help:

    $ sync2cd --help
    sync2cd 1.3   Incremental archiving tool to CD/DVD
    Copyright (C) 2007 Remy Blank

    Usage: sync2cd [commands] [options] config_file

    Commands: -c, --create               Create a new archive descriptor
              -g, --graft-list           Output a graft list for an archive
              -h, --help                 Show this text
              -p, --print                Print archive information
              -r, --restore              Restore from archives
              -s, --status               Print current synchronization status
              -y, --copy                 Copy files of an archive to destination

    Options:  -a N, --archive N          Operate on archive number N
              -b GLOB, --glob GLOB       Add glob GLOB to pattern list
              -d DIR, --destination DIR  Copy or restore into directory DIR
              -m N, --medium-size N      Set archive medium size to N
              -n CMD, --mounter CMD      Mount media using CMD for restore
              --sort HOW                 File sorting key (time or alpha)
              -v, --verbose              Be more verbose
              -x EXP, --regexp EXP       Add regular expression EXP to pattern list

Commands

Commands define what will be done and what will be output to stdout. Several commands can be specified at the same time, and will be executed in a sensible order (e.g. --create before --graft-list).

-c, --create
Create a new archive, containing the files from the inputs specified in the configuration file, up to the capacity of one archive medium. This will create a new archive descriptor, with the same base name as the configuration file and a numbered extension, starting at 1 if no descriptor exists yet. If archive descriptor compression is enabled, a .gz extension is added.

-g, --graft-list
Output a list of graft-points to stdout, in the format expected by mkisofs with the -graft-points option.

-h, --help
Show some basic usage information and exit.

-p, --print
Print information about an archive to stdout. If --verbose is also specified, the list of files contained in the archive is also output.

-r, --restore
Restore the items matching the patterns specified with --glob and --regexp, or all items if no pattern was specified.

-s, --status
Print the status for a backup set, i.e. the total size of all files that need to be archived. If --verbose is also specified, the list of files that need to be archived is also output.

-y, --copy
Copy the files of an archive from their source location to the directory specified by --destination. The folder structure of the source is kept in the destination folder. File attributes (ownership, permissions, times) are not preserved. This command allows e.g. backing up onto a hard drive or a DVD-RAM, or splitting a folder structure into smaller chunks.
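
The behaviour described for --copy can be illustrated with a short Python sketch. This is not sync2cd's own code: the function name copy_files and its arguments are hypothetical, and it is written for a current Python 3. It recreates the folder structure below the destination and uses shutil.copyfile, which copies file contents only, so ownership, permissions and times are not carried over.

```python
import os
import shutil

def copy_files(files, source_root, destination):
    # files: paths relative to source_root (hypothetical input list).
    for rel in files:
        target = os.path.join(destination, rel)
        parent = os.path.dirname(target)
        if parent and not os.path.isdir(parent):
            os.makedirs(parent)  # recreate the source folder structure
        # copyfile copies contents only: no ownership, permissions or times.
        shutil.copyfile(os.path.join(source_root, rel), target)
```

Using shutil.copy2 instead would also try to copy times and permission bits, which is the opposite of what the text above describes.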

Options

Options allow passing arguments to the selected commands. If the same option is specified on the command line and in the configuration file, the command line takes precedence.

-a N, --archive N
Specify that the selected command(s) should be executed on archive N. Note that this option has no effect with --create: the number of the newly created archive overrides it.

-b GLOB, --glob GLOB
Add the shell pattern GLOB to the list of patterns for items to restore. The pattern format is specified in the Exclude() configuration file function description. If the pattern matches a directory, all items below it will be matched as well.

-d DIR, --destination DIR
Specify the destination directory for a copy or restore operation. All files and directories will be copied or restored below this directory. It must exist before the operation.

-m N, --medium-size N
Set the maximum size of an archive to N. Corresponds to the function MediumSize() in the configuration file.

-n CMD, --mounter CMD
Specify the command to be executed to mount a backup medium for restoring. It will be called with the path to the archive descriptor as an additional command-line parameter, and must print the mount point of the backup medium to stdout. Additionally, it will be called after the restore operation with a "dummy" descriptor with number 0 to allow ejecting the last medium. For an example, see the sync2cd_mounter.sh script provided with sync2cd.

--sort HOW
Specify how files should be selected for inclusion when creating an archive, in the case where not all files would fit. Corresponds to the function Sort() in the configuration file.

-v, --verbose
Output more information to stdout for various commands.

-x EXP, --regexp EXP
Add the regular expression EXP to the list of patterns for items to restore. If the regular expression matches a directory, all items below it will be matched as well.
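
The rule that a pattern matching a directory also selects everything below it can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not sync2cd's implementation: it assumes that an expression must match a whole relative path, that paths use "/" as separator, and the names ancestors and select_items are hypothetical.

```python
import re

def ancestors(path):
    # "a/b/c" -> ["a", "a/b", "a/b/c"] (the path itself comes last).
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]

def select_items(paths, expressions):
    compiled = [re.compile(e) for e in expressions]
    if not compiled:
        return list(paths)  # no pattern given: every item is selected
    # An item is selected if it, or any directory above it, matches.
    return [p for p in paths
            if any(c.fullmatch(a) for a in ancestors(p) for c in compiled)]

items = ["Music", "Music/Rock", "Music/Rock/a.mp3", "Docs/b.txt"]
print(select_items(items, ["Music/Rock"]))
```

Here the expression names only the directory Music/Rock, yet the file below it is selected as well.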

Examples

Here are a few basic examples:
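
As one illustration, the protocol described for --mounter could be handled by a small Python script along the following lines. This is a hedged sketch, not the sync2cd_mounter.sh script shipped with sync2cd. It assumes that descriptor names look like backup.3 or backup.3.gz (a numbered extension, optionally followed by a compression extension), that printing nothing is acceptable for the final call with number 0, and that /mnt/cdrom is the mount point. The names archive_number, handle and MOUNT_POINT are hypothetical.

```python
import os
import re
import sys

MOUNT_POINT = "/mnt/cdrom"  # hypothetical mount point

def archive_number(descriptor_path):
    # Assumed naming: "<base>.<N>" or "<base>.<N>.<compression extension>".
    name = os.path.basename(descriptor_path)
    match = re.search(r"\.(\d+)(?:\.[A-Za-z][A-Za-z0-9]*)?$", name)
    if match is None:
        raise ValueError("no archive number in %r" % name)
    return int(match.group(1))

def handle(descriptor_path):
    number = archive_number(descriptor_path)
    if number == 0:
        # Dummy descriptor: the restore is finished, eject the medium here.
        return None
    # A real mounter would ask for medium `number` and mount it here.
    return MOUNT_POINT

if __name__ == "__main__" and len(sys.argv) > 1:
    mount_point = handle(sys.argv[1])
    if mount_point is not None:
        print(mount_point)  # sync2cd reads the mount point from stdout
```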

Configuration file format

The configuration file is actually Python code calling functions defined in sync2cd.py and passing configuration information. The functions available are described below.
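
To make the idea of "configuration as Python code" concrete, here is a minimal sketch of how such a file could be evaluated. It is not how sync2cd.py is actually organized: load_config and the keys of the settings dictionary are hypothetical, only three of the functions are modelled, and the code targets a current Python 3.

```python
def load_config(source):
    # Defaults as documented for BaseDir() and Sort().
    settings = {"base_dir": ".", "inputs": [], "sort": "time"}

    def BaseDir(path):
        settings["base_dir"] = path

    def Input(path):
        settings["inputs"].append(path)

    def Sort(sort):
        settings["sort"] = sort

    # Run the configuration text with these functions as its globals.
    exec(source, {"BaseDir": BaseDir, "Input": Input, "Sort": Sort})
    return settings

example = '''
BaseDir("/home")
Input("Music")
Input("Documents")
Sort("alpha")
'''
print(load_config(example))
```

Because the file is ordinary Python, it can also use variables, loops or comments, at the price of executing whatever code it contains.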

BaseDir(path)

Set the current working directory to path before starting. All paths specified with Input() are relative to this directory. This option corresponds to the -C or --directory option of tar.

Default: . (current directory)
Example: BaseDir("/home")

Compress(arg)

Specify if archive descriptors should be compressed, and which compressor to use. If arg=False, descriptors will not be compressed. If arg=True, a default compressor will be selected (currently bz2). arg can also be the name of the compressor to use (gz, bz2).

Default: bz2
Example: Compress("gz")

Discarded(archiveNo, ...)

Mark one or more archives as discarded. Files that were contained in the given archives will be included in the next archive creation operation, as if the archives had never existed.

Default: none
Example: Discarded(2, 4, 5)
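
The effect of discarding archives can be pictured with a small Python model. It is a sketch under assumptions, not the real descriptor format: archives are modelled as dictionaries mapping a path to a content hash, and the name pending_files is hypothetical.

```python
def pending_files(current, archives, discarded=()):
    # current: {path: content hash} for the files on disk now.
    # archives: {archive number: {path: content hash}} (hypothetical layout).
    stored = set()
    for number, contents in archives.items():
        if number in discarded:
            continue  # treat the archive as if it had never existed
        stored.update(contents.items())
    # A file is pending if this exact content is in no remaining archive.
    return sorted(p for p, h in current.items() if (p, h) not in stored)

on_disk = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3"}
archives = {1: {"a.txt": "h1"}, 2: {"b.txt": "h2"}}
print(pending_files(on_disk, archives))
print(pending_files(on_disk, archives, discarded=(2,)))
```

With archive 2 discarded, b.txt becomes pending again and would be picked up by the next --create.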

Exclude(pattern)

Exclude files matching the shell pattern pattern from the archive. Several exclude patterns can be specified. The pattern matching is done against the path relative to BaseDir(). If a directory matches an exclude pattern, it is not recursed into.

As usual with shell patterns, a * wildcard matches zero or more characters except path separators (e.g. "/" on *nix). A new wildcard, **, matches zero or more characters, including path separators.

Example: ExcludeGlob("Music/Country/**.mp3")

This excludes all mp3 files in Music/Country and in all subdirectories.
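
The difference between * and ** can be made explicit by translating a pattern into a regular expression. The following Python sketch does this for the two wildcards only (other shell pattern syntax such as ? or [...] is taken literally here). It is an illustration, not sync2cd's own matching code, and the name glob_to_regex is hypothetical.

```python
import re

def glob_to_regex(pattern):
    # Translate "*" and "**" only; every other character is literal.
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")      # any characters, including "/"
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")   # any characters except "/"
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")

rx = glob_to_regex("Music/Country/**.mp3")
print(bool(rx.match("Music/Country/Old/b.mp3")))
print(bool(glob_to_regex("Music/Country/*.mp3").match("Music/Country/Old/b.mp3")))
```

The first pattern reaches into subdirectories, the second one does not, because * stops at the path separator.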

ExcludeRegexp(pattern)

Exclude files matching the regular expression pattern from the archive. Several exclude patterns can be specified. The pattern matching is done against the path relative to BaseDir(). If a directory matches an exclude pattern, it is not recursed into.

For more information about regular expression syntax in Python, see the documentation of the re module in the Python Library Reference.

Example: ExcludeRegexp("Music/Country/.*\\.mp3")

This excludes all mp3 files in Music/Country and in all subdirectories (note escaping of "\").

HashFunction(hash)

Specify the hash function to be used to check files for content modification. Currently supported: md5 (128 bits), sha1 (160 bits).

Default: md5
Example: HashFunction("sha1")
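
For readers unfamiliar with content hashes, here is a minimal sketch of computing a file digest with either function. It uses the hashlib module of a current Python 3, which is not necessarily what sync2cd itself uses, and the name file_digest is hypothetical. Reading in chunks keeps memory use constant for large files.

```python
import hashlib

def file_digest(path, name="md5", chunk_size=64 * 1024):
    digest = hashlib.new(name)  # "md5" or "sha1"
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with the same digest are treated as having the same content; a changed digest signals a modification. The hexadecimal result has 32 characters for md5 (128 bits) and 40 for sha1 (160 bits).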

Input(path)

Add a file or directory to be archived. Several inputs can be specified. The use of a directory name always implies that the subdirectories below should be included in the archive. path must be a relative path specification, and is interpreted relative to BaseDir().

Example: Input("Music")

MediumSize(size)

Set the maximum size of an archive to size. This is typically used to span a backup over multiple media.

size is an integer giving the size in bytes, or a string containing a floating-point value optionally followed by one of the suffixes k, M, G, T, P or E.

Default: 0 (no limit)
Example: MediumSize("4.2G")
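
The size format can be illustrated with a small parser. This is a sketch, not sync2cd's code: the name parse_size is hypothetical, and the multiplier is an assumption, since the text above does not say whether k stands for 1000 or 1024. The sketch defaults to 1024 and lets the caller choose.

```python
def parse_size(size, base=1024):
    # base is an assumption: k may mean 1000 or 1024 in the real tool.
    if isinstance(size, int):
        return size  # already a size in bytes
    suffixes = "kMGTPE"
    text = size.strip()
    if text and text[-1] in suffixes:
        exponent = suffixes.index(text[-1]) + 1
        return int(float(text[:-1]) * base ** exponent)
    return int(float(text))

print(parse_size("1.5k"))
print(parse_size("2M", base=1000))
```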

Sort(sort)

Specify how files should be selected for inclusion when creating an archive. sort can be either time (the default) or alpha.

When creating a new archive, the list of files to be included is sorted according to this criterion, either by modification time if time was specified, or by path name if alpha was specified. Then, files are selected for inclusion starting at the top of the list, until they fill one medium. The remaining files are left for a subsequent creation run.

In other words, time stores the oldest files first, and alpha keeps files more or less together (by directory).

Default: time
Example: Sort("alpha")
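
The selection procedure described above can be sketched as follows. The tuple layout, the name select_for_medium and the decision to stop at the first file that no longer fits are assumptions made for this illustration; the real tool may, for instance, keep looking for smaller files to fill the remaining space.

```python
def select_for_medium(files, medium_size, sort="time"):
    # files: list of (path, size_in_bytes, mtime) tuples (hypothetical layout).
    if sort == "time":
        ordered = sorted(files, key=lambda f: f[2])  # oldest first
    elif sort == "alpha":
        ordered = sorted(files, key=lambda f: f[0])  # by path name
    else:
        raise ValueError("sort must be 'time' or 'alpha'")
    selected, used = [], 0
    for path, size, mtime in ordered:
        # Assumption: stop at the first file that no longer fits.
        # A medium size of 0 means "no limit".
        if medium_size and used + size > medium_size:
            break
        selected.append(path)
        used += size
    return selected

files = [("b/x", 40, 100), ("a/y", 50, 300), ("a/z", 30, 200)]
print(select_for_medium(files, 100, "time"))
print(select_for_medium(files, 100, "alpha"))
```

With a limit of 100 bytes, time picks the two oldest files, while alpha picks the two files from directory a.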

Here are a few examples of configuration files.

Frequently asked questions

To do

The following features will be added to sync2cd as time permits.

Feedback

If you are using or trying to use sync2cd, I would be happy to hear from you! I'm especially interested in the following:

In any case, just drop me an e-mail.

Links


Copyright (C) 2007 Remy Blank