[Box Backup] Hard Links and rdiff

John Goerzen boxbackup@fluffy.co.uk
Sat, 12 Apr 2008 21:28:16 -0500


Hi folks,

I've been looking for a new backup program lately, and from the
description on the web site and wiki, Box Backup looked like the
perfect fit.

Until I started doing some tests.  I noticed that Box Backup does not
preserve hard links.  When I do a backup and then a restore, it
unpacks a separate copy of each file that was hard linked together.
This is a showstopper for me.  In addition to making Box Backup
unsuitable for backing up the entire system (due to hard links in
/bin, /sbin, /usr/bin, etc.), it also steps on the toes of people who
use distributed version control systems like Git, Mercurial, or
Darcs, which can use hard links to share data between local copies
of a repository.  Is there a configuration option somewhere to
preserve hard links?  Are there any other POSIX attributes that Box
Backup may unexpectedly not restore?  Does it preserve mtime, atime,
ctime, symlinks, block and character devices, FIFOs, extended
attributes (EAs), and ACLs?
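
For anyone who wants to reproduce this, here is a minimal sketch of the
kind of check that shows the problem.  It is not part of Box Backup;
the function name hardlink_groups is made up for illustration.  It
walks a tree and groups paths that share a device and inode number, so
the output for the original tree can be compared with the output for
the restored tree.

```python
import os
from collections import defaultdict

def hardlink_groups(root):
    """Return sorted groups of relative paths under root that share an inode.

    Two names are hard links to the same file when they have the same
    (st_dev, st_ino) pair.  Only groups with more than one name are returned.
    """
    by_inode = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            by_inode[(st.st_dev, st.st_ino)].append(os.path.relpath(path, root))
    return sorted(sorted(paths) for paths in by_inode.values() if len(paths) > 1)
```

If hard links survive the round trip, hardlink_groups(original) and
hardlink_groups(restored) should be equal.  If the restore unpacks
separate copies, the second call returns an empty list.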

Also, I tried in vain to find some details on its rdiff algorithm.
Does it play rdiffs "backwards" like rdiff-backup, or forwards like
duplicity?  In the "backwards" scheme, the "full" data is always the
most current, and you follow a chain of deltas backwards to older
versions.  This makes restores of recent versions easy, and also
allows removal of the oldest data without breaking chains.  duplicity
goes the other way -- taking full backups, and storing binary deltas
going forward.  This removes the need to keep an accessible copy of
the full recent data, at the expense of making restores of recent data
slower and imposing some limits on the removal of older data.
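
To make the two schemes concrete, here is a toy sketch.  It says
nothing about how Box Backup, rdiff-backup, or duplicity are actually
implemented.  Each "version" is a plain dict, and make_delta is a
stand-in for a real binary delta; all of the function names are
invented for this example.

```python
def make_delta(src, dst):
    """Toy delta that turns dict src into dict dst (stand-in for a binary delta)."""
    changed = {k: v for k, v in dst.items() if k not in src or src[k] != v}
    removed = [k for k in src if k not in dst]
    return changed, removed

def apply_delta(src, delta):
    """Apply a toy delta to src and return the resulting dict."""
    changed, removed = delta
    out = {k: v for k, v in src.items() if k not in removed}
    out.update(changed)
    return out

def forward_store(versions):
    """Full copy of the OLDEST version plus deltas going forward in time."""
    return versions[0], [make_delta(a, b) for a, b in zip(versions, versions[1:])]

def reverse_store(versions):
    """Full copy of the NEWEST version plus deltas going backward in time."""
    newest_first = versions[::-1]
    return newest_first[0], [make_delta(a, b)
                             for a, b in zip(newest_first, newest_first[1:])]

def restore(full, deltas, steps):
    """Start from the full copy and apply the first `steps` deltas."""
    state = full
    for delta in deltas[:steps]:
        state = apply_delta(state, delta)
    return state
```

With three versions, the reverse store restores the newest one with
restore(full, deltas, 0), and dropping the last delta in its list only
loses the oldest version.  The forward store needs restore(full,
deltas, 2) to reach the newest version, and its full copy is the
oldest data, so that cannot simply be thrown away.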

Is the rdiff algorithm applied on a per-file basis or to the backup
file as a whole?  If it's on a per-file basis, does it identify a
file by name or by device/inode number?
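
The difference matters most for renames.  The sketch below is only an
illustration of the two ways of matching files between runs, with
hypothetical function names; it does not describe what Box Backup
does.  Matching by name treats a renamed file as a deletion plus a
brand new file, while matching by device/inode can pair the old name
with the new one and diff against the previous contents.

```python
import os

def snapshot(root):
    """Map relative path -> (st_dev, st_ino) for every file under root."""
    snap = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            snap[os.path.relpath(path, root)] = (st.st_dev, st.st_ino)
    return snap

def match_by_name(old, new):
    """Pair files that kept the same path between two snapshots."""
    return {path: path for path in old if path in new}

def match_by_inode(old, new):
    """Pair files that kept the same (device, inode) between two snapshots.

    Simplification: if several names share an inode (hard links), only one
    of them is kept per inode.  Inode numbers can also be reused after a
    file is deleted, so a real tool would need extra checks.
    """
    new_by_inode = {ident: path for path, ident in new.items()}
    return {path: new_by_inode[ident]
            for path, ident in old.items() if ident in new_by_inode}
```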

Thanks,

-- John