[Box Backup] System Backups?

Ben Summers boxbackup@fluffy.co.uk
Mon, 6 Dec 2004 11:03:47 +0000

On 1 Dec 2004, at 18:12, Joe Krahn wrote:

> Ben Summers wrote:
> ...
>> You can back up your root if you exclude the mounted directories, and 
>> then back those up as separate backup locations. The problem is that 
>> a backup location shouldn't contain more than one filesystem, because 
>> that stops the tracking of renamed files and directories from working.
>> I really must write the code which detects when you've got it wrong 
>> and complains, so I can avoid repeating this. :-)
> It's probably a good idea (and maybe not too hard) to follow the 
> behavior of the tar and rsync --one-file-system flag. A really bad 
> mistake here is accidentally following an NFS path.

Unfortunately, implementing this requires different code on almost every 
platform, which is annoying. It's not going to be easy or clean to do.
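
For illustration only, here is a minimal Python sketch of the kind of check 
that the tar and rsync --one-file-system flag performs on POSIX-like 
systems: compare each directory's device number (st_dev) with that of the 
starting directory, and do not descend where they differ. It is not Box 
Backup code, and it does not address the per-platform differences mentioned 
above. The names walk_one_filesystem and device_of are hypothetical.

```python
import os


def _lstat_device(path):
    """Return the device number of path without following symlinks."""
    return os.lstat(path).st_dev


def walk_one_filesystem(root, device_of=_lstat_device):
    """Yield file paths under root, never descending into a directory
    whose device number differs from root's (i.e. a mount point).

    device_of is a hypothetical hook so the device lookup can be
    replaced; by default it uses os.lstat().st_dev.
    """
    root_device = device_of(root)
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        kept = []
        for name in dirnames:
            try:
                if device_of(os.path.join(dirpath, name)) == root_device:
                    kept.append(name)
            except OSError:
                # Unreadable entry: skip it rather than abort the walk.
                pass
        # With topdown=True, editing dirnames in place prunes the walk.
        dirnames[:] = kept
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Calling walk_one_filesystem("/") would then skip anything mounted below 
the root, including NFS mounts, provided the platform reports a distinct 
st_dev for them.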

> ...
>>>  (An advanced version of this could take advantage of redundancy 
>>> across systems, where the level zero backup is also a diff against 
>>> some reference system).
>> This sounds suspiciously like having a standard image which can be 
>> recreated automatically, and then only the user data is backed up. 
>> This is my preferred way of backing up systems, although it's less 
>> applicable to people with only one machine.
> Well, sort of, but it allows for system variations from the 
> 'reference' system. For example, not all systems need an SQL server. 
> BUT, I'm guessing that this doesn't fit the boxbackup design, because 
> redundancy would best be handled at the server level, but the server 
> only gets encrypted data.

It could be done by the server publishing the length and checksum of 
each file in a list of known files, and the backup client referencing 
an entry when it finds a match.
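
A minimal sketch of that matching step, in Python, under stated 
assumptions: the server's list is modelled as a set of (length, checksum) 
pairs, SHA-256 stands in for the unspecified checksum, and the names 
fingerprint and plan_upload are hypothetical. Nothing here reflects the 
actual Box Backup protocol.

```python
import hashlib
import os


def fingerprint(path, chunk_size=65536):
    """Return (length, hex SHA-256 digest) of a file, read in chunks."""
    digest = hashlib.sha256()
    length = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            length += len(chunk)
    return length, digest.hexdigest()


def plan_upload(paths, known_files):
    """Split paths into (references, uploads).

    known_files is the assumed server-published set of
    (length, checksum) pairs. A file that matches an entry is
    recorded as a reference; everything else is uploaded.
    """
    known_lengths = {length for length, _ in known_files}
    references, uploads = [], []
    for path in paths:
        # Cheap test first: only hash files whose length could match.
        if os.path.getsize(path) in known_lengths:
            fp = fingerprint(path)
            if fp in known_files:
                references.append((path, fp))
                continue
        uploads.append(path)
    return references, uploads
```

One consequence of this design is that a reference tells the server 
which known file the client holds, so it would only suit files whose 
contents are not secret, such as a stock OS install.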

> On the other hand, one could do things like back up the list of 
> installed RPMs, and not back up files that have not changed from the 
> installed version.

Yes, indeed. But this is something which you'd do when integrating Box 
Backup into a bigger whole.

>>> 2) A utility for doing a full or partial restore from outside the 
>>> native OS (i.e. a mini-Linux recovery CD that can do full backups 
>>> and restores).
>> I don't think there would be any difficulty in creating such a thing. 
>> Box Backup is platform independent (as far as the build scripts 
>> allow) so should just be considered an element such a recovery CD 
>> could use.
>> My aim for the project is to provide the best backup _engine_, and 
>> then let others build it into the best _system_. I don't want to 
>> tackle more than I have time for.
> OK. If I invest much time on this, I'll try to focus on client-side 
> stuff.

Thanks -- any work is appreciated.

>>> 3) An idea I just thought of: user-land RAID across separate 
>>> servers. I think of what would happen if a RAID server caught on 
>>> fire and burned multiple disks. That's why I like the idea of using 
>>> RAID for backups, rather than trying to set up a live RAID to run 
>>> on.
>> You can do this now. Simply have one of the three RAID file systems 
>> on each server, and NFS mount the other two. Configure the bbstored 
>> daemons identically, and use round-robin DNS. You'll need fast 
>> network links between the servers though.
>> The other alternative is to use rsync on a regular basis to copy the 
>> encrypted store to a backup server.
> I guess I was thinking more of groups of home users being able to 
> share encrypted RAID storage across each other's disks, and also be 
> able to take advantage of RAID not costing 200% disk space as does a 
> full rsync copy.

Oh, I see. That could be possible, but there would be efficiency issues 
if it were just put into the system without some reworking.

>>> 4) don't use just "box" for boxbackup related filenames because it's 
>>> not very unique or descriptive.  -- /etc/boxbackup is better than 
>>> /etc/box.
>>> Actually, I would probably modify the project name to include 
>>> something about disk-based, redundant, secure backup. One could make 
>>> a nice acronym from "Scabbard", for example; more unique but also 
>>> not very descriptive.
>> There are other projects, all named Box <Something>. Although I admit 
>> that naming things is not my strong point.
> Is the existence of other "Box <something>" projects an argument in 
> favor or against Box Backup? And can we really apply the term 'Box' to 
> Windows systems? I think the term 'Sack' fits better.
> Maybe I can set up a Recovery disk using 'Scabbard' for the 
> backup/recovery idea, something like:
> System <something> Box Backup And Recovery Disk

At the moment, the code is more important to me than the name! I 
realise it's not optimal.

> ...
>> Right now what I want to do next is
>> * Modify the stored files
>>   - add multiple streams for Mac OS X and Win32
>>   - support sparse files
>>   - include stronger cryptographic checksums to detect corruption
>> * Sort out the ports
>>   - debug and include Solaris
>>   - integrate native Win32
>>   - finish the Mac OS X port by backing up resource forks (I want to 
>> use it!)
>> * Emulate tapes
>>   - being able to efficiently mark multiple states of the file system 
>> as checkpoints
>>   - being able to restore from any checkpoint
> Maybe with tape emulation, it would be possible to make Box Backup 
> back-ends to work with existing tools like Amanda or Bacula in a 
> reasonable way, for people who've done a lot of work getting the 
> current system optimized, but want to switch to RAID from the dreaded 
> world of tapes.

It's possible, depending on how these systems work.
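
As a rough illustration of the two checkpoint items in the quoted list 
(marking a state of the file system, then restoring from any checkpoint), 
here is a toy in-memory model in Python. The class and method names are 
hypothetical, and it says nothing about how Box Backup would store 
checkpoints on disk. The point it shows is that a checkpoint can copy only 
the index of paths, while the stored contents are shared between 
checkpoints.

```python
class CheckpointStore:
    """Toy model: a current path -> content index plus named snapshots."""

    def __init__(self):
        self._current = {}      # path -> content (stand-in for a stored file)
        self._checkpoints = {}  # checkpoint name -> copy of the index

    def put(self, path, content):
        """Record a new or changed file in the current state."""
        self._current[path] = content

    def delete(self, path):
        """Remove a file from the current state, if present."""
        self._current.pop(path, None)

    def mark(self, name):
        """Mark the current state as a checkpoint.

        Only the index is copied; the content objects are shared with
        earlier checkpoints, so unchanged files cost no extra storage.
        """
        self._checkpoints[name] = dict(self._current)

    def restore(self, name):
        """Return the path -> content mapping as of the named checkpoint."""
        return dict(self._checkpoints[name])
```

For example, marking "monday", changing a file, and marking "tuesday" 
leaves both states restorable independently.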

>> * Improve system management
>>   - see spec posted and commented in the list archive
>> * Bash through my list of feature requests
>> and then I think the engine will be basically complete, and I can 
>> move on to other aspects of the system, such as the user interface.
> Maybe I'll have some time to do a bit of work on user tools.

Make sure you post on the list first to avoid duplicating other work. 
There are a few other efforts underway.