[Box Backup] Development plan

Ben Summers boxbackup@fluffy.co.uk
Thu, 8 Apr 2004 10:30:15 +0100


On 7 Apr 2004, at 20:14, Pascal Lalonde wrote:

> On Wed, Apr 07, 2004 at 12:01:24PM +0100, Ben Summers wrote:
>>
>> * Do a little more work, adding a few new features and completing some of
>> the "TODO" items (see the TODO.txt file in the distribution).
>>       - Would a "maximum file size to backup" option be useful?
> Maybe, but I don't see any use for me. I'd be afraid to use it, just in
> case I have a very important big file which doesn't get backed up
> because I set a limit during the installation (and of course, I forgot
> about that limit).

That would be the worry. It might be better addressed by using regex 
excludes for known file types. But it could be useful, maybe?
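To illustrate the regex-exclude idea, here is a minimal sketch in Python. The patterns and the function name are made up for the example and are not taken from any Box Backup configuration; the point is only that a file is skipped because of what it is, not because of how big it happens to be.

```python
import re

# Hypothetical exclude patterns for file types known to be large and
# unimportant. These are illustrative only, not Box Backup defaults.
EXCLUDE_PATTERNS = [
    re.compile(r"\.iso$"),
    re.compile(r"\.(avi|mpg)$"),
]

def should_back_up(path: str) -> bool:
    """Return False if the path matches any exclude pattern."""
    return not any(p.search(path) for p in EXCLUDE_PATTERNS)

print(should_back_up("/home/pascal/images/install.iso"))
print(should_back_up("/var/db/important-big-database.dat"))
```

With this approach the big database file is still backed up, whatever its size, because it doesn't match a known throwaway type.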

>
>> * Maybe some more work on the NetBSD, FreeBSD and Darwin ports?
> I would've thought Linux is more of a problem than these... Lack of
> testers?

Linux is more of a problem, definitely. With FreeBSD and NetBSD, there 
are just minor issues which need looking at (they shouldn't affect 
normal use, but they make the tests fail). However, with Darwin a lot 
of work needs to be done to cope with the funny Macintosh resource 
forks.

I don't know of anyone running it on FreeBSD or NetBSD.

>
>>
>> * Release 0.07 (or 1.00?) and advertise it as "stable", ready for
>> production use.
>>
> There is one thing that I think would be really important before making
> it a 1.0 production release (at least, in my opinion), which is:
> tagging.

Good point. I'm calling this "marking", and forgot all about it when 
writing this plan.

>
> For example, suppose I wish to keep a month behind, and I need to be
> able to recover the data exactly as it was 20 days ago (for example, we
> have a big database, which got corrupted and since this specific table
> is rarely used, we noticed 20 days later). Of course, BoxBackup
> can't distinguish corruption from normal changes, so the current version
> in the backup store is corrupted too. In order to be able to get back to a
> non-corrupted state, I need to ensure that:
> 1) the older files are still there
> 2) I can recover ALL of the database's related files for that specific
>    date
>
> What I could do is tag the store each day, and then I can go back any
> day. But for this to work, tagged files must not be erased during
> housekeeping, so maybe there could be a flag telling bbstored not to
> erase files associated with a specific tag until the flag is removed
> (like once a month, after dumping the data to physical media).

My plan is to define a set of marks. The current state of files can be 
marked, either automatically by bbackupd when it logs in at certain 
times, or manually by bbackupctl.

In bbackupquery, you'll be able to list the current marks, and then 
switch to a "view" of that mark. When viewing a mark, you traverse (and 
restore from) the state of the store when that mark happened.
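As a rough sketch of what a "view" might resolve to, assume each file in the store keeps a list of version timestamps and a mark is just a point in time. Both are assumptions for illustration (nothing is implemented yet), and the sketch ignores deleted files. The view then picks, for each file, the newest version uploaded at or before the mark:

```python
from typing import Dict, List

def view_at_mark(store: Dict[str, List[int]], mark_time: int) -> Dict[str, int]:
    """Map each filename to the newest version time at or before mark_time.

    Files with no version that old are simply absent from the view.
    """
    view = {}
    for name, times in store.items():
        earlier = [t for t in times if t <= mark_time]
        if earlier:
            view[name] = max(earlier)
    return view

# Hypothetical store: filename -> upload times of its versions.
store = {
    "db/table1": [1, 5, 9],
    "db/table2": [3],
    "notes.txt": [8],
}
print(view_at_mark(store, 6))
```

Restoring from that view would give all of the database's files as they were at the mark, which is what the corrupted-table example needs.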

>
> This could be a lifesaver in case someone accidentally erases a file,
> especially if the store is almost full (the file marked as deleted
> would soon get cleaned by bbstored's housekeeping).

Absolutely. It is required, especially as one of my aims is to emulate 
the behaviour of a sensible tape backup scheme.

>
> I don't know how the soft limit should deal with such tags. Maybe it
> would be appropriate to create more limits, like maximum history size,
> or dedicating a % of quota to history only.

Here's a history of a file:

1 -- mark 1
2 -- mark 2
3
4
5 -- mark 3
6
7 -- mark 4
8
9 -- current version

The files will be deleted in this order: 3, 4, 6, 8, 1, 2, 5, 7

That is, it will first delete older versions which aren't marked, then 
marked versions, oldest mark first. Across all files, it will delete 
the oldest entries in these lists until it has cleared enough space.
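The ordering for a single file can be sketched in Python as follows. This is only an illustration of the rule described above, not code from bbstored; the tuple representation and the function name are invented for the example.

```python
from typing import List, Optional, Tuple

# (version number, mark number or None); a higher version number is newer.
Version = Tuple[int, Optional[int]]

def deletion_order(versions: List[Version]) -> List[int]:
    """Return old version numbers in the order they would be deleted.

    The newest version is the current one and is never a candidate.
    """
    ordered = sorted(versions, key=lambda v: v[0])
    old = ordered[:-1]  # everything except the current version
    # Unmarked old versions go first, oldest first.
    unmarked = [num for num, mark in old if mark is None]
    # Marked versions follow, oldest mark first.
    marked = sorted((v for v in old if v[1] is not None), key=lambda v: v[1])
    return unmarked + [num for num, _ in marked]

history = [
    (1, 1), (2, 2), (3, None), (4, None), (5, 3),
    (6, None), (7, 4), (8, None), (9, None),
]
print(deletion_order(history))  # [3, 4, 6, 8, 1, 2, 5, 7]
```

Housekeeping would work through such lists only until enough space has been cleared, so marked versions survive unless the store is really short of room.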

I suppose it will be worth having a "keep n marked versions" option, so 
you can keep the current file + n older versions. But then you couldn't 
treat weekly marks differently to daily marks. I'm sure I'll think of 
something nice and simple.

But then, maybe you'd just let daily backups happen, and only mark the 
weekly states?


I will add marking into 0.06... or maybe finish everything else off, 
and put it in 0.07? Depends on how complex it is, I suppose.

>
> The other features I already spoke of in previous posts are mostly toys
> I would like to see (like not storing different versions of a file in
> full, just keep the differences), but I don't think they're critical for
> a 1.0 release. As you say, hard drive space is cheap nowadays.

I do think this is some way in the future. At the moment I'm more 
interested in correctness than efficiency. (But that's not to say that 
I haven't written it with efficiency in mind.)

>
>>
>> I trust people are keeping an eye on their backups, and running
>> verifications occasionally...
>>
>>    /usr/local/bin/bbackupquery "compare -a" quit
> Everything looks good.

I'm glad to hear it! :-)

Ben