[Box Backup] Suggested change in behaviour

Ben Summers boxbackup@fluffy.co.uk
Tue, 21 Sep 2004 10:35:55 +0100


On 21 Sep 2004, at 10:27, richard_eigenmann wrote:

> That's an interesting point that you use inodes. I had the mental image
> that you use hash codes of files or perhaps hash codes of blocks of
> files to determine what needs to be backed up (for no good reason of
> course).
>
> The use of inodes explains why I have seen whole directories of
> pictures being re-backed up after I copied them around and deleted the
> original location after I was happy that everything had been rearranged
> properly.

When a file is being updated, it does use hashes of blocks of data 
within that file to determine what has changed. So if you have a 
multi-MB database but change only one entry, only a small amount of 
data will be uploaded. However, this comparison only happens within a 
single file, not across different files. Hence the behaviour you saw.
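
As a rough illustration of per-file block hashing (a minimal sketch, not
Box Backup's actual code): the block size, the choice of SHA-256, and the
function names below are all assumptions made for the example. It compares
blocks at fixed offsets only, which is simpler than what a real backup
client would likely do.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one hash per fixed-size block of a single file's contents."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes,
                   block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indices of blocks in `new` that differ from `old`.

    Only these blocks would need uploading. The comparison is between
    two versions of the same file; nothing is shared across files.
    """
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [
        i for i, h in enumerate(new_h)
        if i >= len(old_h) or old_h[i] != h
    ]
```

With a four-block file in which a single byte changes, only the block
containing that byte is reported. A copy of the file under a new name has
no old version to compare against, so every block counts as changed.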

>
> The Netstore software I used previously must have had some sort of
> hash code over parts of files because it built up a rather large table
> of data. It wouldn't complain too bitterly about that table being wiped
> but would spend time and perhaps bandwidth to re-create it.

I did consider a more global hash-based approach, but it limited what I 
could do with the encryption and, quite frankly, was a little scary 
when it came to data integrity. Hashes are all very well and good, but 
probability being what it is, you're going to get a collision at some 
point.
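
To put a rough number on the collision worry, the standard birthday
approximation can be sketched as below. This assumes an ideal hash with
uniformly distributed output; the function name and the example sizes are
made up for illustration and say nothing about which hash any particular
backup system uses.

```python
import math


def collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Approximate chance that at least two of `num_blocks` distinct
    blocks share a hash value, using the birthday approximation
    p ~= 1 - exp(-n * (n - 1) / (2 * N)) with N = 2 ** hash_bits.
    """
    space = 2.0 ** hash_bits
    exponent = -num_blocks * (num_blocks - 1) / (2.0 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)
```

The probability grows roughly with the square of the number of blocks
stored, so a global table over every block of every file reaches a given
risk level far sooner than comparisons confined to one file at a time.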

Ben