[Box Backup] Suggested change in behaviour
Ben Summers
boxbackup@fluffy.co.uk
Tue, 21 Sep 2004 10:35:55 +0100
On 21 Sep 2004, at 10:27, richard_eigenmann wrote:
> That's an interesting point that you use inodes. I had the mental image
> that you use hash codes of files or perhaps hash codes of blocks of files
> to determine what needs to be backed up (for no good reason of course).
>
> The use of inodes explains why I have seen whole directories of pictures
> being re-backed up after I copied them around and deleted the original
> location after I was happy that everything had been rearranged properly.
When a file is being updated, it does use hashes of blocks of data
within that file to determine what has changed. So if you have a
multi-MB database but only change one entry, only a small amount of
data will be uploaded. However, this only works within a single file:
a copy in a new location has a new inode, so it is treated as a brand
new file. Hence the behaviour you saw.
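
To make the per-file block comparison concrete, here is a minimal
sketch in Python. It is not Box Backup's actual algorithm: the
fixed block size, the choice of SHA-256, and the function names
block_hashes and changed_blocks are all assumptions made for
illustration. It only shows the general idea that comparing block
hashes of the old and new versions of one file identifies which
blocks need uploading.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indices of blocks in new that differ from the same block in old."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    # A block counts as changed if it is past the end of the old file
    # or its digest differs from the old digest at the same index.
    return [i for i, h in enumerate(new_h)
            if i >= len(old_h) or h != old_h[i]]


old = bytes(10 * BLOCK_SIZE)      # ten zero-filled blocks
new = bytearray(old)
new[5 * BLOCK_SIZE + 7] = 1       # change a single byte in block 5
print(changed_blocks(old, bytes(new)))  # [5]
```

Note that the comparison is between two versions of the same file.
Nothing here looks at other files, which is why a copy elsewhere
on disk gets no benefit from blocks already stored.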
>
> The Netstore software I used previously must have had some sort of hash
> code over parts of files because it built up a rather large table of data.
> It wouldn't complain too bitterly about that table being wiped but would
> spend time and perhaps bandwidth to re-create it.
I did consider a more global hash-based approach, but it limited what I
could do with the encryption and, quite frankly, was a little scary
when it came to data integrity. Hashes are all very well and good, but
probability being what it is, you're going to get a collision at some
point.
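
As a rough illustration of the collision concern, the standard
birthday approximation estimates the chance that at least two of n
blocks share a hash value. The sketch below assumes hash outputs
are uniformly random, which is an idealisation, and the function
name collision_probability is made up for this example. It says
nothing about which hash or hash size any particular backup system
uses.

```python
import math


def collision_probability(n_blocks, hash_bits):
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).

    Assumes hash values are uniformly distributed.
    """
    pairs = n_blocks * (n_blocks - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2 ** hash_bits)


# With a 32-bit hash, roughly 77,000 blocks already give about even odds.
print(collision_probability(77163, 32))
# With a 128-bit hash the estimate for a billion blocks is tiny but not zero.
print(collision_probability(10 ** 9, 128))
```

The probability grows with the square of the number of blocks, so
a table of hashes shared across every file in a store reaches any
given risk level far sooner than hashes compared only within one
file.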
Ben