[Box Backup] I have the fear....

Ben Summers boxbackup@fluffy.co.uk
Tue, 13 Mar 2007 22:34:36 +0000


On 13 Mar 2007, at 22:21, Chris Wilson wrote:

> Hi Ben,
>
> On Tue, 13 Mar 2007, Chris Wilson wrote:
>
>> The problem is that both ._cfg0000_host.conf and host.conf had  
>> exactly the same modification time before the update tool ran, and  
>> host.conf had the same modification time afterwards. Because the  
>> modification time of host.conf did not change between backup runs,  
>> Box Backup assumed that the file had not changed, and skipped it.
>>
>> I would consider it a bug in Gentoo that it touched host.conf  
>> without changing it, and therefore it ended up with the same  
>> timestamp as ._cfg0000_host.conf.
>>
>> We could fix this by comparing the inode numbers with our cached  
>> store as well, or perhaps even doing a full compare on the file,  
>> although the latter would probably slow down backups a lot (since  
>> every file which appears not to have been changed would have to be  
>> compared).
>
> Except that we don't track inode numbers on small files. Duh.
>
> That seems to leave three reasonable choices:
>
> 1. We get gentoo fixed to not touch the original file at the same
>    time as the update.
>
> 2. You reduce your file tracking size threshold, and we fix Box
>    Backup to check inode numbers.
>
> 3. Both of the above.
>
> (ruling out full compare of apparently unchanged files).
>
> Opinions, anyone?

If we included hashes of file contents, it'd sort it. But then you'd
have to calculate hashes for everything when you scan. We don't check
file sizes, though. Checking them would make it less likely to miss
things, at the cost of including them, encrypted, in the directories.
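
To make the trade-off concrete, here is a rough sketch of the case
above. It is not Box Backup code: the FileSignature record, the helper
names and the fixed timestamp are all assumptions made up for
illustration. It gives host.conf and ._cfg0000_host.conf the same
modification time, renames one over the other, and compares what an
mtime-only check sees with what a check that also records size and
inode number sees.

```python
import os
import tempfile
from typing import NamedTuple, Tuple


class FileSignature(NamedTuple):
    """Hypothetical per-file record a scanner might cache between runs."""
    mtime_ns: int
    size: int
    inode: int


def signature(path: str) -> FileSignature:
    # Read the three attributes discussed in the thread from the filesystem.
    st = os.stat(path)
    return FileSignature(st.st_mtime_ns, st.st_size, st.st_ino)


def changed_by_mtime(old: FileSignature, new: FileSignature) -> bool:
    # The check described in the thread: only the modification time.
    return old.mtime_ns != new.mtime_ns


def changed_by_signature(old: FileSignature, new: FileSignature) -> bool:
    # A stricter check: any difference in mtime, size or inode counts.
    return old != new


def simulate_config_update() -> Tuple[bool, bool]:
    """Replay the host.conf scenario in a temporary directory."""
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "host.conf")
        pending = os.path.join(d, "._cfg0000_host.conf")
        with open(live, "w") as f:
            f.write("old\n")
        with open(pending, "w") as f:
            f.write("new, longer contents\n")

        # Give both files the same timestamp (an arbitrary, assumed value).
        stamp = 1_173_000_000 * 10**9
        os.utime(live, ns=(stamp, stamp))
        os.utime(pending, ns=(stamp, stamp))

        before = signature(live)
        # The update tool moves the pending file over the live one;
        # a rename keeps the pending file's modification time.
        os.replace(pending, live)
        after = signature(live)

        return (changed_by_mtime(before, after),
                changed_by_signature(before, after))


if __name__ == "__main__":
    mtime_only, fuller = simulate_config_update()
    print("mtime-only check sees a change:", mtime_only)
    print("mtime+size+inode check sees a change:", fuller)
```

The stricter check only catches the replacement because the size (and,
on most filesystems, the inode number) differs. A same-size edit in
place with the timestamp restored would still slip through, which is
where content hashes would come in.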

So there's no real way of catching this case without a massive
amount of bookkeeping and inflating the structures.

How about

4. Wait until the next version when we intend to use filesystem  
change notifications instead of scanning, where available. For now,  
document this edge case.

Changing anything now seems an awful lot of extra work to work around  
non-optimal (or dangerous?) behaviour in an obscure Linux  
distribution with dubious project goals.

Ben