[Box Backup] Network failure

Mr R G Shepherd boxbackup@fluffy.co.uk
Sun, 06 Feb 2005 13:51:45 +0000


Ben Summers wrote:
> 
> On 6 Feb 2005, at 12:27, Mr R G Shepherd wrote:
>> Has anybody had experience with a network outage or connectivity blip
>> during a backup? can the server side be corrupted easily?
> 
> 
> Since backups are potentially long operations, the server is written to 
> expect problems to occur.

Excellent.

> 
> Every file write is performed in a transactioned manner, so either the 
> file is written successfully or the store is not changed at all. Also, 
> where entities are written lazily to increase performance, there is 
> recovery code to cope with this data being out of date. (And all 
> important data is committed immediately it is completely available.)
> 

Nice.
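
For anyone following along, the "either the file is written successfully or
the store is not changed at all" behaviour can be illustrated with the common
write-to-temporary-then-rename pattern. This is only a minimal sketch of that
general technique, not Box Backup's actual code; the function name
atomic_write and the file names are made up for illustration.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so a reader sees the old or the new file, never a partial one.

    Hypothetical helper for illustration only; not taken from Box Backup.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the new file into place in one step
    except BaseException:
        # On any failure, remove the temporary file and leave `path` untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before the rename, the original file is still intact and
only a stray temporary file is left behind for cleanup.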

> If the worst should happen, there is a utility to check a store and 
> rebuild it. This will make all valid data in the store available, 
> whether it is in the directory structure or not.

Neat.

> 
> The client is written to be conservative about assuming server state, 
> and will re-query and compare if it detects a problem.

Cool.


> I suggest you give it a go! Set a long backup going, then at random 
> intervals kill the server and client, and pull out network cables. Then 
> report back to the list.

Indeed. :)

It goes without saying that I will check and test it to death before I go into
production. It's nice to see, however, that the design is careful enough to
anticipate such problems.

There are so many backup systems available that it's hard to decide which to
try first, let alone delve into the source to figure out how resilient each
one is.

I will begin my testing soon, however.

I would also like to know whether it is possible to disable recovery, though
not completely. :)

My reasoning is that, given the ease of automated backups and recovery, my
users would get complacent and treat it like a fully dependable second hard
disk, safe in the knowledge that they "always" have a backup copy which can be
restored. This leads to versioning errors and lost work (for example, if they
restore a directory rather than an individual file by mistake), which leads to
the system being more trouble than it's worth. :)

I would like to be able to describe it as a one-way process: a system where
recovery is a last resort. I could then temporarily permit a recovery for a
particular client in the event of absolute failure. :)

Most of my users have a slack, seemingly unchangeable attitude when it comes
to data/file/disk management. It's hard enough trying to coax them away from
saving important work in "completely arbitrary" areas of any available
storage. :)

Keep up the good work.

Rob