[Box Backup] Unknown errors

Chris Wilson boxbackup@boxbackup.org
Mon, 9 Feb 2009 23:48:32 +0000 (GMT)


Hi Torsten,

On Thu, 5 Feb 2009, Torsten wrote:

> after a long test period here are my test results.
>>
>> Here is the problem. For some reason, bbackupd was able to open
>> /boxmount/boxmount-SERVERHH/testfile.dat, but not able to read from it.
>> This was not an expected situation; normally the open would fail if there
>> was a permissions problem. Can you copy that file to a local file on
>> "matrix" without errors, when running as the bbackupd user? What kind of
>> filesystem is it, and if it's a network mount, what is the remote OS?
>
> The error occurs only on CIFS-mounted exports. But it occurs only sometimes
> and I could not reliably reproduce it.
>
> Some lines before the FileDoesNotVerify message from Box Backup I found the
> following in the syslog:
>
> Feb  4 12:03:26 boxbackup kernel: Status code returned 0xc0000054 NT_STATUS_FILE_LOCK_CONFLICT
> Feb  4 12:03:26 boxbackup kernel:  CIFS VFS: Send error in read = -13
>
> After some searching with Google I found this patch for BackupPC. BackupPC
> also mounts Windows exports and backs them up.
>
> http://osdir.com/ml/sysutils.backup.backuppc.general/2004-09/msg00051.html
>
> I do not really understand this problem. But I think
> NT_STATUS_FILE_LOCK_CONFLICT should just not be rated as an error?!?

I had a look at the patch to BackupPC and it appears to be related to 
clients running on Windows. While I agree with your assessment, 
unfortunately CIFS appears to return error 13, which is EACCES (Access 
Denied), to the client (running on a UNIX system), so bbackupd only sees 
a generic access error and not the NT status code.

The only way that I can think of to distinguish this error from the usual 
Access Denied (where the file is already locked, or permissions prevent 
access by the client) is that in this case it appears to occur while 
reading the file, rather than while opening it.
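
To illustrate the distinction, here is a minimal sketch using plain POSIX 
calls. It is not Box Backup code: ReadResult, ReadWholeFile and the 
enumerator names are all hypothetical, and it assumes that the CIFS 
failure really does surface as EACCES from read(), as the syslog above 
suggests.

```cpp
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical result codes; none of these names come from Box Backup.
enum class ReadResult
{
    Ok,             // the whole file was read
    OpenFailed,     // open() failed: the usual permissions/locking case
    DeniedInRead,   // open() worked, but read() failed with EACCES
    OtherReadError  // read() failed for any other reason
};

// Reads the whole file at rPath into rData and reports at which stage any
// failure happened. On failure, rErrno holds errno from the failing call.
ReadResult ReadWholeFile(const std::string& rPath, std::string& rData,
    int& rErrno)
{
    rData.clear();
    rErrno = 0;

    int fd = ::open(rPath.c_str(), O_RDONLY);
    if(fd == -1)
    {
        rErrno = errno;
        return ReadResult::OpenFailed;
    }

    char buffer[4096];
    ReadResult result = ReadResult::Ok;
    for(;;)
    {
        ssize_t got = ::read(fd, buffer, sizeof(buffer));
        if(got > 0)
        {
            rData.append(buffer, static_cast<size_t>(got));
        }
        else if(got == 0)
        {
            break; // end of file
        }
        else if(errno == EINTR)
        {
            continue; // interrupted by a signal, so retry the read
        }
        else
        {
            // Save errno before close() has a chance to change it.
            rErrno = errno;
            result = (rErrno == EACCES) ? ReadResult::DeniedInRead
                : ReadResult::OtherReadError;
            break;
        }
    }

    ::close(fd);
    return result;
}
```

A caller could then treat DeniedInRead as "file became locked while we 
were reading it" and report it separately from OpenFailed.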

In this case, I suspect that we should skip the file with an appropriate 
error message, and continue the backup. As the client will not allow us to 
read the file, this appears to be the only possible course of action 
(apart from aborting the run due to an unexpected error).

The problem is that we were able to open the file and have already started 
sending data to the server. It's not clear to me right now what we can do 
to abort the upload once it's already in progress, without killing our 
connection to the server (perhaps that's the best thing to do).

It's probably difficult to reproduce the error because it only happens 
when the client changes the state of the file (from unlocked to locked) 
during the process of backing up that file. Otherwise, we would have 
failed while calculating the patch, before starting the upload, and the 
upload would never have been started.

Does this appear to make sense to you? Would it be reasonable to abort the 
backup run if this happens, and try to back up the file again on the next 
run?

>> I'm pretty confused about the basic_string::substr error. According to 
>> the code this should be impossible. Please could you build a clean copy 
>> of the latest snapshot or trunk on this machine and try running it 
>> instead of r2237?
>>
>> Given that an error was reported about Outlook.pst just before that, I
>> suspect that a similar condition may be happening here as the one above.

You mean a protocol synchronisation error? I still don't understand how 
this could cause a basic_string::substr error, which should be impossible 
in any case.

> After an update to svn2409 I saw this error only once and I could not 
> reproduce it. But I will keep an eye on this.

Please do, thanks for your help.

Cheers, Chris.
-- 
_____ __     _
\  __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Ruby/Perl/SQL Developer |
\__/_/_/_//_/___/ | We are GNU : free your mind & your software |