[Box Backup] Problem with 'get' command on large files

Nick Knight boxbackup@fluffy.co.uk
Wed, 9 Aug 2006 11:08:35 +0100


I have hit this too; it is a large file issue (which is why I have put
this in the testing part of the release). It took about a week to find
the bugs, which are below and waiting for me to put into svn:

Bug 1.

In lib/backupclient/BackupStoreFileEncodeStream.cpp line 270:

	} while(rBlockSizeOut <= (BACKUP_FILE_MAX_BLOCK_SIZE/2) &&
		rNumBlocksOut > BACKUP_FILE_INCREASE_BLOCK_SIZE_AFTER);
	//BUG - Nick Knight 24-07-2006
	//Not sure whether this is the fix or the backupstorefile.cpp is the problem, they
	//certainly do not match...
	//added divide by 2.

On the client, this created blocks whose size the restore did not like
(this could also be fixed on the restore side, but it appeared sensible
to fix it here).
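
To illustrate why the divide by 2 matters, here is a minimal sketch. It
assumes the loop body doubles the block size on each pass, which the
excerpt above does not show. The constants and the ChooseBlockSize name
are hypothetical stand-ins, not the real Box Backup values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values for illustration only; the real constants are
// defined in the Box Backup headers.
constexpr int32_t kMaxBlockSize = 512 * 1024;
constexpr int64_t kIncreaseBlockSizeAfter = 4096;

// Assumed loop shape: keep doubling the block size while the file would
// still need more than kIncreaseBlockSizeAfter blocks.
// halveLimit == false models the old condition (size <= MAX),
// halveLimit == true models the patched one (size <= MAX/2).
int32_t ChooseBlockSize(int64_t dataSize, int32_t startSize, bool halveLimit)
{
    assert(startSize > 0);
    int32_t blockSize = startSize;
    int64_t numBlocks = (dataSize + blockSize - 1) / blockSize;
    const int32_t limit = halveLimit ? kMaxBlockSize / 2 : kMaxBlockSize;
    if (numBlocks > kIncreaseBlockSizeAfter)
    {
        do
        {
            blockSize *= 2;
            numBlocks = (dataSize + blockSize - 1) / blockSize;
        } while (blockSize <= limit && numBlocks > kIncreaseBlockSizeAfter);
    }
    return blockSize;
}
```

Under these assumptions, a 3,153,326,894 byte file starting from 4096
byte blocks ends up with a block size of twice the maximum under the old
condition, and exactly the maximum with the divide by 2 in place.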

Bug 2.

In /lib/common/ReadGatherStream.cpp


                // Anything in the current block?
                if(mPositionInCurrentBlock < mBlocks[mCurrentBlock].mLength)
                {
                        // Read!
                        pos_type s = mBlocks[mCurrentBlock].mLength - mPositionInCurrentBlock;
                        if(s > bytesToRead) s = bytesToRead;

                        int r = mComponents[mBlocks[mCurrentBlock].mComponent]->Read(buffer, s, Timeout);

In the line "pos_type s = mBlocks[mCurrentBlock].mLength -
mPositionInCurrentBlock;" the variable s was previously an int. When the
server puts a single large file together, it places the whole file
(data) as a single block into the object, which means any read should
use the same addressable space as the block length (pos_type, not int).
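
A minimal sketch of the difference, assuming pos_type is a 64-bit signed
type (the alias below stands in for the real one) and using hypothetical
function names. It is not the actual ReadGatherStream code, only the
clamp from the excerpt above in isolation.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: stands in for the stream's 64-bit pos_type.
using pos_type = int64_t;

// Old shape: the remaining length is narrowed to a 32-bit int. For a
// block over 2GB the value wraps negative on common compilers, so the
// clamp against bytesToRead never triggers.
int BytesForNextRead32(pos_type blockLength, pos_type position, int bytesToRead)
{
    assert(position < blockLength);
    int s = static_cast<int>(blockLength - position);
    if (s > bytesToRead) s = bytesToRead;
    return s;
}

// Patched shape: keep the remaining length in pos_type until after the
// clamp, so the result is always within [0, bytesToRead].
int BytesForNextRead64(pos_type blockLength, pos_type position, int bytesToRead)
{
    assert(position < blockLength);
    pos_type s = blockLength - position;
    if (s > bytesToRead) s = bytesToRead;
    return static_cast<int>(s);
}
```

With a single 3GB block and a 4096 byte request, the first version
returns a negative length while the second returns 4096. For blocks
under 2GB both behave the same, which would explain why only large files
are affected.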


-----Original Message-----
From: boxbackup-admin@fluffy.co.uk [mailto:boxbackup-admin@fluffy.co.uk]
On Behalf Of Per Thomsen
Sent: 09 August 2006 10:57
To: boxbackup@fluffy.co.uk
Subject: [Box Backup] Problem with 'get' command on large files

Hi,
I have been awfully busy, and have only now started migrating my old
accounts to my new server (Linux X86_64, FC5, 0.10), and I'm running
into a problem when doing some manual testing.

To wit: My client machine is a Mac OSX machine (10.4.7) with the 0.10
release builds of bbackupd, bbackupquery, etc. It appears that the
backups work fine. I am able to (AFAICT) back up a 3.2GB file to the
store. But when I try to restore it (for testing), the following happens
in bbackupquery:

-------------------------------
curie:/Users/pthomsen root# bbackupquery
Box Backup Query Tool v0.10, (c) Ben Summers and contributors 2003-2006
Using configuration file /etc/box/bbackupd.conf
Connecting to store...
Handshake with store...
Login to store...
Login complete.

Type "help" for a list of commands.

query > cd users/pthomsen/Desktop/bordeaux-DVD-i386
query > ls -st
00032547 f----- 2006-04-01T00:35:58 3079421 FC-5-i386-DVD.iso
00033403 f----- 2006-06-06T00:29:09 00001 SHA1SUM
000375c3 f----- 2006-08-08T03:39:53 00001 .DS_Store
00037666 f----- 2006-08-08T07:26:15 3079421 p.iso
query > get FC-5-i386-DVD.iso foo.iso
Error occured fetching file.
query > ls
Exception: Connection TLSReadFailed (Probably a network issue between
client and server.) (7/34)

-------------------------------

Even with a debug build there is nothing in system.log on the client
machine.

On the server, I get the following in the (extended logging;
release-build) logs:
-------------------------------
Aug  9 01:47:01 backup01 bbstored[19717]: Sending stream, size 256
Aug  9 01:47:01 backup01 bbstored[19717]: Receive GetFile(0x31e85,0x37666)
Aug  9 01:47:01 backup01 bbstored[19717]: Receive GetFile(0x31e85,0x37666)
Aug  9 01:47:01 backup01 bbstored[19717]: Send Success(0x37666)
Aug  9 01:47:01 backup01 bbstored[19717]: Send Success(0x37666)
Aug  9 01:47:01 backup01 bbstored[19717]: Sending stream, size -1141640418
Aug  9 01:47:01 backup01 bbstored[19717]: Connection statistics for
BACKUP-0005: IN=210 OUT=75545 TOTAL=75755
Aug  9 01:47:01 backup01 bbstored[19717]: in server child, exception
RaidFile OSError (Error when accessing an underlying file. Check file
permissions allow files to be read and written in the configured raid
directories.) (2/8) -- terminating child
-------------------------------

I noticed the negative number for stream size, and worry that it is
there because something in my builds is not using 64-bit numbers for
file size. The file I'm trying to restore (id 00032547) is 3,153,326,894
bytes in the store, which (when 16 is subtracted from it) is
-1,141,640,418 as a signed 32-bit number.
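
The arithmetic can be checked with a small sketch. The function names
are hypothetical; the only inputs are the two numbers quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 32 bits of a 64-bit value and read them back as a
// signed 32-bit number, which is what happens when a large size passes
// through a 32-bit int. (The final conversion wraps on common
// compilers and is defined to do so from C++20 onwards.)
int32_t TruncateToInt32(int64_t value)
{
    return static_cast<int32_t>(static_cast<uint32_t>(value));
}

// Checks the figures from the message: store size minus 16 bytes,
// seen through a 32-bit int, matches the logged stream size.
void CheckReportedStreamSize()
{
    const int64_t sizeInStore = 3153326894LL;
    assert(TruncateToInt32(sizeInStore - 16) == -1141640418);
}
```

3,153,326,894 - 16 = 3,153,326,878, and 3,153,326,878 - 2^32 =
-1,141,640,418, which matches the "Sending stream, size" line in the
server log.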

I am suspicious that I have not built the large-file support correctly.
I vaguely remember that there was something special needed to get the
large file support to build on Linux, but I thought it was taken care of
with autoconf.

Or this could be a bug in the GetFile command.

In lib/server/Protocol.cpp:738: IOStream::pos_type is used. I can't find
the definition of IOStream::pos_type, and I'm not sure if that's really
the right place to be looking for this.

I think I built the server with 64-bit support, and I'm attaching a
gzipped config.log, so nothing (hopefully) is missing...

Thanks for any help I can get,
Per

-- 
Per Reedtz Thomsen | Reedtz Consulting, LLC | F: 209 883 4119
V: 209 883 4102    |   pthomsen@reedtz.com  | C: 209 996 9561
GPG ID: 1209784F   |  Yahoo! Chat: pthomsen | AIM: pthomsen