[Box Backup] BadBackupStoreFile
Johann Glaser
boxbackup@fluffy.co.uk
Mon, 03 Sep 2007 12:19:56 +0200
Hi!
> OK, that's different :-) So was it the second bbackupquery that hung and
> left a hung process on the server, or the first?
Yes, but I can't say what the bbackupd process was for.
> I'm not sure that the large total volume is significant here, but large
> files certainly are (with 0.10). But I'm afraid there is no solution to
> that problem except to upgrade.
Mhm, I see. I'll take it into consideration.
> > 2) slow backup storage mounted via NFS (Iomega StorCenter 150d 2TB)
> > The backup usually starts at 3:00 and finishes between 10:00 and 12:00,
> > so it runs 7-9 hours for approx. 45GB.
>
> Do you know if it's IO bound or CPU bound on the client or server or
> whether it's the tape device that's slowing it down? How big are the usual
> incremental backups in volume of changed files? (you can get that from the
> statistics in syslog at the end of each backup run)
The Iomega StorCenter is a NAS, i.e. one large hard disk (to be precise:
RAID5) offered via NFS. The limiting factors are both, CPU and IO. When
looking at the processes with "top" bbackupd sometimes utilizes nearly
100% CPU, and sometimes only very few. bbstored also sometimes utilizes
approx. 20% CPU.
Syslog says (for the last few days, I've added the thousand-dots, for
Aug 31 there is no "Connection statistics" line):
Aug 26 11:10:19 clara bbstored[20310]: Connection statistics for BACKUP-1: IN= 590.499.340 OUT=57301972 TOTAL= 647.801.312
Aug 26 11:10:19 clara bbackupd[ 1693]: File statistics: total file size uploaded 9.766.762.140, bytes already on server 8.763.266.143, encoded size 588.573.414
Aug 27 13:06:39 clara bbstored[18602]: Connection statistics for BACKUP-1: IN=3.308.622.563 OUT=66123209 TOTAL=3.374.745.772
Aug 27 13:06:39 clara bbackupd[ 1693]: File statistics: total file size uploaded 14.685.054.411, bytes already on server 10.514.171.221, encoded size 3.306.372.675
Aug 28 13:35:49 clara bbstored[16211]: Connection statistics for BACKUP-1: IN=3.143.079.310 OUT=67817968 TOTAL=3.210.897.278
Aug 28 13:35:49 clara bbackupd[ 1693]: File statistics: total file size uploaded 15.233.073.620, bytes already on server 11.229.233.942, encoded size 3.140.764.015
Aug 29 13:39:58 clara bbstored[14071]: Connection statistics for BACKUP-1: IN=2.674.544.582 OUT=66673452 TOTAL=2.741.218.034
Aug 29 13:39:58 clara bbackupd[ 1693]: File statistics: total file size uploaded 13.968.680.100, bytes already on server 10.875.442.949, encoded size 2.672.245.491
Aug 30 14:08:59 clara bbstored[14387]: Connection statistics for BACKUP-1: IN=2.318.091.276 OUT=73106275 TOTAL=2.391.197.551
Aug 30 14:08:59 clara bbackupd[ 1693]: File statistics: total file size uploaded 14.853.920.326, bytes already on server 11.349.955.798, encoded size 2.315.752.024
Aug 31 10:15:01 clara bbackupd[ 1693]: File statistics: total file size uploaded 12.023.532.073, bytes already on server 7.305.642.825, encoded size 3.782.262.988
Sep 1 15:30:43 clara bbstored[10298]: Connection statistics for BACKUP-1: IN=4.799.606.979 OUT=70868153 TOTAL=4.870.475.132
Sep 1 15:30:43 clara bbackupd[ 1693]: File statistics: total file size uploaded 17.103.473.401, bytes already on server 10.853.748.045, encoded size 4.797.115.905
Sep 2 12:45:06 clara bbstored[ 4170]: Connection statistics for BACKUP-1: IN=2.204.990.690 OUT=51000940 TOTAL=2.255.991.630
Sep 2 12:45:06 clara bbackupd[ 1693]: File statistics: total file size uploaded 11.585.981.608, bytes already on server 7.978.337.448, encoded size 2.203.395.290
Interestingly, it finds ~>10GB every day to be "new" but there should
only be <<1GB. I suspect the low diff time, which I've now increased to
20.000 seconds. :-)
> I suspect that because your MaximumDiffingTime is short, you are backing
> up complete copies of your large files each time, therefore writing nearly
> 45GB each time. 45GB in 9 hours is about 1.4 MB/s which is not
> unreasonable for a tape device in my opinion (what is the spec for that
> device?)
The StorCenter is attached via a 100MBit LAN. We tested with a 1GB LAN,
but the write-speed for large files didn't go above 13MByte/sec. Its
main problem is accessing many small files where it seems to have
additional delay for every single file.
BTW: The Sesam backup needs 7 minutes for "daily" backups (differential)
but also has to check a similarely large amount of data.
Bye
Hansi