[Box Backup] BadBackupStoreFile

Chris Wilson boxbackup@fluffy.co.uk
Mon, 3 Dec 2007 22:15:14 +0000 (GMT)

Hi Johann,

Apologies for the late reply; I'm catching up on some unfinished emails in
my outbox that have been languishing for far too long.

On Mon, 10 Sep 2007, Johann Glaser wrote:

>>> I suspect the low diff time, which I've now increased to 20.000
>>> seconds. :-)
>> That may be a bit too long! I wonder how long the diffs take in
>> practice? If we're constantly re-reading files then we're certainly
>> wasting a lot of time in diffing.
> Why should it be too long?

Each file will be diffed for up to 20,000 seconds, nearly 6 hours. If you
have a number of large files that really can take 6 hours to diff each,
then your backup will run... really... slowly.

> The re-reading of files is indeed a strange thing. Does this make any
> sense?

Yes, it does kind of make sense. The file on the server can easily have a
mix of block sizes, and we need to calculate whether we have matches for
any of those blocks. Currently we can only search for matches using one
block size at a time, so we have to re-read the file once for each block
size (up to 128 times, I think, in the worst case).

It would be possible to optimise this to calculate checksums and find
matches for all block sizes in a single pass, but nobody has done that
work yet.
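
To make the difference concrete, here is a rough sketch in Python. It is not
Box Backup's actual code: the function names are made up for illustration,
and the "checksum" is just a sum of bytes standing in for a real rolling
checksum (which would also be confirmed with a strong hash). The first
function reads the data once per block size, as described above; the second
keeps one rolling window per block size and reads the data only once.

```python
def find_matches_multi_pass(data: bytes, wanted: dict[int, set[int]]):
    """wanted maps block size -> checksums of server blocks of that size.

    Returns (matches, bytes_read); matches is a set of (offset, size).
    Hypothetical helper: re-reads the whole input once per block size.
    """
    matches = set()
    bytes_read = 0
    for size, sums in wanted.items():
        rolling = 0
        for i, b in enumerate(data):       # one full read per block size
            bytes_read += 1
            rolling += b                   # byte enters the window
            if i >= size:
                rolling -= data[i - size]  # oldest byte leaves the window
            if i >= size - 1 and rolling in sums:
                matches.add((i - size + 1, size))
    return matches, bytes_read


def find_matches_single_pass(data: bytes, wanted: dict[int, set[int]]):
    """Same result, but every block size is updated during a single read.

    A streaming version would need to buffer the last max(wanted) bytes;
    here the input is simply held in memory.
    """
    matches = set()
    bytes_read = 0
    rolling = {size: 0 for size in wanted}
    for i, b in enumerate(data):           # the only read of the input
        bytes_read += 1
        for size, sums in wanted.items():
            rolling[size] += b
            if i >= size:
                rolling[size] -= data[i - size]
            if i >= size - 1 and rolling[size] in sums:
                matches.add((i - size + 1, size))
    return matches, bytes_read
```

Both functions do the same amount of checksum arithmetic; the single-pass
version only saves the repeated reads, which is what hurts on slow storage.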

> I could imagine that there are more people who consider using
> BoxBackup together with an external NAS. I suggest that you add a
> warning to the BoxBackup homepage, that with certain models of "cheap"
> NAS boxes (~ € 1000,--) performance issues were observed. I would
> propose to direct people to external USB harddisks if (and only if) they
> want external storage devices.

Noted on [http://www.boxbackup.org/trac/wiki/ConfiguringAServer].

> Regarding modifying BoxBackup to tolerate slow storage: Changing its
> storage format should be avoided. :-)

But may be necessary, unfortunately. Large files tend to perform better
than small ones, especially for streaming and in reducing disk overhead,
and if we want to support Amazon S3 then it seems to be the way to go.

Cheers, Chris.
_____ __     _
\  __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |