[Box Backup] BadBackupStoreFile
Mon, 3 Dec 2007 22:15:14 +0000 (GMT)
Apologies for the late reply, I'm catching up on some unfinished emails in
my outbox that have been languishing for far too long.
On Mon, 10 Sep 2007, Johann Glaser wrote:
>>> I suspect the low diff time, which I've now increased to 20.000
>>> seconds. :-)
>> That may be a bit too long! I wonder how long the diffs take in
>> practice? If we're constantly re-reading files then we're certainly
>> wasting a lot of time in diffing.
> Why should it be too long?
Each file will be diffed for up to 20,000 seconds, nearly 6 hours. If you
have a number of large files that really can take 6 hours to diff each,
then your backup will run... really... slowly.
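
To put rough numbers on that, here is a small sketch. The function name and
the file counts are made up for illustration, and it assumes the worst case
where every large file uses its whole diff time limit:

```python
def worst_case_diff_hours(max_diff_seconds: float, large_files: int) -> float:
    """Upper bound on time spent diffing, if every file hits the limit."""
    return max_diff_seconds * large_files / 3600.0


# One file at the 20,000 second limit is a little over five and a half hours;
# each extra large file that hits the limit adds the same again.
for n in (1, 4, 10):
    print(n, "file(s):", round(worst_case_diff_hours(20000, n), 1), "hours")
```
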
> The re-reading of files is indeed a strange thing. Does this make any
Yes, it does kind of make sense. The file on the server can easily have a
mix of block sizes, and we need to calculate whether we have matches for
any of those blocks. Currently we can only search for matches using one
block size at a time, so we have to re-read the file once for each block
size (up to 128 times, I think, in the worst case).
It would be possible to optimise this to calculate checksums and find
matches for all block sizes in a single pass, but nobody has done that
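
As a rough illustration of the difference between the two approaches, here is
a toy sketch in Python. It is not Box Backup's code: the function names, the
in-memory "file" and the use of zlib.adler32 are all assumptions made for the
example, and it recomputes each checksum from scratch where a real
implementation would presumably roll the checksum along the file. It only
shows that both strategies find the same matches, while the first scans the
data once per block size and the second scans it once.

```python
import zlib
from typing import Dict, Set, Tuple

Matches = Set[Tuple[int, int]]  # (offset in local file, block size)


def multi_pass(data: bytes, server: Dict[int, Set[int]]) -> Tuple[Matches, int]:
    """One full scan of the local data per distinct block size."""
    matches: Matches = set()
    passes = 0
    for size, checksums in server.items():
        passes += 1  # the data is scanned from the start again for this size
        for offset in range(len(data) - size + 1):
            if zlib.adler32(data[offset:offset + size]) in checksums:
                matches.add((offset, size))
    return matches, passes


def single_pass(data: bytes, server: Dict[int, Set[int]]) -> Tuple[Matches, int]:
    """One scan of the local data, testing every block size at each offset."""
    matches: Matches = set()
    for offset in range(len(data)):
        for size, checksums in server.items():
            fits = offset + size <= len(data)
            if fits and zlib.adler32(data[offset:offset + size]) in checksums:
                matches.add((offset, size))
    return matches, 1


# "server" maps each block size to the checksums of the server's blocks of
# that size (hypothetical layout, chosen only for this example).
data = b"abcdefghij"
server = {2: {zlib.adler32(b"cd")}, 3: {zlib.adler32(b"fgh")}}
print(multi_pass(data, server))   # two passes over the data
print(single_pass(data, server))  # one pass, same matches
```

The single pass does the same amount of checksum work in this toy version; the
saving it is meant to show is in how many times the file has to be read, which
is what hurts on slow storage.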
> I could imagine that there are more people who consider to use
> BoxBackup together with an external NAS. I suggest that you add a
> warning to the BoxBackup homepage, that with certain models of "cheap"
> NAS boxes (~ € 1000,--) performance issues were observed. I would
> propose to direct people to external USB harddisks if (and only if) they
> want external storage devices.
Noted on [http://www.boxbackup.org/trac/wiki/ConfiguringAServer].
> Regarding modifying BoxBackup to tolerate slow storage: Changing its
> storage format should be avoided. :-)
But may be necessary, unfortunately. Large files tend to perform better
than small ones, especially for streaming and in reducing disk overhead,
and if we want to support Amazon S3 then it seems to be the way to go.
_____ __ _
\ __/ / ,__(_)_ | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |