[Box Backup] New encryption format & Space usage

boxbackup@fluffy.co.uk
Thu, 29 Apr 2004 09:04:52 -0400 (EDT)


> To: boxbackup@fluffy.co.uk
> From: Ben Summers <ben@fluffy.co.uk>
> Date: Wed, 28 Apr 2004 13:17:45 +0100
> Subject: [Box Backup] 0.05PLUS1
> Reply-To: boxbackup@fluffy.co.uk
>
>
> I've just uploaded a new preview version to
>
>    http://www.fluffy.co.uk/boxbackup/bin/boxbackup-0.05PLUS1.tgz
>
> This includes some slightly fundamental changes, but it's completely
> backwards compatible (and I spent quite a bit of time making sure it
> was). So it's perfectly safe to install and use. However, 0.05PLUS1
> clients need a 0.05PLUS1 server (although 0.05PLUS1 servers are happy
> with earlier clients).
>
> This also includes a THANKS.txt file, which is long overdue. Since I
> foolishly haven't been writing it as I've gone along, I would
> appreciate gentle reminders if you should be on the list but I've
> missed you off.
>
> Right then, changes:
>
> * add usage command to bbackupquery
>    - tells you how much space you're using on the server from the client
> side
>
> * bug fixes
>    - various
>
> * add delete [yes] command to bbstoreaccounts
>    - delete accounts as well as create them! Use the optional 'yes'
> argument to skip the confirmation question.
>
> * add check [fix] [quiet] command to bbstoreaccounts
>    - more on this later
>
> * Use AES for file data
>    - After long consideration, I think AES is a better choice of cipher
> for file data, mainly due to the larger block size. Blowfish is still
> used for metadata (as it's more space efficient with its 64-bit block
> size), and Blowfish-encoded files can still be restored.

Hi Ben,

How does it move existing data forward to the new encryption format?

To try to explain what I mean: does it convert the data when the client
next connects, or at server startup time, or does it just let data in the
old encryption format expire?

After re-reading your email it looks like it simply lets the old data
expire, but I was just wondering.
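
As an aside on the AES/Blowfish point above: if I've understood the
block-size argument, the space difference for small items is mostly
padding. Here's a rough back-of-the-envelope sketch in Python, assuming
the ciphertext gets padded up to a whole cipher block (PKCS#7-style);
that's just my guess at the scheme, not something I've checked in the
code:

  # Rough padding-overhead comparison for small metadata records.
  # Assumes ciphertext is padded up to a whole number of cipher blocks
  # (PKCS#7-style); my assumption, not necessarily Box Backup's scheme.

  def padded_size(plaintext_len, block_size):
      # Pad up to the next whole block (PKCS#7 always adds at least 1 byte).
      return ((plaintext_len // block_size) + 1) * block_size

  ciphers = (("Blowfish, 64-bit block", 8), ("AES, 128-bit block", 16))
  for name, block in ciphers:
      for size in (12, 20, 40, 70):   # typical small metadata item sizes
          print("%s: %2d plaintext bytes -> %2d encrypted" %
                (name, size, padded_size(size, block)))

For a 20-byte item that works out to 24 bytes under Blowfish versus 32
under AES, whereas for bulk file data the per-block overhead is
negligible, so the split makes sense to me.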


----

One thing that's been annoying me lately (and it's not your fault, of
course) is trying to find out exactly what I'm backing up when the daemon
runs. I tried the option in the config file that should tell you what's
going on, but what I was really hoping for was a list of the files (and
their sizes) being backed up.

The problem behind this is that I sometimes see some pretty huge numbers
for the amount of data backed up (and even for the amount uploaded), when
I feel that my machine should have been *quiescent* before the backup
ran.

I also just hit the wall with the amount of data on your store, I think
yesterday, and I don't think I should be backing up enough stuff to cause
that. I'm not saying I think your program is wrong; I'm just trying to
figure out where the fat file(s) is/are so I can decide whether I really
want it/them backed up. (The script that notifies me of the problem also
had an error (Debian Testing), but I haven't had time to look at that
yet.)

I started looking around last night, and I seem to have some rather
large files in my wife's email-related folders: the 'caught spam' folder
(which only I check), where she gets a couple of hundred spam messages a
day, and the files related to SpamAssassin. So I'm thinking they were the
culprits; it may be that those files get some big changes which spook the
delta algorithm.

I'm thinking of using *find* to try and hunt down the large files in
those directories, but I guess even some stats in the syslog would be
helpful, e.g. the number of files backed up, and maybe some size stats
(smallest, largest, average, for both file size and delta size), so I
could try to work out whether 1000 small files had changed or one huge
file had somehow changed.
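
To make that concrete, the quick-and-dirty script below is roughly what I
have in mind on the client side; the path and the limits are just made-up
examples:

  #!/usr/bin/env python
  # Quick sketch: list the biggest files under a directory, plus some
  # simple size stats for files modified in the last 24 hours.
  import os, time

  ROOT = "/home/user/Mail"   # hypothetical path; point it at a backed-up dir
  TOP_N = 20
  CUTOFF = time.time() - 24 * 3600

  sizes = []     # (size, path) for every file found
  recent = []    # sizes of files changed in the last 24 hours

  for dirpath, dirnames, filenames in os.walk(ROOT):
      for name in filenames:
          path = os.path.join(dirpath, name)
          try:
              st = os.stat(path)
          except OSError:
              continue
          sizes.append((st.st_size, path))
          if st.st_mtime >= CUTOFF:
              recent.append(st.st_size)

  sizes.sort()
  sizes.reverse()
  print("Largest files:")
  for size, path in sizes[:TOP_N]:
      print("%12d  %s" % (size, path))

  if recent:
      print("Files changed in the last 24h: %d" % len(recent))
      print("  smallest %d, largest %d, average %d bytes" %
            (min(recent), max(recent), sum(recent) // len(recent)))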

This sort of thing would be helpful for anyone trying to manage their
data on a third-party data store (as you are for me). I'd be particularly
interested in doing, say, trend analysis to find out what was happening
and to look for ominous changes, or even to predict when I'd hit the wall
and need to *purchase* more space from the data store provider.
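
For instance, even a crude straight-line fit over daily usage figures
(say, numbers noted down from the new usage command) would give a rough
estimate; the figures below are entirely made up:

  # Crude trend sketch: fit a line to daily "space used" samples and
  # estimate when the account limit would be hit. All numbers invented.

  samples = [(0, 410000), (1, 412500), (2, 415200), (3, 417800)]  # (day, used)
  limit = 450000                                                  # store limit

  n = len(samples)
  sx = sum(d for d, u in samples)
  sy = sum(u for d, u in samples)
  sxx = sum(d * d for d, u in samples)
  sxy = sum(d * u for d, u in samples)

  slope = float(n * sxy - sx * sy) / (n * sxx - sx * sx)   # usage per day
  intercept = (sy - slope * sx) / n

  if slope > 0:
      days_left = (limit - intercept) / slope
      print("Roughly %.1f days until the limit at this rate" % days_left)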

Part of my interest is also related to bandwidth usage. I'm on an ADSL
connection, but some of the friends who would be backing up to me would
be on dialup, so being able to manage the bandwidth usage would be
helpful. Knowing what sort of changes were happening would help there too
(you could try having multiple backup sets, or at least know *why* a huge
upload happened). Another possible option would be deferring large deltas
until a set time while letting small changes through, although that could
be risky and complicated.
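
Just to make that last idea concrete, the decision itself could be
something as simple as the sketch below (an entirely hypothetical policy,
not anything bbackupd actually does); I suspect the risky part is
everything around it:

  # Hypothetical "defer big uploads" policy: small changes go straight
  # away, anything over a size threshold waits for an off-peak window.
  import time

  SIZE_LIMIT = 5 * 1024 * 1024        # 5 MB: upload immediately below this
  OFF_PEAK_HOURS = (1, 2, 3, 4, 5)    # 01:00 to 05:59 local time

  def should_upload_now(delta_bytes):
      # Small deltas always go now; big ones only during off-peak hours.
      if delta_bytes <= SIZE_LIMIT:
          return True
      return time.localtime().tm_hour in OFF_PEAK_HOURS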

Rick