[Box Backup] Backuping inversely

Chris Wilson boxbackup@fluffy.co.uk
Sat, 26 Jan 2008 14:19:02 +0000 (GMT)

Hi Pablo,

On Sat, 26 Jan 2008, Pablo Fernandez wrote:
> On Fri, 2008-01-25 at 23:18 +0000, Chris Wilson wrote:
> > Why do you want to keep a schedule in the first place? Why not just let 
> > the clients stream backups at their own speed, when they want to? The 
> > server should be perfectly capable of handling that.
> This would allow you to set and manage backup policies from a 
> centralized place. Think bacula director. Think for example in the case 
> where backup is modeled as SaaS and clients are the one that are not 
> supposed to dictate their policy. Given a scenario like this where SLAs 
> are subject to contracts a central management or some sort of control is 
> quite important.

I'm not sure I understand the scenario here. If you're offering backup as 
a service to untrusted clients which are outside your control, then I can 
see that as a server operator you might want to place some constraints on 
your clients' use of the service. For example, you might want to limit 
their disk space usage (already supported), their bandwidth use overall 
(bytes per day) and perhaps the times that they are allowed to log in, to 
manage your bandwidth on the server side.

The latter two are not directly supported at present, but it would be 
quite easy to lock accounts on the server side outside certain times, to 
prevent backups from being initiated outside those times, and also to 
review (parse) your server logs to identify bandwidth used so far that day 
by each client, and lock their account if they go over quota.
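
As a rough illustration of that log-review idea, here is a minimal Python 
sketch. It assumes a made-up log line format ("account=<hex id> 
bytes=<count>"); real bbstored syslog output looks different, so the 
regular expression would need adapting. The function names and the quota 
and time-window parameters are also hypothetical, and the sketch only 
decides whether an account should be locked; how you actually lock it is 
left to the server operator.

```python
import re
from collections import defaultdict
from datetime import time

# HYPOTHETICAL log format, for illustration only. Real bbstored log lines
# differ, so this pattern must be adapted to what your syslog contains.
LINE_RE = re.compile(r"account=(?P<account>[0-9a-fA-F]+) bytes=(?P<bytes>\d+)")


def bytes_per_account(log_lines):
    """Sum the bytes reported for each account in today's log lines."""
    totals = defaultdict(int)
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:  # lines that do not match the pattern are ignored
            totals[match.group("account").lower()] += int(match.group("bytes"))
    return dict(totals)


def in_window(now, start, end):
    """True if 'now' is inside [start, end); the window may wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_lock(used_bytes, quota_bytes, now, start, end):
    """Lock outside the allowed login window, or once the quota is exceeded."""
    return (not in_window(now, start, end)) or used_bytes > quota_bytes
```

Run from cron every few minutes, something like this could feed whatever 
mechanism you use to lock and unlock accounts on the store.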

Then you could tell clients exactly when they should start their backups, 
either at a fixed time (cron job) or via a web service which the 
SyncAllowScript connects to. However, I still think that it's unnecessary 
and perhaps 
unwanted for the server to tell the clients when to back up. Imagine if 
Google Docs would tell you to open a certain document at a certain time, 
whether you actually wanted to or not :-)
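
For the fixed-window case, a SyncAllowScript could be sketched roughly as 
below. This assumes the script is expected to print "now" when a backup 
may start, or otherwise a number of seconds for bbackupd to wait before 
asking again; please check that against the bbackupd.conf documentation 
for your version. The function name and the 01:00 to 05:00 window are 
invented for the example, and a version that asks a web service would 
replace the local clock check with an HTTP request.

```python
from datetime import datetime, time, timedelta


def sync_allow_reply(now, start, end):
    """Return "now" inside the window, else seconds until it next opens.

    ASSUMPTION: the caller prints this value and bbackupd interprets it as
    described in the lead-in; verify against your bbackupd.conf docs.
    """
    today_start = datetime.combine(now.date(), start)
    today_end = datetime.combine(now.date(), end)
    if start <= end:
        if today_start <= now < today_end:
            return "now"
        # Before today's window: wait for it. After it: wait for tomorrow's.
        next_start = today_start if now < today_start else today_start + timedelta(days=1)
    else:
        # Window wraps past midnight, e.g. 22:00 to 04:00.
        if now >= today_start or now < today_end:
            return "now"
        next_start = today_start  # we are between 'end' and 'start' today
    return str(max(1, int((next_start - now).total_seconds())))


if __name__ == "__main__":
    # Hypothetical policy: backups may start between 01:00 and 05:00.
    print(sync_allow_reply(datetime.now(), time(1, 0), time(5, 0)))
```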

We could also add more support for tracking bandwidth used per client, for 
example regular reports by bbstored during the backup to syslog (not just 
at the end of the connection), and/or putting these counters in the 
account info on the store, where they could be manually reset or tracked 
according to some policy that you specify and control on the server. These 
would not be hard to implement.

On the other hand, before you talked about a cluster of computers, which 
presumably are entirely under your control. In that case, you can schedule 
the backups to happen exactly when you want, either by pushing out a 
crontab or by using a SyncAllowScript as described above.

Remember that it may not always be possible for the server to contact the 
client to initiate a backup, for example if the client is behind a 
firewall or NAT and the client's network administrator does not want 
to open the required inbound ports. Client-driven backups can still run in 
that scenario.

I'm not saying that your original idea/request is "wrong" but that there 
may be simpler or more effective ways to achieve the same thing with fewer 
changes to Box Backup, which currently works quite well for its intended 
target audience. It sounds like you're trying to use Box Backup for 
something that it was not designed to do, and complaining that it's not 
quite right for this new purpose. That does not surprise me :-)

> I think this would add to the scalability of the project. I've been 
> looking into a rather large amount of solutions including leaders like 
> Amanda and Bacula and Box Backup is definitely on a great path, I just 
> think it lacks a few things very important for large networks where 
> maintainability is a big issue.

On the contrary, I think that a single, large, centrally managed system 
has as big or bigger problems with scalability (witness the Soviet Union).

Cheers, Chris.
_____ __     _
\  __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Ruby/Perl/SQL Developer |
\ _/_/_/_//_/___/ | We are GNU : free your mind & your software |