[Box Backup] BoxBackup Server Side Management Specs (Draft 0.01)

Per Thomsen boxbackup@fluffy.co.uk
Tue, 21 Sep 2004 22:27:39 -0700


All,
Here are the things I have come up with from a user's perspective as
requirements/specifications for server-side management.

This is very rough and not necessarily thought through at all levels,
but I thought I'd get it out there, so everyone could get a chance to
see it, and shoot down / change / add to what's in here.

1. Symbolic Names for hosts.
In addition to using a hex number to identify backup clients, a mapping
must be put in place between the account number and a name chosen by
the user (backup admin). This name can be up to 255 characters long
(RFC 1034).

For backwards compatibility, account numbers must still be accessible,
and usable just as before. If needed, command-line switches can be
used to denote one or the other. Additionally, for management
purposes it should be possible to explicitly set the account number
during the creation of a new client (as well as the symbolic name).

While the name can be arbitrary, the installation process should attempt
to determine the full domain name of the host, and use that value as the
default.
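
To make the mapping concrete, here is a rough Python sketch. None of
these names exist in Box Backup; the AccountNames class and its methods
are made up purely for illustration, and the real thing would live in
the C++ code base.

```python
import socket

MAX_NAME_LEN = 255  # limit quoted above (RFC 1034)


class AccountNames:
    """Hypothetical two-way map between account numbers and names."""

    def __init__(self):
        self._by_id = {}
        self._by_name = {}

    def add(self, account_id, name=None):
        # Default to the host's fully qualified domain name, as proposed.
        if name is None:
            name = socket.getfqdn()
        if not name or len(name) > MAX_NAME_LEN:
            raise ValueError("name must be 1..255 characters")
        if account_id in self._by_id or name in self._by_name:
            raise ValueError("account number or name already in use")
        self._by_id[account_id] = name
        self._by_name[name] = account_id

    def resolve(self, key):
        """Accept either a symbolic name or a hex account number."""
        if key in self._by_name:
            return self._by_name[key]
        try:
            account_id = int(key, 16)
        except ValueError:
            raise KeyError(key) from None
        if account_id not in self._by_id:
            raise KeyError(key)
        return account_id
```

Note that a name such as "beef" is also a valid hex number, which is
one reason an explicit switch to say "this is a name" or "this is an
account number" may be needed. The sketch simply tries the name first.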

2. Client groups.
To support the distinction of groups of boxes being backed up, the
concept of groups of clients should be implemented. Groups are
collections of clients or other groups (with appropriate
circular-dependency checking).

The 'bbstoreaccounts' executable will gain functionality to manage
groups, including getting statistics from a group. The statistics will
be the cumulative values of the same data as for a single client, with
the addition of the following:

- List of group members
- ?
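
The circular-dependency check could look something like the sketch
below. It is Python for illustration only; GroupRegistry and its
methods are hypothetical, and it assumes clients and groups share one
namespace, which the draft does not decide.

```python
class GroupRegistry:
    """Hypothetical store of groups whose members are clients or groups."""

    def __init__(self):
        self.members = {}  # group name -> set of member names

    def _reaches(self, start, target):
        """True if `target` is reachable from `start` via membership."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.members.get(node, ()))
        return False

    def add_member(self, group, member):
        # Refuse a member that already contains `group`, directly or
        # indirectly (this also rejects adding a group to itself).
        if self._reaches(member, group):
            raise ValueError(f"adding {member} to {group} creates a cycle")
        self.members.setdefault(group, set()).add(member)

    def clients(self, group):
        """Flatten a group into the set of clients it ultimately contains."""
        found, stack, seen = set(), [group], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in self.members:
                stack.extend(self.members[node])
            else:
                found.add(node)
        return found
```

Cumulative group statistics would then be a sum over clients(group),
so a client that is reachable through two sub-groups is counted once.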

3. Client Monitoring.
The client should be able to send 'heart beat' messages to the bbstored
server.

Configuration information for heart beat is kept on the bbstored server.
It includes:

For each client:
- on/off switch. Clients can be monitored, but are not required to be.
This is especially useful for mobile users, who are not connected to the
internet all the time.
- Heart beat interval. How often the client sends heart beat
information. Given in seconds. Default is 900.

Heart beat messages will not interfere with long-running syncs or
restores (large files), but will be inserted as close to the interval
as possible, to ensure that as few false error alerts as possible are
sent to administrators.

When bbackupd starts, it will register with the bbstored server, and
request its configuration information. It will use this to send the
messages at the appointed times.

Also, bbstored will create a record of the now running bbackupd (in
memory, mmap, or whatever works best), to hold the data for the
statistics, as well as to ensure that only clients that have registered
are being monitored.

When a bbackupd daemon completes an orderly shutdown, it will
'de-register' itself from the bbstored service, to ensure that no false
'down alerts' go out. However, if bbackupd dies as a result of some
failure, the record on the bbstored server will remain, and eventually
cause alerts to go out to the backup administrator.

The heart beat packet contains the following information:

- Host identifier (name and/or account number)
- uptime (i.e. how long has bbackupd been running on this host)
- timestamp of last sync (when was the last file uploaded)
- Number of bytes synced since last heart beat message
- Number of bytes restored since last heart beat message
- any significant errors that have occurred since last heart beat.
- ?
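
As a rough illustration of the packet contents listed above, here is
a Python sketch. The Heartbeat class, its field names, and the JSON
encoding are all assumptions for the sake of the example; the draft
does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Heartbeat:
    """Hypothetical heart beat record; field names are illustrative."""
    host: str                # name and/or account number
    uptime_s: int            # how long bbackupd has been running
    last_sync: float         # timestamp of the last uploaded file
    bytes_synced: int = 0    # since the previous heart beat
    bytes_restored: int = 0  # since the previous heart beat
    errors: list = field(default_factory=list)  # significant errors

    def encode(self) -> bytes:
        # JSON is only a stand-in for whatever encoding is chosen.
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def decode(cls, raw: bytes) -> "Heartbeat":
        return cls(**json.loads(raw.decode("utf-8")))
```

The byte counters are deltas since the previous message, so the server
has to add them to its running totals rather than overwrite them.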

On the server side, a daemon (most likely bbstored) receives these heart
beat messages, and keeps track of the status of all clients. It will
keep a running counter of the byte-count statistics for the client, as
well as a log of the significant errors.

When a bbackupd client daemon dies unexpectedly, the bbstored server will
notice that there is no heart beat message from the client after
approximately 2 x the heart beat interval. It will then notify backup
administrators using the NotifySysAdmin.sh mechanism, or one very much
like it.
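
The register / de-register / missed heart beat logic could be sketched
like this. It is Python for illustration only: HeartbeatMonitor is a
made-up name, and the notify callback stands in for whatever ends up
calling NotifySysAdmin.sh or its equivalent.

```python
import time


class HeartbeatMonitor:
    """Hypothetical server-side record of registered clients."""

    def __init__(self, notify, clock=time.time):
        self.notify = notify   # called once per client that goes quiet
        self.clock = clock     # injectable so the logic can be tested
        self.clients = {}      # host -> (interval_s, last_seen)
        self.alerted = set()   # hosts already reported as down

    def register(self, host, interval=900):
        self.clients[host] = (interval, self.clock())
        self.alerted.discard(host)

    def deregister(self, host):
        # Orderly shutdown: forget the client so no alert is sent.
        self.clients.pop(host, None)
        self.alerted.discard(host)

    def heartbeat(self, host):
        # Only registered clients are monitored.
        if host in self.clients:
            interval, _ = self.clients[host]
            self.clients[host] = (interval, self.clock())
            self.alerted.discard(host)

    def check(self):
        """Alert for clients silent for more than 2 x their interval."""
        now = self.clock()
        newly_down = [
            host for host, (interval, seen) in self.clients.items()
            if now - seen > 2 * interval and host not in self.alerted
        ]
        for host in newly_down:
            self.alerted.add(host)
            self.notify(host)
        return newly_down
```

A client that crashes never calls deregister, so its record stays and
check() eventually reports it; a client that shuts down cleanly is
removed and stays silent. Alerting only once per outage is a choice
made here, not something the draft requires.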

When a significant error occurs, and is logged with the server, a
similar notification mechanism will be used to notify the backup
administrator.

4. Space use reporting.
Reporting of space consumption is needed at several levels:

- The entire bbstored server (all RAID volumes being used for backups).
- Each Volume. This is to ensure that no single volume is bearing the
brunt of the load, as well as for planning purposes.
- By Group. This relates to item #2 in my list. It has very similar
reporting requirements to the individual client, with the same additions
as described in the Group section.
- By Individual. This is already available in the current version.
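
A rough sketch of how the four levels could be rolled up from
per-client figures follows. It is Python for illustration; the
space_report function and its input shapes are assumptions, and it
expects nested groups to have been flattened into client lists already.

```python
from collections import defaultdict


def space_report(usage, groups):
    """Roll per-client usage up to group, volume and server level.

    usage:  {client: (volume, bytes_used)}
    groups: {group: [client, ...]}  (already flattened)
    """
    report = {
        "server": 0,
        "volume": defaultdict(int),
        "group": {},
        "client": {},
    }
    for client, (volume, used) in usage.items():
        report["server"] += used
        report["volume"][volume] += used
        report["client"][client] = used
    for group, members in groups.items():
        # Members without a usage record contribute nothing.
        report["group"][group] = sum(
            usage[m][1] for m in set(members) if m in usage
        )
    return report
```

Because a client may belong to several groups, group totals can add up
to more than the server total; only the volume totals are guaranteed
to sum to it.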

5. Account Database.
The ability to store the client account information in a database is
crucial to the stability and scalability of the system. The current
text files should be replaced with a database.
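
Just to show the shape of it, here is a minimal sketch using SQLite
from Python. The table layout, column names, and helper functions are
invented for the example; the draft does not pick a database or a
schema, and a real table would carry more columns than these two.

```python
import sqlite3


def open_accounts(path=":memory:"):
    """Open (and if needed create) a hypothetical accounts database."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS accounts (
               account_id INTEGER PRIMARY KEY,
               name       TEXT NOT NULL UNIQUE
                          CHECK (length(name) BETWEEN 1 AND 255)
           )"""
    )
    return db


def add_account(db, account_id, name):
    # The connection as a context manager commits on success and
    # rolls back if the insert violates a constraint.
    with db:
        db.execute(
            "INSERT INTO accounts (account_id, name) VALUES (?, ?)",
            (account_id, name),
        )


def find_account(db, name):
    """Return the account number for a symbolic name, or None."""
    row = db.execute(
        "SELECT account_id FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

Letting the database enforce uniqueness of both the account number and
the name removes a class of errors that hand-edited text files allow.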

6. Interaction with the rest of the world.
It should be easy to interface with other systems for monitoring and
reporting purposes. Rather than producing only nicely formatted output,
all commands should have an option to format the output either for
human or for script consumption. The script-friendly data could then
be used by products like Nagios (www.nagios.org).
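
As an example of what "script consumption" could mean, the sketch
below renders one client's status either as a small human-readable
report or as a single Nagios-plugin-style line (status text, then
performance data after a '|', with exit codes 0/1/2 for OK, WARNING
and CRITICAL). The client_status function, the "BBACKUP" label, and
the warning threshold of one interval are assumptions for illustration.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def client_status(host, age_s, interval_s, script=False):
    """Return (exit_code, text) for the age of a client's heart beat."""
    if age_s > 2 * interval_s:
        code, label = CRITICAL, "CRITICAL"
    elif age_s > interval_s:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"

    if script:
        # One line, easy to parse; perfdata follows the '|'.
        text = (f"BBACKUP {label} - {host} last heart beat "
                f"{age_s}s ago | age={age_s}s")
    else:
        text = (f"Client:          {host}\n"
                f"Last heart beat: {age_s} seconds ago\n"
                f"Status:          {label}")
    return code, text
```

A wrapper script would print the text and exit with the returned code,
which is all a Nagios check needs.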

As I said, this is a very rough draft. So, go crazy: edit, change, add,
whatever.

Thanks,
Per

-- 
Per Reedtz Thomsen | The Reedtz Corporation | F: 209 883 4119
V: 209 883 4102    |   pthomsen@reedtz.com  | C: 415 425 4025
GPG ID: 1209784F   |  Yahoo! Chat: pthomsen | AIM: pthomsen