[Box Backup] monitoring bbackupd

dm boxbackup@fluffy.co.uk
Sat, 27 Aug 2005 18:44:33 -0500


Alex Howansky wrote:

>I've got BoxBackup clients running on about a dozen servers and I'm finding
>that bbackupd dies every once in a while on one or another of them. While I
>suppose that in itself is a bug, what I'm really looking for is a way to be
>able to tell (from a remote server) if that has happened. I've got an existing
>centralized monitoring system and I was hoping to avoid installing some other
>local monitoring system on each machine. Ideally, I'd like to be able to do a
>remote connect to bbackupd over a TCP/IP port, but seems that bbackupd uses
>sockets only. Any ideas?
>
>I suppose alternatively, that I could use daemontools to keep bbackupd running.
>Does anybody know offhand if fghack works ok in this case? Or is there maybe an
>undocumented switch on bbackupd to prevent it from backgrounding?
>
>Thanks,
>
>  
>
Just one suggestion: you could enable SNMP on each machine and query the 
following OID (hrSWRunName, the table of running process names).  This 
seems to work with both Windows and Linux SNMP daemons (of course, they 
have to be configured to allow this information to be sent).  On a 
closed network this would probably be fine, but over the Internet you'd 
probably want to tunnel it through a secure connection.

HOST-RESOURCES-MIB::hrSWRunName = OID .1.3.6.1.2.1.25.4.2.1.2.$PID

snmpwalk -c community <hostname> hrSWRunName | egrep "bbackupd|bbstored"
or (if you don't have MIB definitions installed)
snmpwalk -c community <hostname> .1.3.6.1.2.1.25.4.2.1.2 | egrep "bbackupd|bbstored"

Box Backup Windows client output:
HOST-RESOURCES-MIB::hrSWRunName.3100 = STRING: "bbackupd.exe"

Box Backup Linux client output:
HOST-RESOURCES-MIB::hrSWRunName.4639 = STRING: "bbackupd"

Box Backup Linux server output:
HOST-RESOURCES-MIB::hrSWRunName.1718 = STRING: "bbstored"
HOST-RESOURCES-MIB::hrSWRunName.1721 = STRING: "bbstored"
HOST-RESOURCES-MIB::hrSWRunName.31881 = STRING: "bbstored"
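
For a centralized monitor, the snmpwalk queries above could be wrapped in 
a small check along these lines.  This is only a sketch: the function 
names count_running and check_host are made up for illustration, the 
"-v 2c" flag assumes the remote daemons accept SNMPv2c, and the host and 
community values are placeholders.

```shell
#!/bin/sh
# count_running NAME
# Reads snmpwalk hrSWRunName output on stdin and prints how many entries
# carry exactly NAME (or NAME.exe, as reported on Windows).
count_running() {
    grep -c -E "STRING: \"$1(\.exe)?\"" || true
}

# check_host HOST COMMUNITY NAME
# Succeeds if at least one process called NAME is listed on HOST.
# The numeric OID is used so that no MIB definitions are required.
check_host() {
    n=$(snmpwalk -v 2c -c "$2" "$1" .1.3.6.1.2.1.25.4.2.1.2 2>/dev/null \
        | count_running "$3")
    [ "$n" -gt 0 ]
}

# Example (placeholder host and community):
#   check_host client1.example.com public bbackupd \
#       || echo "bbackupd is not running on client1"
```

Matching on the quoted process name keeps a process such as "bbackupd-foo" 
from being counted by accident.  An unreachable host yields a count of 
zero, so it is reported as down as well.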

In the long run, it would probably be better to have an agent installed 
on each machine that tries to restart the daemon on its own and, if the 
restart fails, reports back to a central location securely.
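
A very rough sketch of such a local agent, run from cron for example, 
might look like the following.  The function name ensure_running, the 
RECHECK_DELAY variable and the bbackupd path in the example are 
assumptions for illustration, and the way a failure gets reported to the 
central location is left to the caller.

```shell
#!/bin/sh
# ensure_running NAME START_CMD [ARGS...]
# Succeeds if a process named exactly NAME is running.  If it is not,
# runs the caller-supplied start command once, waits RECHECK_DELAY
# seconds (default 2) and checks again.
ensure_running() {
    name=$1
    shift
    pgrep -x "$name" >/dev/null 2>&1 && return 0
    "$@" >/dev/null 2>&1 || true      # try to start the daemon
    sleep "${RECHECK_DELAY:-2}"
    pgrep -x "$name" >/dev/null 2>&1
}

# Example (the install path is an assumption):
#   ensure_running bbackupd /usr/local/bin/bbackupd \
#       || logger -t bbwatch "bbackupd could not be restarted"
```

The logger call only writes to the local syslog; forwarding that to a 
central server securely would still need to be set up separately.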

BTW, it might not be a bad idea to let bbstored and bbackupd send SNMP 
traps to a centralized monitoring server on specific, configurable 
events.  This would make Box Backup more manageable in larger 
configurations, and it could be integrated much more easily into many 
monitoring solutions.

dm