[Box Backup] Failed to get write lock

Brian Burton boxbackup@fluffy.co.uk
Tue, 01 May 2007 19:15:41 -0400


After browsing the source a bit, it appears that this is not actually a communication issue
but rather a failure in the NamedLock::TryAndGetLock() member function.  At first I thought
that perhaps the lock file could not be created (file permissions or something), but I do
see a file that I'm guessing is the correct one.  Its time stamp seems to correspond to the
error message in the log.

# ls -l /backups/data/backup/00000042/write.lock
-rw-------  1 backups backups 0 May  1 19:00 /backups/data/backup/00000042/write.lock
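
Note that if this is an advisory lock, the file's presence alone does not say whether some
process currently holds it.  Here is a minimal sketch of how one could probe that, assuming
the lock is taken with flock() (I have not confirmed which mechanism NamedLock uses on this
platform).  probe_lock and LockState are hypothetical names of mine, not Box Backup code.

```cpp
// Sketch only: assumes an advisory flock() lock, which may not match NamedLock.
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

enum class LockState { Free, Held, Error };

// Hypothetical helper: reports whether another open file description
// currently holds an flock() lock on the file at 'path'.
LockState probe_lock(const char *path)
{
    assert(path != nullptr);

    int fd = ::open(path, O_RDWR);
    if (fd == -1)
    {
        return LockState::Error;   // missing file, permissions, etc.
    }

    LockState state;
    if (::flock(fd, LOCK_EX | LOCK_NB) == 0)
    {
        // Nobody held it; release immediately so the probe leaves no trace.
        ::flock(fd, LOCK_UN);
        state = LockState::Free;
    }
    else if (errno == EWOULDBLOCK)
    {
        state = LockState::Held;   // someone else holds the lock
    }
    else
    {
        state = LockState::Error;
    }

    ::close(fd);
    return state;
}
```

The probe briefly takes the lock itself when it is free, so it could make a real locker fail
if the two race.  It is meant for diagnosis by hand, not for anything running alongside the
server.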

As a sanity check I ran ps on the client and discovered two bbackupd processes running.
I killed them both (one refused to terminate so I had to insist) and started a new
process.  Now I see only one in the ps output and no lock file errors in the server log.
So it appears I may have accidentally started the client twice.  Maybe the first process
had locked up while holding the lock; there was nothing in the server log file to
indicate that it was sending or receiving any data.

This does lead to a feature suggestion.  Perhaps bbackupd could take a lock file of its own
and refuse to start if another process is already holding the lock?  That would have saved
a fair amount of confusion and log file growth, for me at least.
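
To make the suggestion concrete, here is a rough sketch of the kind of start-up guard I have
in mind.  It assumes a POSIX system with flock(); acquire_instance_lock and the lock file
path are hypothetical, and nothing here is taken from the bbackupd source.

```cpp
// Sketch only: single-instance guard using an advisory flock() lock.
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper: returns a descriptor that holds the lock, or -1 if
// the file cannot be opened or another instance already holds the lock.
int acquire_instance_lock(const char *path)
{
    assert(path != nullptr);

    // Create the lock file if it does not exist yet.
    int fd = ::open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
    {
        return -1;
    }

    // Non-blocking attempt: fail at once rather than wait for the holder.
    if (::flock(fd, LOCK_EX | LOCK_NB) == -1)
    {
        ::close(fd);
        return -1;
    }

    // Caller keeps fd open for the life of the process; the kernel drops
    // the lock when the descriptor is closed or the process exits.
    return fd;
}
```

The daemon would call this once at start-up and exit with an error if it gets -1.  Since the
kernel releases the lock when the process dies, a crashed daemon would not leave a stale lock
behind, while a hung one (as in my case) would still block a second copy from starting.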

In any case, thanks again for an excellent open source backup solution!

All the best,
++Brian