[Box Backup] Client blocked by stale lockfile?

Chris Wilson boxbackup@fluffy.co.uk
Mon, 28 Feb 2005 21:42:29 +0000 (GMT)


Hi all,

Sorry if this is a stupid question. I discovered that my Box Backup
client was not making backups, but was instead spinning on what looks
like a lockfile:

chris@gcc(boxbackup)$ strace -f ../boxbackup/release/bin/bbackupd/bbackupd ../test/config/bbackupd.conf SINGLEPROCESS
[...]
stat64("../test/clientdata/__db.mnt_.n", {st_mode=S_IFREG|0640, st_size=0, ...}) = 0
open("../test/clientdata/__db.mnt_.n", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0640) = -1 EEXIST (File exists)
open("../test/clientdata/__db.mnt_.n", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0640) = -1 EEXIST (File exists)
open("../test/clientdata/__db.mnt_.n", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0640) = -1 EEXIST (File exists)
stat64("../test/clientdata/mnt_.n", 0xbffdcbf0) = -1 ENOENT (No such file or directory)
[... ad infinitum ...]

I looked in that directory and found what look like the remains of a
dead client:

chris@gcc(boxbackup)$ ls -la ../test/clientdata/
total 8
drwxrwxr-x  2 chris chris 4096 Feb 20 10:53 .
drwxrwxr-x  8 chris chris 4096 Feb 28 21:28 ..
-rw-r-----  1 chris chris    0 Feb 20 10:53 __db.mnt_.n
-rw-------  1 chris chris    0 Feb 20 10:53 last_sync_finish
-rw-------  1 chris chris    0 Feb 28 21:28 last_sync_start

Since I enabled ExtendedLogging, I haven't been able to reproduce the
spinning, but Box still neither clears the stale lock nor resumes
backing up. This could be a nasty surprise for someone one day.

What do you think about implementing a pid-based lockfile scheme, or a 
kernel-based scheme, where stale locks can be detected and removed 
automatically?
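
To make the pid-based idea concrete, here is a rough sketch. It is in
Python for brevity and is not Box Backup code: the helper names
pid_is_alive and acquire_pid_lock are made up for illustration. It also
assumes that an empty or unreadable lockfile (like the zero-byte one in
the listing above) can be treated as stale.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this pid appears to exist."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_pid_lock(path):
    """Hypothetical helper: take the lock at `path`, clearing it if stale.

    Returns True if we now hold the lock, False if a live process does.
    """
    for _ in range(2):  # the second pass only runs after removing a stale lock
        try:
            # Same O_CREAT|O_EXCL pattern as in the strace output.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
        except FileExistsError:
            try:
                with open(path) as f:
                    owner = int(f.read().strip() or 0)
            except (FileNotFoundError, ValueError):
                owner = 0  # vanished or garbage: assume stale
            if pid_is_alive(owner):
                return False
            try:
                os.unlink(path)  # stale lock: remove it and try again
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))  # record the owner for later checks
        return True
    return False
```

This is only a sketch: there is a window between the liveness check and
the unlink in which another process could take the lock, and a pid can
be reused by an unrelated process. A kernel-based lock such as flock()
or fcntl() avoids both problems, because the kernel releases the lock
when the owning process exits.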

Cheers, Chris.
-- 
_ ___ __     _
  / __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |