[Box Backup] Housekeeping not catching up...
Ben Summers
boxbackup@fluffy.co.uk
Tue, 12 Apr 2005 09:47:07 +0100
On 12 Apr 2005, at 02:16, Imran wrote:
> Hey Ben!
>
> I had a problem in January where housekeeping wasn't catching up. I
> think the client kept connecting and interrupting the housekeeping.
> The same thing has happened again.
>
> I turned off bbackupd for the account in question (I think that's
> what I did before), but I still had three other clients running.
> That hasn't helped: after two days the server still hasn't cleaned up
> the accounts. I also ran check+fix on it just in case.
>
> Basically, the client in question has a lot of updates and new files,
> so it fills up and frequently needs cleaning. The other clients have a
> lot of files too, but less frequent updates.
The current implementation of housekeeping does not work well for all
usage patterns, and this is one of the patterns it handles poorly.
The next version will have a re-written housekeeper which doesn't need
to scan everything before deleting. The move to reference counted
objects will make a big difference.
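
To illustrate why reference counting avoids the full scan, here is a toy
sketch in Python. It is not the Box Backup design or code; the class and
method names are hypothetical, and it assumes each stored object simply
records how many directory entries point at it. When the last reference
goes away, the object can be freed on the spot.

```python
class RefCountedStore:
    """Toy object store (hypothetical): each object tracks its referrers."""

    def __init__(self):
        self.sizes = {}      # object id -> size in blocks
        self.refcounts = {}  # object id -> number of entries pointing at it

    def add_reference(self, obj_id, size_blocks):
        """Record one more directory entry pointing at obj_id."""
        if obj_id not in self.refcounts:
            self.sizes[obj_id] = size_blocks
            self.refcounts[obj_id] = 0
        self.refcounts[obj_id] += 1

    def drop_reference(self, obj_id):
        """Remove one reference; return the number of blocks freed."""
        self.refcounts[obj_id] -= 1
        if self.refcounts[obj_id] > 0:
            return 0  # still referenced elsewhere, nothing to free
        # Last reference gone: free immediately, no scan of the store needed.
        del self.refcounts[obj_id]
        return self.sizes.pop(obj_id)

    def blocks_used(self):
        return sum(self.sizes.values())
```

The point of the sketch is only that deletion becomes a local decision
per object, so an interrupted run loses very little work.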
>
> Will one client's connection interrupt housekeeping for all clients,
> or will it only interfere with its own housekeeping?
A connection will only interrupt the housekeeping for its own account.
All others will be unaffected. (You can see this from the logs.)
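
One way to check this from the logs is to tally the interruptions per
account. The following is a rough Python sketch, not a Box Backup tool.
It assumes each syslog entry is on a single unwrapped line and that the
message reads "Housekeeping giving way to connection for account 0x...",
as in the excerpt quoted in this thread; other versions may word it
differently. The function name is made up for illustration.

```python
import re
from collections import Counter

# Message format assumed from the bbstored/hk log lines quoted in this thread.
GIVING_WAY = re.compile(
    r"bbstored/hk\[\d+\]: Housekeeping giving way to connection "
    r"for account (0x[0-9a-fA-F]+)"
)

def count_interruptions(lines):
    """Return a Counter mapping account ID to housekeeping interruptions."""
    counts = Counter()
    for line in lines:
        match = GIVING_WAY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of the syslog file should show interruptions only
against the accounts whose clients connected during housekeeping.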
> Anyway, when it's over its limit, the server shouldn't allow that
> specific client to interrupt its own housekeeping. I also started
> getting a write lock error:
>
> Apr 9 14:53:19 backup bbstored[9274]: Certificate CN: BACKUP-030303
> Apr 9 14:53:23 backup bbstored[9274]: Failed to get write lock (for Client ID 00030303)
> Apr 9 14:53:23 backup bbstored[9274]: in server child, exception Connection TLSReadFailed (Probably a network issue between client and server.) (7/34) -- terminating child
You shouldn't see that.
>
> After this, the client tried again a minute later:
>
> Apr 9 14:54:28 backup bbstored/hk[20596]: Housekeeping giving way to connection for account 0x00030303
> Apr 9 14:54:28 backup bbstored/hk[20596]: Account 0x00030303, removed 3405 blocks (1685 files, 0 dirs) was interrupted
> Apr 9 14:55:03 backup bbstored[20595]: Incoming connection from 209.126.207.39 port 58171 (handling in child 9278)
> Apr 9 14:55:03 backup bbstored[9278]: Certificate CN: BACKUP-030303
> Apr 9 14:55:03 backup bbstored[9278]: Login: Client ID 00030303, Read/Write
> Apr 9 14:55:04 backup bbstored[9278]: Session finished
>
>
> So the server's housekeeping got interrupted, and the server told the
> client that it doesn't have enough space, so nothing was transferred.
> But now the server has to restart housekeeping.
>
> You had mentioned storing additional meta-data on each file or
> directory, which may help to speed up housekeeping on large accounts.
> If that is not hard to implement, that would be cool.
No, it's not terribly hard. To get all the features that everyone
wants, I'm going to have to rewrite a small portion of the backup
engine. It'll fix all the complaints, and get things working for more
usage patterns than I originally designed the thing for.
>
> Also, I think what triggered this specific backlog is that I resized
> several sets of images for a client. About 6 GB of images were shrunk
> to 2 or 3 GB. I guess the server had to mark about 6 GB of images
> (which were anywhere from 300 KB to 3 MB in size) as deleted, and then
> accept another 2 or 3 GB of images (which were now about 70 KB to
> 400 KB in size). While this happened, another customer also uploaded
> about 1 or 2 GB of files to the server.
That is probably going to take a while to clean up.
>
> All this happened about a week ago. The server couldn't catch up on
> its own for two days. I just stopped all the other clients, and I'll
> just wait a day or so for the housekeeping to catch up.
That's the only way at the moment, I'm afraid.
Ben