[Box Backup] child bbstored process ramping up to 100% cpu during bbackupd sync?

Ben Summers boxbackup@fluffy.co.uk
Tue, 8 Feb 2005 09:50:09 +0000


On 1 Feb 2005, at 19:21, reticent wrote:

> Ben Summers wrote:
>
>>
>> On 31 Jan 2005, at 21:19, reticent wrote:
>>
>>> Ben Summers wrote:
>>>
>>>>
>>>> On 28 Jan 2005, at 22:45, reticent wrote:
>>>>
>>>>
>>>> What you are seeing is a patch being received by the server, and 
>>>> applied. The 20-30kb of data is the patch, then there's a bit of 
>>>> computation required to apply the patch to the existing file, which 
>>>> overwrites the original file.
>>>>
>>>> What is the usage pattern of files on the client? Are they static? 
>>>> Are they being constantly modified? Are files being added 
>>>> regularly?
>>>>
>>> The files are, for the most part, static. There are only a few 
>>> hundred updates/additions per day.
>>> They are also very small, ~1-20kB
>>> No modifications are made to them
>>
>>
>> What's the directory structure like?
>>
>
> there are 70 directories with 5000 subdirectories each

Directories with many entries will not be especially efficient, I'm 
afraid.


>
>>>
>>>
>>> The quoted speed was during a backup. I'll have to take some time to 
>>> do some debugging, as something is definitely not right here.
>>>
>>> I'm getting the feeling that it might be related to the number of 
>>> directories being dealt with. The backup completed (30+ hours 
>>> later, for 1.6 gigs), and I'm seeing the hk process run and eat up 
>>> 100% CPU when checking backup data for the host mentioned in the 
>>> original email.
>>
>>
>> Yes, housekeeping can take a while for large directory structures. I 
>> will be addressing this soon.
>>
>>>
>>> Looking at the lsof output, it seems to spend approximately 6 
>>> minutes opening and closing a single file (which is exactly 234k; 
>>> for some reason all the files I've noticed it working with are 
>>> exactly 234k).
>>
>>
>> Interesting. It's probably the directory file you see being 
>> re-written repeatedly.
>
> The strange thing is that the system eats up 100% CPU both during sync 
> (it seems like it's writing the temporary file over and over again)

The "temporary file" is the directory record. This is modified every 
time the file is written, to ensure the integrity of the backup store.

I suppose it could be written lazily, and use the bbstoreaccounts 
utility to fix things when it goes wrong, but I'm not entirely sure I 
like this idea.
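
As a rough illustration of why rewriting the directory record on every 
write gets expensive for large directories, here is a minimal sketch in 
Python. It is a hypothetical model, not Box Backup code: the function 
names are invented, and the 48 bytes per entry is an assumption obtained 
by dividing the 234k file reported above by 5000 entries, on the guess 
that this file is the record of one 5000-entry directory.

```python
# Hypothetical cost model, not taken from the Box Backup source.
# Assumes the whole directory record is rewritten after each added entry.

def rewrite_cost(entries: int, bytes_per_entry: int) -> int:
    """Total bytes written when the record is rewritten after every add."""
    # After the k-th add the record holds k entries, so that rewrite
    # writes k * bytes_per_entry bytes. Sum over k = 1..entries.
    return bytes_per_entry * entries * (entries + 1) // 2


def lazy_cost(entries: int, bytes_per_entry: int) -> int:
    """Total bytes written when the record is written once at the end."""
    return bytes_per_entry * entries


if __name__ == "__main__":
    entries = 5000        # subdirectories per directory, from the thread
    bytes_per_entry = 48  # assumption: roughly 234k / 5000 entries
    eager = rewrite_cost(entries, bytes_per_entry)
    lazy = lazy_cost(entries, bytes_per_entry)
    print(f"rewrite on every add: {eager} bytes")
    print(f"single lazy write:    {lazy} bytes")
    print(f"ratio:                {eager / lazy:.1f}x")
```

Under these assumptions, filling one 5000-entry directory writes about 
600 MB of directory records in total, against about 240 kB for a single 
write at the end. The lazy variant gives up the integrity guarantee 
described above, which is the trade-off in question.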

>
> and also when running the housekeeping process. Housekeeping has 
> currently been running for 3 days and has consumed 4198 CPU 
> minutes.

Too many files -- I will make the process restartable in the next 
version.


On 1 Feb 2005, at 23:57, reticent wrote:

> OK, I've taken a copy of the entire dataset from the system that I was 
> talking about previously and copied it over to my workstation.
>
> Linux 2.6, 2 GHz P4 (copied from a Linux 2.4, dual 500 MHz P3)
>
> Both the server and the client are running on the same machine
>
> Dataset is 346M in size (on reiserfs, 2.0G on ext3)
> Contains ~350,000 directories and ~8,000 files of 10-60kB
>
>
> Transfer rate is roughly 30-70kB/sec
> Jumping to 200-400kB/sec sporadically

The backup process is not written for speed, as it is assumed it will 
be running over a slower network connection. However, large files 
should be very fast to upload.

>
>
> After three hours, the sync is roughly 40% complete.
>
> 5 minutes into the backup, the bbstored process was using 50-90% of 
> the CPU
>
> Original drive usage of the dataset on a reiser3 fs is 346M
> Drive usage of the dataset on a reiser3 fs after bbackup sync is 1.6G 
> (stored by bbstored)
> (For reference, the same dataset on an ext3 partition takes up 2G)

There is a certain amount of overhead for directories and other 
meta-data. This does seem a lot though. Are you using it in RAID mode?
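
For a sense of scale, the overhead per directory can be estimated from 
the figures quoted above. This is only a back-of-the-envelope sketch in 
Python: it assumes 1G = 1024M, attributes all of the extra space to 
directories (ignoring the per-file meta-data of the ~8,000 files), and 
the function name is made up for illustration.

```python
# Back-of-the-envelope estimate using only figures quoted in this thread.
# Assumption: all extra space is attributed to directories.

def overhead_per_directory_kb(stored_mb: float, source_mb: float,
                              directories: int) -> float:
    """Extra space in the store, in kB, divided evenly across directories."""
    overhead_kb = (stored_mb - source_mb) * 1024
    return overhead_kb / directories


if __name__ == "__main__":
    stored_mb = 1.6 * 1024   # 1.6G in the store after the sync
    source_mb = 346          # 346M original dataset on reiserfs
    directories = 70 * 5000  # 70 directories with 5000 subdirectories each
    per_dir = overhead_per_directory_kb(stored_mb, source_mb, directories)
    print(f"about {per_dir:.2f} kB of overhead per directory")
```

With these inputs the estimate comes out just under 4 kB per directory.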

>
>
> So it seems that the number of directories is (understandably) causing 
> the huge load, making it impossible to use Boxbackup on this dataset.
>
> The amount of space it occupies is a bit surprising, but I know little 
> of the mechanics of Boxbackup, so these are only observations.

It does sound like Box Backup is not the solution for this data set. At 
least, not this version.

Ben