[Box Backup] 100% CPU on big directories

Ben Summers boxbackup@fluffy.co.uk
Tue, 14 Feb 2006 19:51:28 +0000


On 14 Feb 2006, at 12:00, Jamie Neil wrote:

> Ben Summers wrote:
>>> Yes, I know it's not clever to have a flat directory with 150,000
>>> files in it :), but I've noticed the effect to a smaller degree on
>>> directories with only 1000 files. Does anyone know if it's fixable?
>>
>> I can think of a couple of places in the code which wouldn't
>> appreciate this. It might be possible to get a speed-up by using a
>> hash_map instead of a map in critical places.
>
> I removed a large amount of the files in the offending directories
> which brought the number of files in each directory down to about
> 30,000. The utilisation still hit 100%, but it did complete this time
> (took about 8 hours to back up 446MB).

Sounds like that's the problem then.
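
To illustrate the hash_map suggestion quoted above, here is a minimal
sketch in modern C++. It assumes std::unordered_map as the standard
successor to the old hash_map extension. The names MakeNames, CountHits,
TreeIndex and HashIndex are hypothetical and do not come from the Box
Backup source; the sketch only shows the container swap, not where the
real hot spots are. A std::map lookup costs O(log n) string comparisons,
while a std::unordered_map lookup is O(1) on average.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical index types: filename -> position in the directory listing.
using TreeIndex = std::map<std::string, std::size_t>;            // ordered tree
using HashIndex = std::unordered_map<std::string, std::size_t>;  // hash table

// Hypothetical stand-in for a large flat directory: "file0" .. "fileN-1".
std::vector<std::string> MakeNames(std::size_t count)
{
    std::vector<std::string> names;
    names.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        names.push_back("file" + std::to_string(i));
    }
    return names;
}

// Build an index over the names, then look every name up again and
// count how many were found. Index may be TreeIndex or HashIndex; the
// calling code is identical, only the lookup cost differs.
template <typename Index>
std::size_t CountHits(const std::vector<std::string>& names)
{
    Index index;
    for (std::size_t i = 0; i < names.size(); ++i)
    {
        index[names[i]] = i;
    }

    std::size_t hits = 0;
    for (const std::string& name : names)
    {
        if (index.find(name) != index.end())
        {
            ++hits;
        }
    }
    return hits;
}
```

Calling CountHits<TreeIndex>(MakeNames(150000)) and
CountHits<HashIndex>(MakeNames(150000)) under a timer gives a rough
feel for the difference. A swap like this only helps if lookups really
are the bottleneck, which would need profiling to confirm.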


>
>> I've added a note to the wiki about this to make sure we think about
>> it in the next round of changes.
>>
>> Sorry not to be more helpful now.
>
> That's ok - the problem is avoidable in most situations. It might
> present a problem where you can't control what people are backing up
> though.

We're planning to redesign the way files are selected for backup.  
Making it more efficient for the nasty edge cases is on the list.

Ben