[Box Backup] 100% CPU on big directories
Ben Summers
boxbackup@fluffy.co.uk
Tue, 14 Feb 2006 19:51:28 +0000
On 14 Feb 2006, at 12:00, Jamie Neil wrote:
> Ben Summers wrote:
>>> Yes, I know it's not clever to have a flat directory with 150,000 files
>>> in it :), but I've noticed the effect to a smaller degree on directories
>>> with only 1000 files. Does anyone know if it's fixable?
>>
>> I can think of a couple of places in the code which wouldn't appreciate
>> this. It might be possible to get a speed-up by using a hash_map instead
>> of a map in critical places.
>
> I removed a large amount of the files in the offending directories which
> brought the number of files in each directory down to about 30,000. The
> utilisation still hit 100%, but it did complete this time (took about 8
> hours to back up 446MB).
Sounds like that's the problem then.
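
For illustration only, here is a minimal sketch of the map versus hash_map
idea quoted above. It is not Box Backup code: EntryInfo, BuildIndex,
CountPresent and MakeNames are hypothetical names invented for this example,
and std::unordered_map (standardised in C++11) stands in for the older
hash_map extension. The assumption is that the hot path looks filenames up
in a per-directory index, so an ordered map pays O(log n) string comparisons
per lookup while a hash table pays O(1) on average.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a per-file directory record; not a Box Backup type.
struct EntryInfo {
    long long modificationTime;
    long long sizeInBytes;
};

// Build an index of directory entries keyed by filename.
// MapType may be std::map (ordered tree, O(log n) string comparisons per
// lookup) or std::unordered_map (hash table, O(1) average per lookup).
template <typename MapType>
MapType BuildIndex(const std::vector<std::string> &names)
{
    MapType index;
    for (std::size_t i = 0; i < names.size(); ++i) {
        // The stored values are dummy data; only the key lookup matters here.
        index[names[i]] = EntryInfo{static_cast<long long>(i), 0};
    }
    return index;
}

// Count how many of the queried names are present in the index.
template <typename MapType>
std::size_t CountPresent(const MapType &index,
                         const std::vector<std::string> &queries)
{
    std::size_t found = 0;
    for (const std::string &name : queries) {
        if (index.find(name) != index.end()) {
            ++found;
        }
    }
    return found;
}

// Generate n synthetic filenames such as "file42.dat" to mimic one large
// flat directory.
std::vector<std::string> MakeNames(std::size_t n)
{
    std::vector<std::string> names;
    names.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        names.push_back("file" + std::to_string(i) + ".dat");
    }
    return names;
}
```

Both containers expose the same find() interface, so swapping one for the
other is mostly a change of type, e.g.
BuildIndex<std::map<std::string, EntryInfo> >(names) versus
BuildIndex<std::unordered_map<std::string, EntryInfo> >(names). Whether this
gives a real speed-up depends on where the time actually goes, which would
need profiling to confirm.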
>
>> I've added a note to the wiki about this to make sure we think about it
>> in the next round of changes.
>>
>> Sorry not to be more helpful now.
>
> That's ok - the problem is avoidable in most situations. It might
> present a problem where you can't control what people are backing
> up though.
We're planning to redesign the way files are selected for backup.
Making it more efficient for the nasty edge cases is on the list.
Ben