[Box Backup] Re: Mailbox backup is dangerous.

Ben Summers boxbackup@boxbackup.org
Sun, 30 Aug 2009 12:29:25 +0100


Achim wrote:
>
> In the current situation, an admin might do everything correct (like in
> Tom's case), the user might do everything correct and even delete the
> occasional file (that's why we have backup, right), and still important
> data gets deleted (like in Tom's case) by an automated and apparently
> not very well understood function.



Housekeeping chooses files to delete by going through each version of  
each file in the store, and for each version:

   * Adding that version to a sorted list.

   * Removing entries from the tail of that sorted list until the  
space recovered by deleting files in the list would be just over the  
amount of space housekeeping wants to recover.

When the scan is complete, housekeeping deletes versions, starting at
the head of the list, until it has recovered enough space. This is a
relatively memory-efficient way of choosing the files to delete,
because the list never holds much more than is needed to meet the
target.

So, the key to understanding which files are deleted is how this list  
is ordered. Ordering is defined by a standard STL comparison function,  
and it's found here:

http://www.boxbackup.org/trac/browser/box/trunk/bin/bbstored/HousekeepStoreAccount.cpp
   on line 612.

It would be quite easy to adjust this comparison function to prefer
not to remove the last version of a file marked as "deleted", simply
by ranking such versions at the bottom of this list. The actual
implementation is left as an exercise for the reader, in the tradition
of all the best textbooks.
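
For anyone attempting the exercise, the general shape might look like
the sketch below. It is only an illustration of the idea, not the
comparison function in the file linked above: the Candidate struct,
its fields, and the lastDeletedVersion flag are assumptions made up
for this example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical candidate record; NOT the structure used by bbstored.
struct Candidate {
    int objectId;
    int ageWithinMark;        // assumption: 0 = newest version in its mark
    bool lastDeletedVersion;  // assumption: newest version of a "deleted" file
};

// Strict weak ordering: returns true if a should be removed before b.
struct RemoveBefore {
    bool operator()(const Candidate& a, const Candidate& b) const {
        // Protected versions sort to the bottom of the list.
        if (a.lastDeletedVersion != b.lastDeletedVersion)
            return !a.lastDeletedVersion;
        // Otherwise the older version within a mark goes first.
        if (a.ageWithinMark != b.ageWithinMark)
            return a.ageWithinMark > b.ageWithinMark;
        // Object ID as a final tie-break keeps the ordering strict.
        return a.objectId < b.objectId;
    }
};

// Returns object IDs in the order they would be considered for removal.
std::vector<int> removalOrder(std::vector<Candidate> c) {
    std::sort(c.begin(), c.end(), RemoveBefore{});
    std::vector<int> ids;
    for (const Candidate& x : c) ids.push_back(x.objectId);
    return ids;
}
```

Checking the flag first means a protected version is only removed once
every unprotected candidate has gone, which is the "prefer not to"
behaviour rather than a hard guarantee.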

As for snapshots, the "marks" referenced in that function were
intended to be the implementation. When you wanted to take a snapshot,
you would simply increment the mark number for FUTURE files.
Housekeeping would then prefer not to delete the last version of each
file in the snapshot. To restore a snapshot, just use the latest files
in that mark. Obviously snapshots could be implemented better, but
that was the plan.

Regarding the subject of the thread, it could be written more
generally as "running software you don't understand is dangerous",
and we should fix that lack of understanding in this particular case
through better documentation.

Ben