[Box Backup] Re: Block Sizes and Diffing (was: Re: [Box Backup] error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry)

Mon, 10 Sep 2007 16:09:25 +0200

Hi Chris!

Am Freitag, den 07.09.2007, 21:09 +0100 schrieb Chris Wilson:
> Hi Johann,
> [...]
>
> > Why is bbackupd reading this file so often?
> 
> That is a very good question. If the reads were powers of two, then I'd 
> suggest it was due to diffing at different block sizes, but that doesn't 
> seem to be the case here. Perhaps this is why your backups are taking 
> so long? I'll investigate.
> 
> I assume that you're running 0.10 on the client?

Yes, I do. The client and the server are the same machine and they use
the inofficial Debian packages from http://debian.myreseau.org/.

Am Sonntag, den 09.09.2007, 20:59 +0100 schrieb Chris Wilson:
> [...]
> By the way, did adding the KeepAliveTime option fix this timeout for you?

Yes it works, we have no timeouts any more. Thanks for the hint.

> > Why is bbackupd reading this file so often?
> 
> I have a theory that I'd appreciate your help in testing out. It seems 
> that Box Backup doesn't use a single block size for the entire file. 
> Rather, if it detects a chunk that's been changed or added, it can use a 
> different block size specifically for that chunk. If this happens a lot, 
> then the file can end up with a wide mixture of block sizes.
> 
> For every block size that we want to use in diffing, we have to calculate 
> the checksums of every block with that size across the entire file. Since 
> we don't want to hold the entire file in memory, we read it repeatedly 
> (there is probably a more efficient way to do this by calculating all the 
> checksums while reading the file just once).
> 
> Box Backup will pick the most frequently used block sizes in the file, up 
> to 64 (!) of them, when deciding what sizes to calculate the checksums 
> for. That means it will read the entire file up to 64 times.
> 
> There is clearly a tradeoff between bandwidth efficiency and local system 
> efficiency with this principle, where the scale factor is particularly 
> high with this algorithm.
> 
> It should be slightly better in recent development versions, since I added 
> a buffering stream wrapper that always reads 4k chunks to reduce the 
> number of system calls, but it could definitely be better.
> 
> Anyway, you can check at least part of the theory as follows: download the 
> latest version of that file as an encrypted object from the store by using 
> bbackupquery's "getobject" command; run "bbackupobjdump" on it like this:
> 
>    bbackupobjdump bigfile.2.obj | sort -k4 | uniq -f3 -c | grep s= \
>    | sort -r -g
> 
> and the output will look like this:
> 
>     1280        0 this  s=    4129
>        1     1280 this  s=     545
> 
> The last number of each line is the block size, and the first is the 
> number of times that it occurs. They will be sorted in descending order. I 
> expect that you will get a lot of lines. They won't correspond exactly to 
> the read sizes you're seeing, because these are compressend, encrypted 
> blocks and the reads are for uncompressed, unencrypted blocks on the 
> server, but the number of different sizes that you see should correspond 
> (roughly) to the number of times the file is read (up to 64).

The output consists of 1216 lines and starts with:
    642     1069 this  s=      49
    407      107 this  s=     177
    336     1065 this  s=    1633
    296     1363 this  s=    1649
    277     1159 this  s=    1617
    265     1424 this  s=    1681
    254     2706 this  s=    1601
    248     1011 this  s=    1665
    248     1005 this  s=      33
    246     1015 this  s=    1745
    219     1027 this  s=    1729
    205     1226 this  s=    1697
    200     1006 this  s=    1713
    194     1012 this  s=    1585
    156     1303 this  s=    1569
    155     1014 this  s=    1761
    141     1016 this  s=    2017
    114     1020 this  s=    2033
    111     1246 this  s=    1553
    103     1025 this  s=    1777
(and all following are <100 for the first column)

> I can see three options, in order of increasing work and effectiveness:
> 
> * accept that Box Backup is designed for bandwidth efficiency, not local
>    file efficiency, and don't change anything;
> 
> * make the number of block sizes checked a tunable parameter, possibly
>    changing the default from 64 to something else;
> 
> * change the patching algorithm to try to use an existing block size from
>    the file if it find a close match to the "optimal" one;
> 
> * change the checksum algorithm to calculate all block size checksums in a
>    single pass.
> 
> The last two options are the ones that I feel least confident about, and 
> I'd appreciate some help or at least sanity checking of a patch from Ben, 
> Martin or Chromi (who wrote the current diffing algorithm if I understand 
> correctly).
> 
> On the other hand, you might find that the file doesn't have a range of 
> different block sizes, and so I need to go back to the drawing board for 
> another theory.

Unfortunately I don't understand BoxBackup's diffing and block-size
algorithms, so I don't know what to conclude from my above listing. :-)

Do I understand correctly, that BoxBackup tries to find the smallest
possible block size to transmit (and store) changes?

Again, regarding block sizes: From the strace I've sent last friday, we
can see that it always starts at position 0 and starts with reading
blocks of the very same size. So, the possible blocks always have to be
aligned to a position in the file which is a multiple of their block
size.

OTOH I have a suggestion for the block algorithm. The block size can be
defined to a fixed size, e.g. 4k and you can accept that up to 4095
bytes might be duplicate. In my opinion wasting a few kB is tolerable.

There is still a problem: insertions or deletions in the file can't be
identified this way. Imagine a single byte insertion at the very
beginning of the file. Then every 4k-aligned block will have changed and
the whole file needs to be updated. This problem has already been
addressed by rsync and is described at
http://rsync.samba.org/tech_report/ ("The rsync algorithm" and "Rolling
Checksum"). 

BTW: The (inofficial) BoxBackup Debian package from
http://debian.myreseau.org/ doesn't contain the bbackupobjdump tool, so
I checked out the trunk directory from the SVN. It doesn't build
bbackupobjdump automatically, so I had to trick a little bit. How can I
do this "beautifully"?

Bye
  Hansi

-- 
Johann Glaser                          <glaser@ict.tuwien.ac.at>
             Institute of Computer Technology, E384
Vienna University of Technology, Gusshausstr. 27-29, A-1040 Wien
Phone: ++43/1/58801-38444                Fax: ++43/1/58801-38499