RFC: end-to-end compare -aq (Was: Re: [Box Backup] Win32 native client service bbackupd.conf)

Thu, 29 Jun 2006 14:16:23 -0700 (PDT)

Peter,

> BTW, are the attributes the only thing not compared with -aq?

No. As far as I understand, bbackupquery compare -aq compares the content (data) block (chunk)
checksums stored on the server with block checksums generated on the client. bbackupquery examines
each local file, generates its content block checksums, and compares them with the block checksums
downloaded from the server. Attributes are checked as part of this process as well.

> works?  How does it do its job without downloading all of the
> ciphertext from the server and decrypting on the client for comparison

The bbackupd client, during each backup, gives the server not only content blocks but also content
checksums (actually, plaintext checksums). The server stores that information, so it can "know"
whether it holds the same content as the client without looking at the actual plaintext content.

> 1.  Help me understand what is the goal of this end-to-end compare RFC
> that is not satisfied with: 
> bbackupquery compare -aq

Well, the assumption made by the server is that whatever block is on its storage media actually
>>matches<< the content checksum stored along with it. It is theoretically possible for a
server-side content block to get corrupted (hard drive sector corruption, filesystem mess-up,
memory corruption after receiving a stream from a client, a deliberate content spoofing attack,
a meltdown of the server's block-writing logic, to name a few), but compare -aq would never detect
this, since it does not look at the actual content blocks the server holds, only at the checksums
the server received from the client. That's where compare -a comes in - it actually downloads all
content blocks from the server and compares them with the live client content.
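
To make the difference concrete, here is a minimal sketch (not Box Backup code). The names
ToyServer, backup, quick_compare and full_compare are made up for illustration, SHA-256 and a
4-byte block size are arbitrary stand-ins, and the XOR "cipher" is only a placeholder for real
encryption. It shows a checksum-only compare still passing after a stored block is damaged,
while a full download-and-compare catches it.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny so the example is easy to follow


def split_blocks(data):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def checksum(block):
    """Plaintext block checksum (SHA-256 is an assumption, not Box Backup's choice)."""
    return hashlib.sha256(block).hexdigest()


def toy_cipher(block):
    """Placeholder for encryption: XOR with a fixed byte. NOT secure; it is its own inverse."""
    return bytes(b ^ 0x5A for b in block)


class ToyServer:
    """Holds ciphertext blocks plus the plaintext checksums the client sent with them."""

    def __init__(self):
        self.blocks = []
        self.checksums = []


def backup(server, data):
    """Client side: send each encrypted block together with its plaintext checksum."""
    for block in split_blocks(data):
        server.blocks.append(toy_cipher(block))
        server.checksums.append(checksum(block))


def quick_compare(server, local):
    """Like compare -aq: only checksums are compared, stored blocks are never read."""
    return [checksum(b) for b in split_blocks(local)] == server.checksums


def full_compare(server, local):
    """Like compare -a: download every block, decrypt it, compare with the local data."""
    return b"".join(toy_cipher(b) for b in server.blocks) == local


local = b"box backup data"
server = ToyServer()
backup(server, local)
before = (quick_compare(server, local), full_compare(server, local))

server.blocks[1] = b"\x00\x00\x00\x00"  # simulate a damaged sector on the server
after = (quick_compare(server, local), full_compare(server, local))

print(before, after)
```

The quick compare keeps passing because the stored checksums are untouched; only the block
bytes changed, and nothing in that path ever reads them.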

> 3.  One of the tenets of Box Backup is that we only trust the server to
> return the stored ciphertext when requested by the client.  Does this
> RFC encroach on that?

No, everything stays the same, but an additional check is performed: the server compares not only
content block checksums (with whatever the client has), but also the actual storage media
content (what's on the hard drive) with what SHOULD be there. In case of, say, hard drive sector
corruption, the server content would not match [an additional ciphertext content checksum], so
compare -aq would become reliable to a very high statistical probability.

> (4.  Can the ssl layer give us any useful checksums?)

Yes, checksums could be calculated on the server side, but it is somewhat better to calculate them
on the client side and send them with the content blocks - we could then detect any errors from the
moment the content leaves the client to the moment it is written to storage media on the server.
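
A rough sketch of that idea, under stated assumptions: the function names are hypothetical,
SHA-256 stands in for whatever checksum would really be chosen, and the byte strings stand in
for encrypted blocks. The client checksums the ciphertext before sending it, and the server can
later re-check what is on disk against that value without ever holding a key.

```python
import hashlib


def ciphertext_checksum(ciphertext):
    """Checksum over the encrypted bytes (SHA-256 is an assumption for illustration)."""
    return hashlib.sha256(ciphertext).hexdigest()


def client_prepare_upload(ciphertext_blocks):
    """Client side: pair every encrypted block with a checksum computed before sending."""
    return [(block, ciphertext_checksum(block)) for block in ciphertext_blocks]


def server_find_damaged(stored):
    """Server side: indexes of blocks whose bytes no longer match the client's checksum."""
    return [
        index
        for index, (block, expected) in enumerate(stored)
        if ciphertext_checksum(block) != expected
    ]


# Pretend these byte strings are already-encrypted blocks.
upload = client_prepare_upload([b"\x10\x11\x12", b"\x20\x21\x22", b"\x30\x31\x32"])
stored = list(upload)
clean_report = server_find_damaged(stored)

# Simulate corruption of the second block somewhere between client and disk.
_, expected = stored[1]
stored[1] = (b"\x20\xff\x22", expected)
damaged_report = server_find_damaged(stored)

print(clean_report, damaged_report)
```

Because the checksum covers ciphertext only, the server learns nothing about the plaintext, and
a mismatch points at a fault somewhere after the block left the client.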

> Sorry if these are dumb questions.

Not at all. I hope my understanding of the situation is correct (I'm famous for messing up my
logic from time to time 8)) - Ben?

Gary
