RFC: end-to-end compare -aq (Was: Re: [Box Backup] Win32 native client service bbackupd.conf)

Gary boxbackup@fluffy.co.uk
Thu, 29 Jun 2006 05:19:57 -0700 (PDT)


>> "Allow for end-to-end server data verification with bbackupquery compare -aq
>> (quick compare), without having to re-download all data (as is the case with
>> bbackupquery compare -a). The feature would require bbstored (server) to read
>> factual data blocks off its storage media and compare those data blocks
>> against checksums both stored locally and sent up from bbackupquery (client).
>> The only undetected failure condition in such a scenario would require both a
>> server data block and a server checksum getting corrupted in such a way as to
>> match a client checksum - the chances of this are beyond astronomical."

> The stored files are split into blocks, and each block is encrypted  
> using a strong random initialisation vector (IV). This is to maintain  

Ahhhhh, didn't think of this. That explains why a block checksum relates to a plaintext, not to a
ciphertext.
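
To make that concrete for myself, here is a minimal sketch of why a per-encryption random IV rules out
comparing ciphertext checksums computed independently on both ends. Everything in it is an
assumption for illustration: toy_encrypt is a hypothetical SHA-256 keystream toy, not the cipher
Box Backup actually uses, and SHA-256 merely stands in for whatever block checksum is really stored.

```python
import hashlib
import os

IV_LEN = 16  # assumed IV size for this toy example


def _keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 over key + IV + counter. Illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # A fresh random IV is drawn for every call and prepended to the output.
    iv = os.urandom(IV_LEN)
    stream = _keystream(key, iv, len(plaintext))
    return iv + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    iv, body = ciphertext[:IV_LEN], ciphertext[IV_LEN:]
    stream = _keystream(key, iv, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


key = b"k" * 32
block = b"same plaintext block"

# Encrypt the identical block twice, as a client would on two separate uploads.
c1 = toy_encrypt(key, block)
c2 = toy_encrypt(key, block)

# Both ciphertexts decrypt to the same plaintext, so a plaintext checksum is stable,
# but the ciphertexts (and therefore their checksums) differ because the IVs differ.
same_plaintext = toy_decrypt(key, c1) == toy_decrypt(key, c2) == block
same_ciphertext_checksum = checksum(c1) == checksum(c2)
```

So a ciphertext checksum is only meaningful for the exact bytes the client uploaded; it cannot be
recomputed later from the plaintext alone.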

> Another thing to remember is that the server will reassemble blocks  
> if the client sends "changes only" for a file. So you can't just  
> record the checksum of the last file you sent.

So the only way out of this would be for the client to include an additional per-block ciphertext
checksum before a data block is sent up to the server (to account for the possibility that the
network-streamed data is not saved properly on the server side in the first place). The server
would have to keep that information in the repository (which implies that a verification-enabled
repository would not be backward-compatible) and use it for a two-stage compare: a plaintext
checksum to verify that client and server content match, and a ciphertext checksum to verify that
the server's copy matches what is actually on its storage media, right?
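
A rough sketch of that two-stage compare, as I picture it. This is not how bbstored works today;
StoredBlock, verify_block and the use of SHA-256 are all hypothetical names and choices made up
for the example.

```python
import hashlib
from dataclasses import dataclass


def checksum(data: bytes) -> str:
    # SHA-256 is an assumption; any strong checksum would do for the sketch.
    return hashlib.sha256(data).hexdigest()


@dataclass
class StoredBlock:
    ciphertext: bytes          # bytes as read back off the server's storage media
    plaintext_checksum: str    # computed by the client before encryption
    ciphertext_checksum: str   # computed by the client after encryption, before upload


def verify_block(stored: StoredBlock, client_plaintext_checksum: str) -> str:
    # Stage 1: does the server hold the same content the client has now?
    if stored.plaintext_checksum != client_plaintext_checksum:
        return "content-mismatch"
    # Stage 2: do the bytes on the storage media still match what the client sent?
    if checksum(stored.ciphertext) != stored.ciphertext_checksum:
        return "storage-corruption"
    return "ok"


# Hypothetical block: the "ciphertext" is just opaque bytes for this example.
plaintext = b"file contents"
ciphertext = b"\x8f\x02opaque-encrypted-bytes"
good = StoredBlock(ciphertext, checksum(plaintext), checksum(ciphertext))

# Simulate a single corrupted byte on the server's disk.
rotted = StoredBlock(b"\x00" + ciphertext[1:], good.plaintext_checksum,
                     good.ciphertext_checksum)

result_good = verify_block(good, checksum(plaintext))
result_rotted = verify_block(rotted, checksum(plaintext))
result_changed = verify_block(good, checksum(b"edited file contents"))
```

The server never needs the key for stage 2, and nothing but checksums crosses the wire for stage 1.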

[... hmmmm, wonder if yet another checksum of a plaintext checksum/a ciphertext checksum
combination would be useful here...]

The only way for this scenario to fail would be a "matching" corruption of a ciphertext checksum
and the actual data block it covers. The exception is when the data a client hands off to the
server is nonsense in the first place (a meltdown of client/server function, logic errors), so
that the client cannot decrypt/reassemble its own blocks at all, even if handed the original data.

> I didn't really consider it worth the effort.

Well, I am sure the paranoid among us, already paying big bucks for RAID arrays of Raptor drives,
would be delighted at this feature ;).

G.

