[Box Backup-dev] Re: Learning from ZFS (fwd)
Chris Wilson
boxbackup-dev@fluffy.co.uk
Mon, 7 May 2007 19:42:39 +0100 (BST)
Hi Stuart,
> That's very interesting. I'd read about ZFS previously and seen it had
> many of the features that we rely on BB to provide at the moment.
>
> What I'd be most interested in is some roadmap process that would get us
> here, with some (I suspect) FAQs about that process.
Right now might be a bit early for a roadmap, given that we don't know
which road we're taking :-) but I think it's a good idea and we should
produce one (and a FAQ).
> In particular:
>
> 1. Will existing stores be portable to the new store format? I'm pretty
> certain that it's 'no', but that would be a significant issue for existing
> users who have a large amount of legacy remote data and slow data links (eg
> the internet). Perhaps it would at least be possible to convert an existing
> store in a one-way process?
I think the answer would be "No" as well. If we shift to a ZFS filesystem
on the server, we could provide an upgrade path, something like creating a
new account (with a new certificate) and a client program to copy the old
account's contents to the new one. That would probably not be too hard to
test and be confident about.
Sharing both data formats in the same store, or upgrading on the server
without the client's keys, would probably be much harder to test and to
get right.
> 2. What clients and server OS's would be supported? I see there's
> work-in-progress for ZFS over FUSE for Linux, but would this be
> supported for both client and server, and what about other OS's?
I'd like to support a cross-platform solution. My proposal would involve
replacing Box Backup's filesystem code with the ZFS code, running in user
space and tightly integrated, so that neither kernel support nor a native
ZFS filesystem would be required. This is not ideal in many ways, but it is
portable and doesn't require clients to change all their filesystems to ZFS.
> 3. The current network protocol, whilst it has had its share of problems
> with things like timeouts and large files, does work quite well over
> unreliable links such as the internet.
>
> Would a reliance on ZFS mounted remotely over such links cause problems
> with reliability and recovery when those links drop connections? Perhaps
> there are no problems, but I've tried tunnelling network filesystems
> before and have had such issues (principally, /cringe/ when I was trying
> to do that with Samba).
That's a good point, and very important for us to investigate. It would be
particularly interesting (for me) to look at the assumptions that ZFS
makes about the underlying block devices, and see how accurately we can
preserve and guarantee those semantics over an unreliable network.
> 4. Is this independent of any work to release 0.11 in the near future?
Yes, this is definitely independent; it evolved from some discussions we
were having about what 0.20 should be. It will definitely not affect the
upcoming release of 0.11 :-)
> What is the driver for such a significant departure, I wonder? Whilst,
> of course, the development is going to go in whatever direction the
> developers would like, why is the current architecture considered in
> need of replacement?
The store management of the current system is not particularly good. In
particular, we don't have point-in-time filesets which would allow
point-in-time store browsing and restore; the server manages the client's
old and deleted versions of files in a rather cumbersome and intrusive way
(imho); housekeeping is too slow; we want to deprecate and replace
raidfile.
These would all be solved by using a good, reliable, robust versioning
filesystem over a robust network protocol. But such filesystems are
difficult to find. Box implements one, with a lot of code, but it's not
ideal. ZFS implements one too, possibly better, but with even more code!
ZFS is an option that looks particularly interesting at the moment, which
prompted our discussion.
> There has now been a lot of testing and usage of BB over a number of
> years and that would be effectively cancelled for a new architecture
> based on ZFS
Not all of it. We still have good tests that we will continue to run. And
ZFS has hopefully had a lot of testing as well, probably more than Box.
> Don't get me wrong - I'm not against such a departure at all and it
> would give me a chance to play with some new toys, but I'm interested in
> whether there's a reason other than that it will be a nice change for us
> all.
I think so, i.e. it brings useful benefits and simplifies
>
> Stuart
>
> Chris Wilson wrote:
>> Hi all,
>>
>> Ben and I have been discussing how we could possibly use Sun's Solaris ZFS
>> to provide most of the functionality of Box Backup. We could get the rest
>> with a couple of layers above and below ZFS, which would make Box much
>> simpler and give us cool features like versioning and snapshots.
>>
>> We think we should move this to the -dev list so that we can all discuss
>> it.
>>
>> We're discussing two possible approaches, not necessarily incompatible or
>> either-or, just talking points:
>>
>> Option 1. Any filesystem on client, sync to encrypted filesystem (like
>> encfs) on top of ZFS on top of an iSCSI block device (which is mounted from
>> the server, over the net).
>>
>> Advantages: supports any filesystem on client. Very simple server
>> (effectively a network block device) which holds ZFS images which are
>> managed entirely by the client. The server could in principle mount these
>> images locally, but would see only encrypted filenames and data.
>>
>> Disadvantages: more complex and less efficient than 2.
>>
>> Option 2. Client runs encfs on top of local ZFS, synchronises this to
>> remote ZFS periodically, as Ben describes below.
>>
>> Advantages: more efficient (we think), simpler Box code (encrypted
>> filesystem only)
>>
>> Disadvantages: requires kernel support, especially to run ZFS in the kernel
>> in most cases. Requires client to run ZFS filesystems.
>>
>> The message below is part of our discussion. If anyone is lost or needs
>> more introduction to the issue, please ask. I'll reply to Ben's email
>> below, shortly.
>>
>> Cheers, Chris.
--
_ ___ __ _
/ __/ / ,__(_)_ | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |