[Box Backup] Irregular Expressions

Johan Allard boxbackup@fluffy.co.uk
Mon, 24 Sep 2007 06:48:27 +1000


On 24/09/2007, at 5:55 AM, Chris Wilson wrote:

> Hi all,
>
> Ben rightly pointed out that my assumptions about Box Backup's support  
> for regular expressions were wrong. The current state is that we  
> use POSIX regular expressions where available (all POSIX Unix  
> platforms) and Perl-compatible regular expressions through PCRE on  
> Windows.
>
> This situation is not ideal for three reasons: anyone who uses or  
> supports both platforms has to remember that the regular expression  
> styles are different; we can't use the much more powerful Perl- 
> compatible regular expressions on Unix; and it's possible (but  
> unlikely) that someone with a weirdly broken Unix system might end  
> up with a Box Backup build that supports PCRE instead of POSIX,  
> confusing the hell out of us all.
>
> There are several options for how to fix this, none of them ideal,  
> and I wanted to see what opinions people (users, developers and  
> packagers) had about them.
>
> 1. Do nothing. The worst option (in my view) except that it takes  
> no time.
>
> 2. Use PCRE everywhere and add it as a dependency. People with  
> platforms with no native PCRE package, and embedded system  
> developers, might hate us for this (Solaris? AIX?) but you could  
> still build Box with no regular expression support at all.
>
> 3. Use PCRE everywhere and bundle it with Box Backup. Bloats the  
> source by about 1MB compressed, 3MB uncompressed.
>
> 4. Use PCRE in preference to POSIX regular expressions if  
> available. Some platforms would still only have POSIX regular  
> expressions, but I think they would be few and rare. We would  
> probably have to add a command-line switch to bbackupd to report  
> what kind of regular expression support it had been built with.  
> Still risks confusing everyone, but in most cases better than the  
> current situation IMHO, because most people would have the more  
> powerful PCRE support.
>
> 5. Provide a configure-time switch to choose between POSIX and PCRE  
> expressions. Packagers could then provide both versions of the  
> packages for platforms where it made sense to do so. Still risks  
> confusing users, but at least they would hopefully know which  
> version of the package they had installed, and we could also add  
> the command-line switch described in (4).
>
> I'm also considering whether it's really worth providing the option  
> to build Box Backup with NO regular expression support. The only  
> case where I think this might actually be useful is for embedded  
> systems. But, I don't know if anybody actually does this or plans  
> to do it. And it's more likely that such devices would run bbstored  
> rather than bbackupd, so we could choose only to build bbstored in  
> that case.
>
> Any other options? Opinions about any of these? Let us know.
>
> Cheers, Chris.

I cast my vote for #2.

Agree that #1 would be the worst option. #3 complicates things, since  
we would have to keep the bundled dependencies up to date with the  
upstream source (and who says PCRE would be the only bundled library  
in future if this route is chosen?). #4 would be even worse than now,  
because at least the current behaviour is deterministic. #5 is a  
possible solution, but too complicated IMHO.

In any case, if any option is chosen where both styles of regexp could  
be in place, I think that a start-time warning needs to be  
implemented - "bbackupquery has detected PCRE style regular expressions  
in /etc/box/bbackupd.conf and bbackupquery is compiled with POSIX  
style regular expressions", so at least the user has a chance of  
knowing what's happened if things don't work as expected.
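
To make the start-time warning idea a bit more concrete, here is a rough
sketch of how a POSIX-only build might guess that a configured pattern was
written for PCRE. LooksLikePcreOnly is a hypothetical helper, not an
existing Box Backup function, and the list of constructs it checks is my
own assumption about what is most likely to trip people up. It is only a
heuristic: it does not understand bracket expressions, so a pattern such
as [*?] would be flagged by mistake.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Box Backup): returns true if the pattern
// contains syntax that PCRE accepts but that is not portable POSIX regex
// syntax. Heuristic only; bracket expressions are not parsed.
bool LooksLikePcreOnly(const std::string& pattern)
{
    const std::string perlEscapes = "dDwWsSbB"; // \d, \w, \s, \b and friends

    for (std::size_t i = 0; i + 1 < pattern.size(); ++i)
    {
        const char c = pattern[i];
        const char next = pattern[i + 1];

        if (c == '\\')
        {
            // Perl-style character class or word-boundary escape.
            if (perlEscapes.find(next) != std::string::npos)
            {
                return true;
            }
            ++i; // skip the escaped character (e.g. "\\." or "\\\\")
            continue;
        }

        // Group extensions such as (?:...), (?=...), (?!...), (?i).
        if (c == '(' && next == '?')
        {
            return true;
        }

        // Lazy quantifiers: *?, +?, ??
        if ((c == '*' || c == '+' || c == '?') && next == '?')
        {
            return true;
        }
    }
    return false;
}
```

A POSIX-only build could run each exclude pattern from the config file
through a check like this at startup and print the warning when it
returns true. A PCRE build would not need the check for PCRE-style
patterns, though POSIX-style patterns that mean something different under
PCRE would need a separate test.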

I vote for #2 because I think it's better to keep things simple, and  
to add the option to build without regexp support for what's surely  
the minority of people who can't/won't get PCRE working on their  
platform.

Cheers

Johan