[Box Backup] inode, fsid, and OS dependencies
Ben Summers
boxbackup@fluffy.co.uk
Wed, 15 Dec 2004 20:00:51 +0000
On 14 Dec 2004, at 16:44, Joe Krahn wrote:
> While looking into info about inodes and filesystem IDs, I found some
> compatibility issues, even for POSIX systems. It also made me think
> about OS dependencies in general. So, here's some info/ideas:
>
> When native Win32 support is merged, OS dependency issues will
> increase. If it were me, I would try to keep the code clean by
> consolidating OS-dependent code. So, instead of #ifdef
> WIN32/LINUX/etc., you just make a set of generic functions, then
> include source files for these depending on the OS, such as
> BoxWindows.cpp, BoxPosix.cpp, etc.
The way Nick has done it is to implement just enough of the POSIX API
for it to run. I think this is probably the most effective approach.
Most platforms are POSIX these days anyway, and Win32 isn't too much of
a problem.
>
> If it were me, I would use plain C for the OS functions. An
> alternative C++ approach would be an OperatingSystem class. It might
> be nice to have a POSIX class, then make subclass variants for
> BSD/Linux/etc.
>
> As for filesystem ID (fsid) and inodes, the purpose, I think, is to
> use (fsid,inode) as a globally unique file identifier that does not
> change when renamed. Technically, struct statfs.f_fsid is the unique
> filesystem ID, but this field is not well supported. Struct
> stat.st_dev is well supported, but could change if a disk is added or
> removed. So, the mountpoint is a reasonable compromise.
This is why I use the mount point name for the tracking, not the
filesystem number. The number isn't defined well enough to be trusted.
> But, I would put this in the "OS-dependent" category. It may be
> better to use a filesystem label or GUID/UUID instead, depending on
> the OS.
>
> Inodes are a further problem: Windows doesn't support inodes. (Maybe
> the Longhorn FS will.) It also turns out that inode is not always
> unique in a POSIX system: some file systems have a larger actual
> inode number, and only return a hash of the real inode in stat.st_ino
> (e.g. Coda uses 128-bit inodes). So, inode is also an OS-dependent
> thing.
All that is required is a unique identifier for a file on a disc. Nick
has pointed out that such a thing exists on Win32, and for our
purposes, even a fudged hash will do. The code which uses it is
purposefully tolerant of collisions.
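
As a rough sketch of what "tolerant of collisions" can mean in practice:
treat the identifier as a hint, keep every name that maps to it, and
confirm a candidate before trusting it. InodeTracker and its members are
hypothetical names for this example, and the check is passed in by the
caller; this is not the actual client code.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical tracker for illustration only; not Box Backup's data structure.
class InodeTracker {
public:
    // Records that `name` was last seen with identifier `inode`.
    // Several names may share one identifier (hashed or colliding inodes).
    void Remember(std::uint64_t inode, const std::string& name) {
        mEntries.emplace(inode, name);
    }

    // Looks through every name stored under `inode` and returns the first one
    // that `stillValid` accepts. Returns false if no candidate passes, so a
    // collision costs a failed match rather than a wrong one.
    template <typename Check>
    bool Find(std::uint64_t inode, Check stillValid, std::string& nameOut) const {
        auto range = mEntries.equal_range(inode);
        for (auto it = range.first; it != range.second; ++it) {
            if (stillValid(it->second)) {
                nameOut = it->second;
                return true;
            }
        }
        return false;
    }

private:
    std::unordered_multimap<std::uint64_t, std::string> mEntries;
};
```

With this shape a fudged hash only needs to be reasonably well spread,
since a false hit is rejected by the check and treated as "not found".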
>
> It might be good to have the inode field variable sized. Is that the
> case now? I haven't looked at the way data is stored on the server
> end.
The inode number is only used on the client, and can be whatever size
the port requires.
Ben