ZFS: what "the ultimate file system" really means for your desktop -- in plain English!

Ashton Mills
21 June 2007, 2:50 AM


So, Sun's ZFS file system has garnered publicity recently with the announcement of its inclusion in Mac OS X and, more recently, as a module for the Linux kernel. But if you don't read Filesystems Weekly, what is it and what does it mean for you?


Now I may just be showing my geek side a bit here, but file systems are awesome. Aside from the fact our machines would be nothing without them, the science behind them is frequently ingenious.

And ZFS (the Zettabyte File System) is no different. It has quite an extensive feature set just like its peers, but builds on this by adding a new layer of simplicity. According to the official site, ZFS key features are (my summary):

  • Pooled storage -- eliminates the concept of partitions and volumes, everything comes from the same pool regardless of the number of physical disks. Additionally, "combined I/O bandwidth of all devices in the pool is available to all file systems at all times."
  • All operations are copy-on-write transactions -- this is a tad beyond me, but the key phrase here is that there is no need to ever fsck (chkdsk for you Windows people, or Disk Utility repair for Mac heads) the file system.
  • Disk scrubbing -- live online checking of blocks against their checksums, and self-repair of any problems.
  • Pipelined I/O -- I'll let the website explain: "The pipeline operates on I/O dependency graphs and provides scoreboarding, priority, deadline scheduling, out-of-order issue and I/O aggregation." Or in short, it handles loads well.
  • Snapshots -- A 'read-only point-in-time copy of a filesystem', with backups being driven using snapshots to allow, rather impressively, not just full imaging but incremental backups from multiple snapshots. Easy and space efficient.
  • Clones -- clones are described as 'writable copies of a snapshot' and could be used as "an extremely space-efficient way to store many copies of mostly-shared data", which sounds rather like clones referencing snapshots and storing only the data changed from the snapshot. On a network with imaged desktops, for example, this would mean having only one original snapshot image and dozens of clones that contain only modified data, which would indeed be extremely space efficient compared to storing dozens of snapshots.
  • Built-in compression -- granted this is available for most other filesystems, usually as an add-on, but it's a good point raised that compression can save not only space but also I/O time.
  • Simple administration -- one of the biggest selling points. This is quite important because some of the above features, such as pooling, can indirectly be achieved now with other filesystems -- layering LVM, md RAID and Ext3 for example. However, each of these has an entirely different set of commands and tools to set up and administer the devices -- what ZFS promises is that these layers can all be handled from the one toolset, and in a simpler manner, as this PDF tutorial demonstrates.
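
For the curious, the copy-on-write, checksum, snapshot and clone ideas in the list above can be sketched in a few lines of Python. To be clear, this is a toy model of my own and not how ZFS is actually implemented: the names ToyPool, ToyFilesystem and clone are hypothetical, whole files are treated as single blocks, and the scrub here only detects a bad block (real self-repair needs a redundant copy to repair from).

```python
import hashlib
from types import MappingProxyType


class ToyPool:
    """Hypothetical stand-in for a storage pool: one shared set of checksummed blocks."""

    def __init__(self):
        self.blocks = {}   # block id -> (data, checksum recorded at write time)
        self.next_id = 0

    def allocate(self, data):
        # Always write into a fresh block; existing blocks are never overwritten.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = (data, hashlib.sha256(data).hexdigest())
        return block_id

    def read(self, block_id):
        return self.blocks[block_id][0]

    def scrub(self):
        # Re-hash every block and report the ids whose data no longer matches.
        return [bid for bid, (data, checksum) in self.blocks.items()
                if hashlib.sha256(data).hexdigest() != checksum]


class ToyFilesystem:
    """Maps file names to block ids held in a shared ToyPool."""

    def __init__(self, pool, table=None):
        self.pool = pool
        self.table = dict(table or {})

    def write(self, name, data):
        # Copy-on-write: new data goes to a new block, then the name is repointed.
        self.table[name] = self.pool.allocate(data)

    def read(self, name):
        return self.pool.read(self.table[name])

    def snapshot(self):
        # Read-only point-in-time view: copies the small name table, not the data.
        return MappingProxyType(dict(self.table))


def clone(pool, snap):
    # A writable file system that starts out sharing every block with the snapshot.
    return ToyFilesystem(pool, snap)


pool = ToyPool()
fs = ToyFilesystem(pool)
fs.write("report.txt", b"draft 1")
snap = fs.snapshot()
fs.write("report.txt", b"draft 2")   # the "draft 1" block is left untouched
work = clone(pool, snap)
work.write("notes.txt", b"todo")     # only the changed data costs new space
```

In this sketch the snapshot still points at the "draft 1" block after the live file system has moved on, and the clone adds just one new block to the pool rather than copying everything, which is the space saving the feature list is getting at.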

All up, as a geek, it's an exciting file system I'd love to play with -- currently however ZFS is part of Sun's Solaris, and under the CDDL (Common Development and Distribution License), which is actually based on the MPL (Mozilla Public License). As this is incompatible with the GPLv2, the code can't be ported into the Linux kernel. However, this has recently been worked around by porting it across as a FUSE module which, running in userspace, is slow, though there is hope this will improve. Looks like it's time to enable FUSE support in my kernel!

Of course, in a few months' time you could also go for Mac OS X where, in Leopard, ZFS is already supported, and there are rumours Apple may be preparing to adopt it as the default filesystem, replacing the aging HFS+ in the future (but probably not in 10.5).

But then you'd have to use a Mac. Ewww.

(I jest I jest, put down your pitchforks Macites!)

ZFS in Leopard: A popular screenshot from a development release


Comments

Brendan:

I love the concept of ZFS. I noted that you said incompatible with GPLv2. I doubt that it will be compatible with GPLv3 also, and even if it was compatible, much of the kernel is released under GPLv2 specifically, as it is only an option to release it under GPLv2 or later. This means that if the kernel ever transitions to GPLv3, it will take quite a long time.

29 February 2008, 8:31 PM

Replaced:

There's a bit of a misunderstanding here:
It's not the CDDL that is incompatible with the GPL; rather, the GPL is incompatible with the CDDL (and with a lot of other licences). It's not the same thing.

FreeBSD already has a ZFS port (it's working read-write, but far from perfect), because their licence (and licence policy) is compatible with the CDDL.

Maybe you could do the same things with current Linux technologies (LVM, XFS, SoftRAID, etc.) but it would be a pain to deploy and maintain compared to the straight and logical design of ZFS.

Cheers

29 February 2008, 8:44 PM

Pedro M. Rosario Barbosa:

If the GPL is incompatible with the CDDL, then the CDDL is incompatible with the GPL; it *is* the same thing. The CDDL, however, is a free software license, and it has been accepted as such by the Free Software Foundation. In theory that does not prevent GNU/Linux operating systems from using it; what it does prevent is adding ZFS code to, or combining it with, the Linux kernel, which is GPLv2. So, if Gentoo Linux decides to use ZFS, this distro can do it regardless of the license. What Gentoo or any GNU/Linux distro cannot do is combine the code with GPLed code. Not all GNU components are GPLed (think of Xorg, for instance), and not all programs running on GNU/Linux operating systems are GPL compatible (think of the Mozilla Public License, or proprietary licenses).

Also, the fact that FreeBSD added ZFS to ports is irrelevant, because there are different programs available under different (incompatible) licenses which can be installed in FreeBSD. Installation and availability in ports does not mean that the code of programs under different licenses is all mixed with FreeBSD's code (in fact, such an attempt would be impossible).

The real worry here is the use of ZFS if the ideas used in the software are patented.

29 February 2008, 8:44 PM

Chris Samuel:

I've already done some comparisons between ZFS & NTFS-3G (another FUSE filesystem) showing that just because a filesystem is in userspace doesn't mean it has to perform badly.

It's worth keeping in mind that Ricardo hasn't started on tuning ZFS/FUSE for performance yet.

29 February 2008, 8:31 PM

old_misery:

It was released under the CDDL specifically because it was incompatible with GPL2, apparently. Sun have also smothered it with software patents. (clicky)

It sounds really interesting, but the politics and peevishness behind it (if that's what it is) is a bit off-putting.

Would anyone switch operating systems for the sake of a filesystem?

29 February 2008, 8:31 PM

Storage Admin:

I would switch to build an awesome file or backup server.

29 February 2008, 8:44 PM

David C.:

I'm not a GPL expert, so forgive me if I sound like a complete idiot here.

Why does ZFS's license make it impossible to release a Linux version?

Clearly, it would have to be GPL'ed if it was compiled directly into the kernel, but there's no reason it has to be built this way. Linux (at least the RedHat distribution I use) also supports external kernel modules that are loaded at run time.

So why can't ZFS be compiled as such a module? Wouldn't that avoid any GPL problems?

Or has the latest GPL gone so far as to prevent even external kernel modules from having distribution restrictions?

29 February 2008, 8:31 PM

ZBL:

As I understand it (IANAL)...

If ZFS were to be in the Linux kernel, and distributed as such, then the two would have to be distributed under a single licence--thus, there'd have to be a single license both previous licenses could "agree" on under which to be relicensed.

The CDDL and the GPLv2 both allow for similar freedoms when it comes to modifying and redistributing code but, since the two licenses can't agree on a single license under which to relicense, they are incompatible and you can't combine their code.

There've been rumours about Sun releasing OpenSolaris (and ZFS) under the GPLv3--if that were the case, and Linus Torvalds were to release Linux under GPLv3 (he's not too hot on it, but he wants ZFS), the code could be mixed.

29 February 2008, 8:44 PM

Pedro M. Rosario Barbosa:

That's a point I made above, the CDDL does *not* prevent its use in any GNU/Linux distribution. On the contrary, it is a free software/open source license. You are right when you say that ZFS's code cannot be combined with the kernel, but nothing prevents any distribution from using ZFS.

29 February 2008, 8:44 PM

philc:

Since ext3, LVM and RAID provide a comparable set of features, isn't a Linux "port" mostly a matter of writing a configuration application?

The ZFS features look exciting for file servers in a data center. I am not sure I would use it on my personal computers. Actually ext3 is fine for what I do.

29 February 2008, 8:31 PM

Polaris_s0i:

No, not really.
For one, the LVM header on the disk is going to be different, so it may be comparable, but not compatible.
If ZFS used the same disk headers and such as LVM2 then its other features could be implemented as LVM add-ons.

From what I gather from the article, ZFS might be an offshoot of LVM, but not the same thing. From the way it sounds, ZFS has mixed the filesystem layer and volume-management layer together to create some features that would be impossible with LVM.

29 February 2008, 8:44 PM

H.Kwint:

A uniform configuration tool for pooling, LVM, RAID, snapshots, clustering operations, resizing operations and plenty of different filesystems is available, and it's called EVMS.

It's made/maintained by IBM:
http://evms.sourceforge.net/

29 February 2008, 8:44 PM

Daniel Aleksandersen:

It would be great if Linux and Mac could agree on using this format as a standard. It would make a stronger competition to Microsoft's file systems if they could both support the same format.

29 February 2008, 8:31 PM

Shane Hamilton:

No one that is going to make the decision between Microsoft and Linux or Mac is going to even consider the file system in that decision. To us geeks it's very interesting, but once you're at your desktop how often do you really think "Man I love this filesystem"?

Filesystems have no, or at most negligible, competitive value.

29 February 2008, 8:44 PM

Subie:

This fantastic filesystem is top-notch in Solaris (SPARC and x86), OS X, and FreeBSD. Linux: don't get left behind (and that doesn't mean half-speed FUSE).

29 February 2008, 8:31 PM

Shaman:

Not only can FUSE be nearly as fast as any in-kernel file system (go look for tests) and usually loses only tiny amounts of data to latency, but OSX is a microkernel and by definition, runs all file systems as user-space processes. Yep. Go get an understanding of microkernels, plz kthx.

Lastly, Solaris runs ZFS as a quasi-userspace file system too. ZFS is basically a giant blob of FS code which passes messages to the kernel. That's why you can put ZFS in a FUSE module or into Apple's microkernel OS.

Not feeling so smart now, huh?

29 February 2008, 8:44 PM

Subie:

I have ZFS running on it (surely you keep up with the news, right?). It's slow as molasses. Next please?

29 February 2008, 8:44 PM

Epinomous:

No you don't, or you would have known that what you have tried is only a proof of concept with zero optimization. NTFS-3G shows that FUSE can handle speed.

Now, if you actually had some knowledge and not only installed a package for Ubuntu...

So, thanks for playing, try again.

29 February 2008, 8:44 PM

Warren:

As I read down your list of attractive features, I kept looking for encryption, and didn't see it. When every week brings another news story of a stolen laptop, etc. this is a rising issue.


29 February 2008, 8:31 PM

Phillistine:

Can you write an article about file system encryption? I have so many 'newbie' type questions.

If you encrypt data on a disk drive and the drive is stolen, is the decryption key on the same drive? Does the user have to memorize it? Is it generated from the password? How do you make it so that if the whole computer were stolen, your data would still be safe?

How does storage encryption work in practical terms?

29 February 2008, 8:31 PM

Chris Norton:

I've actually been really excited about ZFS for a while now, since it was announced in fact. Most of these features would be extremely welcome in my daily work. I'm not going to switch operating systems to get them though!

The problem with ZFS on Linux is not really the speed of the FUSE port. Given time that can be improved. No, the problem is that you really want ZFS to be a real filesystem that can be booted off. Mac OS X "Leopard" will also have this problem, as does the FreeBSD port (which is actually feature incomplete anyway). I'm not sure but I believe that some of the distros of OpenSolaris don't allow booting from a ZFS partition either. Hopefully these all get rectified.

With Sun's gradual move towards the GPL and its seeming interest specifically in GPLv3 I hope that we can see a native, GPL licensed, Linux port sometime soon. Even if ZFS doesn't excite the general computer-using public, its features would make life easier for everyone.

29 February 2008, 8:31 PM

Brendan:

I do believe that Nexenta (GNU/OpenSolaris) has native support for ZFS and booting from it, as it -is- Solaris' kernel, and not the Linux kernel, meaning both are under the CDDL.

29 February 2008, 8:44 PM