ZFS: what "the ultimate file system" really means for your desktop -- in plain English!
So, Sun's ZFS file system has garnered publicity recently with the announcement of its inclusion in Mac OS X and, more recently, as a module for the Linux kernel. But if you don't read Filesystems Weekly, what is it and what does it mean for you?
Now I may just be showing my geek side a bit here, but file systems are awesome. Aside from the fact our machines would be nothing without them, the science behind them is frequently ingenious.
And ZFS (the Zettabyte File System) is no different. It has quite an extensive feature set just like its peers, but builds on this by adding a new layer of simplicity. According to the official site, ZFS's key features are (my summary):
- Pooled storage -- eliminates the concept of partitions and volumes, everything comes from the same pool regardless of the number of physical disks. Additionally, "combined I/O bandwidth of all devices in the pool is available to all file systems at all times."
- All operations are copy-on-write transactions -- this is a tad beyond me, but the key phrase here is that there is no need to ever fsck (chkdsk for you Windows people, or Disk Utility repair for Mac heads) the file system.
- Disk scrubbing -- live online checking of blocks against their checksums, and self-repair of any problems.
- Pipelined I/O -- I'll let the website explain: "The pipeline operates on I/O dependency graphs and provides scoreboarding, priority, deadline scheduling, out-of-order issue and I/O aggregation." Or in short, it handles heavy loads well.
- Snapshots -- A 'read-only point-in-time copy of a filesystem', with backups being driven using snapshots to allow, rather impressively, not just full imaging but incremental backups from multiple snapshots. Easy and space efficient.
- Clones -- clones are described as 'writable copies of a snapshot' and could be used as "an extremely space-efficient way to store many copies of mostly-shared data", which sounds rather like clones referencing snapshots and storing only the data changed from the snapshot. On a network with imaged desktops, for example, this would mean having only one original snapshot image and dozens of clones that contain only modified data, which would indeed be extremely space efficient compared to storing dozens of full copies.
- Built-in compression -- granted, this is available for most other filesystems, usually as an addon, but it's a good point raised that compression can save not only space but also I/O time.
- Simple administration -- one of the biggest selling points. This is quite important because some of the above features, such as pooling, can indirectly be achieved now with other filesystems -- layering LVM, md RAID and Ext3, for example. However, each of these has an entirely different set of commands and tools to set up and administer the devices -- what ZFS promises is that these layers can all be handled from the one toolset, and in a simpler manner, as this PDF tutorial demonstrates.
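To make the copy-on-write, snapshot, clone and scrubbing ideas a bit more concrete, here's a toy model in Python. To be clear, this is my own sketch of the general idea and not how ZFS is actually implemented: the `Pool`, `Dataset` and `clone` names, the use of SHA-256 and the dictionary-based "disk" are all assumptions made purely for illustration.

```python
import hashlib
from types import MappingProxyType


class Pool:
    """Toy storage pool: blocks are only ever added, never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes
        self.checksums = {}  # block id -> SHA-256 hex digest recorded at write time
        self._next_id = 0

    def allocate(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()
        return block_id

    def scrub(self):
        """Return ids of blocks whose contents no longer match their checksum."""
        return [
            block_id
            for block_id, data in self.blocks.items()
            if hashlib.sha256(data).hexdigest() != self.checksums[block_id]
        ]


class Dataset:
    """A file system view: a table mapping file names to block ids in the pool."""

    def __init__(self, pool, table=None):
        self.pool = pool
        self.table = dict(table or {})

    def write(self, name, data):
        # Copy-on-write: the new data goes to a fresh block and the pointer is
        # switched over; the old block is left untouched for any snapshot.
        self.table[name] = self.pool.allocate(data)

    def read(self, name):
        return self.pool.blocks[self.table[name]]

    def snapshot(self):
        # A snapshot is a frozen copy of the pointers, not of the data.
        return MappingProxyType(dict(self.table))


def clone(pool, snap):
    """A clone is a writable dataset that starts out sharing a snapshot's blocks."""
    return Dataset(pool, snap)


# One "golden" desktop image made of three blocks.
pool = Pool()
image = Dataset(pool)
for name in ("kernel", "apps", "settings"):
    image.write(name, f"original {name}".encode())

# Three desktops cloned from one snapshot, each changing only its settings.
golden = image.snapshot()
desktops = [clone(pool, golden) for _ in range(3)]
for i, desktop in enumerate(desktops):
    desktop.write("settings", f"settings for desktop {i}".encode())

print(len(pool.blocks))            # blocks actually stored in the pool
print(desktops[0].read("kernel"))  # still the block shared with the snapshot
print(pool.scrub())                # nothing damaged yet

pool.blocks[golden["apps"]] = b"bit rot"  # simulate silent corruption on disk
print(pool.scrub())                       # the scrub flags the damaged block
```

In this toy the pool ends up holding 6 blocks (3 original plus 1 changed block per desktop) rather than the 9 that three full copies would need. Note the scrub here only detects the damaged block; actually repairing it would need a second good copy to restore from, which this sketch leaves out.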
All up, as a geek, it's an exciting file system I'd love to play with. Currently, however, ZFS is part of Sun's Solaris and is released under the CDDL (Common Development and Distribution License), which is based on the MPL (Mozilla Public License). As the CDDL is incompatible with the GPLv2, the code can't be merged into the Linux kernel itself. This has recently been worked around by porting it across as a FUSE module, but being userspace it is slow, though there is hope this will improve. Looks like it's time to enable FUSE support in my kernel!
Of course, in a few months' time you could also go for Mac OS X where, in Leopard, ZFS is already supported, and there are rumours Apple may be preparing to adopt it as the default filesystem in the future, replacing the aging HFS+ (but probably not in 10.5).
But then you'd have to use a Mac. Ewww.
(I jest I jest, put down your pitchforks Macites!)
ZFS in Leopard: A popular screenshot from a development release