The Zettabyte File System (ZFS) is coming to Mac OS X – what is it?

Since Mac OS 8.1 (nine years ago), Apple's operating system has run on the HFS+ file system, which is in turn based on the 22-year-old HFS. We may soon see a major upgrade with the introduction of the Zettabyte File System (ZFS). ZFS is powerful for a number of reasons, and it could make a huge difference to the user experience.

ZFS is a 128-bit file system, which means it can store 18 billion billion (18.4 × 10^18) times more data than current 64-bit systems. Its limits are designed to be so large that they will never be encountered in practice. As an example of how large these numbers are: if 1,000 files were created every second, it would take about 9,000 years to reach the limit on the number of files. As project leader Bonwick said:

“Populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn’t fill a 128-bit storage pool without boiling the oceans.” [Seth Lloyd, “Ultimate physical limits of computation”, Nature 406, 1047-1054 (2000)]

There are, however, a number of other notable features:

Pooled storage
ZFS can span a file system seamlessly across multiple disks, and more can be added at any time. A new hard disk simply joins the pool, which can add redundancy and increase performance by spreading I/O across multiple disks. It also improves the UX because users don’t have to worry about volumes; they just have storage.
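To make the pooling idea concrete, here is a toy model in Python. It is only a sketch of the concept and not how ZFS actually allocates blocks: the `ToyPool` class, its round-robin placement and its method names are all invented for illustration, and it does not handle overwrites or deletes.

```python
class ToyPool:
    """Toy storage pool: one namespace backed by any number of disks."""

    def __init__(self, disk_sizes):
        # Each disk is a dict holding a capacity (in blocks) and its stored blocks.
        self.disks = [{"capacity": n, "blocks": {}} for n in disk_sizes]
        self.index = {}      # logical block id -> (disk number, slot)
        self.next_disk = 0   # where the next round-robin write starts

    def add_disk(self, size):
        """Grow the pool: callers see more space, not a new volume."""
        self.disks.append({"capacity": size, "blocks": {}})

    def free_blocks(self):
        return sum(d["capacity"] - len(d["blocks"]) for d in self.disks)

    def write(self, block_id, data):
        # Try each disk in turn, starting after the last one used,
        # so that writes are spread across the disks that still have room.
        for offset in range(len(self.disks)):
            n = (self.next_disk + offset) % len(self.disks)
            disk = self.disks[n]
            if len(disk["blocks"]) < disk["capacity"]:
                slot = len(disk["blocks"])
                disk["blocks"][slot] = data
                self.index[block_id] = (n, slot)
                self.next_disk = (n + 1) % len(self.disks)
                return n  # disk number the block landed on
        raise OSError("pool is full")

    def read(self, block_id):
        n, slot = self.index[block_id]
        return self.disks[n]["blocks"][slot]


pool = ToyPool([2, 2])
placements = [pool.write(i, f"data{i}".encode()) for i in range(4)]
free_when_full = pool.free_blocks()
pool.add_disk(4)                      # no new volume, just more space
new_placement = pool.write(4, b"more")
```

The caller only ever talks to `pool`; which disk a block lands on is an internal detail, which is the point of the "they just have storage" argument.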

Stability and data integrity
ZFS provides three core components to its data integrity model:

  • Everything is copy-on-write which means live data is never overwritten
  • Everything is transactional – sets of changes either succeed or fail as a whole
  • Everything is checksummed – preventing silent data corruption
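The first two points in the list above can be sketched together in a few lines of Python. This is a toy under stated assumptions, not ZFS's on-disk design: `ToyCowStore` and its methods are hypothetical names, and a plain dict stands in for the tree of block pointers.

```python
class ToyCowStore:
    """Toy copy-on-write store with all-or-nothing commits."""

    def __init__(self):
        self.blocks = {}    # address -> data; a block is never modified once written
        self.root = {}      # live view: file name -> block address
        self.next_addr = 0

    def _alloc(self, data):
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        return addr

    def commit(self, changes):
        """Apply a dict of {name: bytes} as a single transaction."""
        # Validate everything first, so a bad change aborts before anything is written.
        for name, data in changes.items():
            if not isinstance(data, bytes):
                raise TypeError(f"{name}: data must be bytes")
        new_root = dict(self.root)              # work on a copy of the root
        for name, data in changes.items():
            new_root[name] = self._alloc(data)  # new block; the old one is untouched
        self.root = new_root                    # one switch makes all changes visible

    def read(self, name):
        return self.blocks[self.root[name]]


store = ToyCowStore()
store.commit({"a.txt": b"v1"})
old_addr = store.root["a.txt"]
store.commit({"a.txt": b"v2"})        # written to a new block, not over the old one

try:
    store.commit({"a.txt": b"v3", "b.txt": "not bytes"})  # one bad change
except TypeError:
    pass                              # the whole set was rejected
```

After the failed commit, `a.txt` still reads as `b"v2"` and `b.txt` does not exist, while the block that held `b"v1"` is still intact at its old address.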

All this results in an incredibly robust file system. During Sun’s tests [pdf] it was subjected to over a million forced, violent crashes without losing data integrity or leaking a single block.

The use of checksums on all data and metadata allows for ‘self healing’: ZFS detects silent data corruption before passing the data off to the process that asked for it, and repairs it using the data from the other mirror.

ZFS self healing
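The self-healing read can be sketched as follows. This is a simplified illustration, not ZFS code: `ToyMirror` is a made-up class, two dicts stand in for the mirrored disks, and SHA-256 is used simply because it is in Python's standard library.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Toy two-way mirror that verifies and repairs blocks on read."""

    def __init__(self):
        self.copies = [{}, {}]  # two "disks" holding the same blocks
        self.sums = {}          # block id -> expected checksum, kept apart from the data

    def write(self, block_id, data):
        self.sums[block_id] = checksum(data)
        for disk in self.copies:
            disk[block_id] = data

    def read(self, block_id):
        expected = self.sums[block_id]
        for disk in self.copies:
            data = disk[block_id]
            if checksum(data) == expected:
                # Good copy found: rewrite any copy that fails its checksum
                # before handing the data back to the caller.
                for other in self.copies:
                    if checksum(other[block_id]) != expected:
                        other[block_id] = data
                return data
        raise IOError("all copies are corrupt")


mirror = ToyMirror()
mirror.write("block-7", b"hello")
mirror.copies[0]["block-7"] = b"hellp"   # simulate silent corruption on one disk
result = mirror.read("block-7")          # returns the good copy and repairs the bad one
```

The caller never sees the damaged bytes: the read returns the verified copy, and the corrupted disk is overwritten with it as a side effect.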

Snapshots

A snapshot is a copy of the entire file system at a point in time. Snapshots are not the same as backups; the two most significant differences are efficiency and speed.

A snapshot only stores the individual disk blocks that have changed, which means that it uses far less disk space than a traditional backup. Snapshots also happen almost instantaneously regardless of the size of the file system; indeed, the time it takes to create a snapshot is often so small that there appears to be no delay.
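Here is a small Python sketch of why a snapshot is so cheap when blocks are never overwritten. It is an illustration only: `ToySnapshotFS` is a hypothetical class, and it copies a small name-to-block table where a real file system would, as far as I understand it, simply keep a reference to the old root of its block tree.

```python
class ToySnapshotFS:
    """Toy file system where a snapshot shares blocks with the live view."""

    def __init__(self):
        self.blocks = {}     # address -> data, written once and never changed
        self.live = {}       # file name -> block address
        self.snapshots = {}  # snapshot label -> frozen name-to-address table
        self.next_addr = 0

    def write(self, name, data):
        # Every write goes to a fresh block; old blocks stay where they are.
        self.blocks[self.next_addr] = data
        self.live[name] = self.next_addr
        self.next_addr += 1

    def snapshot(self, label):
        # Copies only the pointer table, never the data blocks themselves.
        self.snapshots[label] = dict(self.live)

    def read(self, name, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]

    def extra_blocks(self, label):
        """Blocks kept alive only by the snapshot (no longer in the live view)."""
        return set(self.snapshots[label].values()) - set(self.live.values())


fs = ToySnapshotFS()
fs.write("a", b"1")
fs.write("b", b"2")
fs.snapshot("monday")
fs.write("a", b"1-edited")   # only this changed block costs extra space
```

Reading `a` from the live view gives the edited data, reading it from the `monday` snapshot gives the original, and the unchanged file `b` is stored exactly once and shared by both.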

So what might this all mean?

Beyond the obvious benefits related to performance and data integrity there may also be important UX considerations.

I’ve written previously about the issues of the two-copy file system. ZFS’s use of snapshots would mean very little performance or storage overhead in automatically versioning data. Apple could therefore remove the Save dialogue box from much of the UI: files could be saved safely and automatically in the background, with old versions retrieved via Time Machine as needed, thereby removing the need for explicit saves and hiding more of the file system from the user.
