My name is Philipp C. Heckel and I write about nerdy things.

How-To: Using ZFS Encryption at Rest in OpenZFS (ZFS on Linux, ZFS on FreeBSD, …)




An upcoming feature of OpenZFS (and ZFS on Linux, ZFS on FreeBSD, …) is at-rest encryption, which lets you securely encrypt your ZFS file systems and volumes without an extra device-mapper layer such as dm-crypt. To give you a brief overview of what the feature can do, I thought I’d write a short post about it.

The current ZFS encryption implementation is not (yet) merged into the upstream repository (as of January 2017). There is a pretty big pull request that is still being reviewed, but because the feature is so incredibly cool (and because my colleague Tom Caputi at Datto developed it), I thought a sneak preview was absolutely necessary.




0. This post

This post demonstrates a feature that has not yet been released. For the demos I will focus on ZFS on Linux on an Ubuntu 16.04 based machine. The code I run is available in the official repos (links below) or in my forks (SPL & ZFS, use branch “blogpost” for both).

1. Introduction

At-rest encryption is a new feature in ZFS (zpool set feature@encryption=enabled <pool>) that will automatically encrypt almost all data written to disk using modern authenticated ciphers (AEAD) such as AES-CCM and AES-GCM.

The CLI makes it incredibly easy to enable encryption on a per dataset/volume basis (zfs create -o encryption=on <dataset>). The keys used for encryption can be inherited or are manually set for a dataset. Keys can be loaded from different sources (prompt or file) and various input formats are available (raw, hex or passphrase). Keys and key sources can be changed after the dataset/volume creation, and without re-encrypting the data (as they are never used directly).

The encryption parameters and key status of a dataset/volume are represented in various properties (encryption=<on|aes-128-gcm|...>, keysource=<raw|hex|passphrase>,<prompt|file>, keystatus=<none|available|unavailable>, pbkdf2iters=<n>).

Many normal ZFS commands are available even if the key of a dataset is not loaded, meaning that administrators can manage the pool without having to know the keys. For instance: a pool can be scrubbed (zpool scrub <pool>) without the keys, and datasets and snapshots can be listed (zfs list -rt). In future releases, zfs send and zfs recv will also work even if the key is not available.

Having built-in support for encryption at the file system level is huge. It means that you no longer have to use dm-crypt if you want to encrypt your data on disk, and you can still manage your pools even if keys are not loaded. Many thanks to Datto and Tom Caputi for bringing us this incredible feature.

1.1. What’s encrypted

All important pieces are encrypted (actual data and metadata, ACLs, permissions, directory listings, …), while some things are unencrypted to allow managing pools more easily.

Here’s a listing of what’s encrypted and what’s not:

Encrypted:
  • File data and metadata
  • ACLs, names, permissions, attrs
  • Directory listings
  • All Zvol data
  • FUID Mappings
  • Master encryption keys
  • All of the above in the L2ARC
  • All of the above in the ZIL

Not encrypted:
  • Dataset / snapshot names
  • Dataset properties
  • Pool layout
  • ZFS Structure
  • Dedup tables
  • Everything in RAM

1.2. Crypto Details

Note: This section describes the nitty gritty crypto details. You can safely skip it if you just want to use the feature.

Crypto concepts are always a bit hard to explain without confusing everyone. Tom has done an excellent job explaining the ZFS encryption crypto concept in his talk and it is visualized very nicely in his slides (PDF; or on Google Drive: original, mirror).

If you don’t have the time to watch the entire talk, let me try to summarize the concepts from one of his slides:

Normal / non-dedup case: Before the plaintext block data (or metadata) is written, it is encrypted using AES (in CCM or GCM mode, depending on the -o encryption=.. property) with a 128-, 192- or 256-bit encryption key (the default is aes-256-ccm). The 96-bit initialization vector (IV) used for CCM/GCM is randomly generated using the standard Linux PRNG, and it is never reused. The encryption key itself is derived from the master key (which is itself stored encrypted, see below) using the key derivation function HKDF. The 64-bit salt used for HKDF is randomly generated (using the above-mentioned PRNG) and stored with the encryption key in a volatile salt cache. The encryption key is reused (for performance reasons) until it goes stale.

Dedup case: If deduplication is enabled, the algorithm behaves slightly differently, because it has to produce the same ciphertext for the same plaintext (given the same master key). To achieve that, the salt and the IV are not randomly generated; instead, a 160-bit HMAC of the plaintext is used: the first 64 bits become the salt, and the remaining 96 bits become the IV. The 256-bit HMAC key is randomly generated (using the above-mentioned PRNG) and stored alongside the master key.
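To make the salt/IV split a bit more tangible, here is a rough shell sketch. It is an illustration only, not how ZFS does it internally: HMAC-SHA1 is used purely because it produces a 160-bit digest (the post does not say which hash ZFS uses), the HMAC key is a placeholder string rather than a random 256-bit key, and the IV length is the 96 bits given for CCM/GCM.

```shell
# Illustration only: derive a deterministic salt and IV from an HMAC of the plaintext.
# Assumptions: HMAC-SHA1 stands in for the real HMAC; hmac_key is a placeholder value.
hmac_key="placeholder-hmac-key"

# Print the 160-bit HMAC of $1 as 40 hex characters.
hmac_hex() {
    printf '%s' "$1" | openssl dgst -sha1 -hmac "$hmac_key" | awk '{print $NF}'
}

plaintext="the same plaintext block"
digest=$(hmac_hex "$plaintext")

salt=$(printf '%s' "$digest" | cut -c1-16)    # first 64 bits (16 hex characters)
iv=$(printf '%s' "$digest" | cut -c17-40)     # remaining 96 bits (24 hex characters)

echo "salt=$salt iv=$iv"
```

Running this twice with the same plaintext and key prints the same salt and IV, which is exactly the property deduplication needs (and also why it leaks whether two blocks are equal).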

Master key: The master key is randomly generated (using the above-mentioned PRNG) and is never exposed to the user directly. Instead, the master key is encrypted (using the same cipher and mode, with a 256-bit key) with a user-provided wrapping key. This wrapping key is provided via a file (as hex or raw, see the -o keysource=.. property) or via a passphrase prompt. If a passphrase is supplied by the user, the wrapping key is derived from it using the password-based key derivation function PBKDF2 (with 100k iterations by default, or whatever you specify in the -o pbkdf2iters=.. property).

If you want to know more, I highly suggest watching Tom Caputi’s ZFS encryption talk, or reviewing the slides (PDF; or on Google Drive: original, mirror).

2. Using ZFS encryption

Using the encryption feature is pretty simple. All the relevant commands and properties are described in great detail in the ZFS man page (man zfs), but here’s an excerpt of what you need to know.

2.1. Enabling the feature on the pool

Assuming you have installed a version of ZFS with encryption support (if not, follow the steps in section 3, “Compile and install”, at your own risk), you need to turn the feature on for your pool. I’ll create a test pool called testpool for this post:
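A minimal sketch of what that could look like, wrapped in a small function so the names are parameters. The file-backed pool, the image path and the 1G size are my own example choices for a throw-away setup; only the zpool set feature@encryption=enabled call comes from the text above. Run it as root.

```shell
# Sketch: create a throw-away, file-backed pool and enable the encryption feature.
# The backing file and its size are example values (assumptions), not from the post.
create_test_pool() {
    pool="$1"     # e.g. testpool
    image="$2"    # absolute path to a backing file, e.g. /tmp/testpool.img

    truncate -s 1G "$image" &&                          # sparse 1 GiB backing file
        zpool create "$pool" "$image" &&
        zpool set feature@encryption=enabled "$pool" &&
        zpool get feature@encryption "$pool"            # show the feature state
}
```

Usage: create_test_pool testpool /tmp/testpool.img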

2.2. Creating an encrypted dataset

Once you’ve enabled the feature on the pool, you can create encrypted datasets and volumes. To do that, you need to pass the two properties -o encryption=.. -o keysource=.. to the zfs create command. Depending on your preferences, you may also pass -o pbkdf2iters=..:

The -o encryption=.. property controls the ciphersuite (cipher, key length and mode). The default is aes-256-ccm, which is used if you specify -o encryption=on.

The -o keysource=.. property controls the format in which the encryption key is provided and where it is loaded from. The key can be formatted as raw bytes, as a hex representation or as a user passphrase. It can be read from a file, or provided via a user prompt that appears when you first create the dataset, when you mount it (zfs mount) or when you load the key manually (zfs key -l). Unless you want to automate things, -o keysource=passphrase,prompt seems like a good option.
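If you do want file-based keys, you need key material of the right size first. Here is a small sketch for generating it; the directory and file names are made up for this example, and the sizes assume a 256-bit cipher for the raw key and a 128-bit cipher for the hex key.

```shell
# Sketch: generate key material for the "raw" and "hex" key formats.
# File names are examples; the key size has to match the chosen cipher.
umask 077                                   # keep key files private
keydir=$(mktemp -d)

# 256-bit raw key: 32 random bytes, stored as-is
head -c 32 /dev/urandom > "$keydir/key.raw"

# 128-bit hex key: 16 random bytes, stored as 32 hex characters
head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n' > "$keydir/key.hex"

ls -l "$keydir"
```

Anyone who can read these files can unlock the dataset, so keep them off the pool they protect.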

The -o pbkdf2iters=.. property is only used if a passphrase is used (-o keysource=passphrase,..). It controls the iterations of PBKDF2. Higher is better as it slows down potential dictionary attacks on the password. The default is -o pbkdf2iters=100000.

Here are a few examples of how to create encrypted datasets and volumes (ZVOLs):

Creating an encrypted dataset, using the defaults:

Creating an encrypted child dataset, which inherits all parameters and keys from its parent:

Creating an encrypted dataset, using AES/GCM with 128-bit key loaded from a file (encoded as hex):

Creating an encrypted dataset, using AES/GCM with a 256-bit key loaded from a file (not encoded):

Creating an encrypted ZFS volume (ZVOL), using the defaults with one million PBKDF2 rounds:
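The five cases above could look roughly like the sketch below. The dataset names (enc1, enc2, …) and the 1G volume size are examples of mine. One loud assumption: I write file-based key sources as file:///absolute/path, which is not spelled out in this post, so check man zfs on your build. The key directory is assumed to hold a 32-character hex key in key.hex and a 32-byte raw key in key.raw. Run as root on a pool that has the feature enabled.

```shell
# Sketch of the five cases above, in the same order.
# Assumption: file-based key sources use the form file:///absolute/path.
create_demo_datasets() {
    pool="$1"      # e.g. testpool
    keydir="$2"    # absolute directory holding key.hex and key.raw

    # 1. Defaults (aes-256-ccm), passphrase entered at a prompt
    zfs create -o encryption=on -o keysource=passphrase,prompt "$pool/enc1" || return 1

    # 2. Child dataset: inherits encryption parameters and key from enc1
    zfs create "$pool/enc1/child" || return 1

    # 3. AES/GCM, 128-bit key read from a hex-encoded file
    zfs create -o encryption=aes-128-gcm \
        -o "keysource=hex,file://$keydir/key.hex" "$pool/enc2" || return 1

    # 4. AES/GCM, 256-bit key read from a raw (unencoded) file
    zfs create -o encryption=aes-256-gcm \
        -o "keysource=raw,file://$keydir/key.raw" "$pool/enc3" || return 1

    # 5. Encrypted volume (ZVOL) with one million PBKDF2 iterations
    zfs create -V 1G -o encryption=on -o keysource=passphrase,prompt \
        -o pbkdf2iters=1000000 "$pool/vol1"
}
```

Usage: create_demo_datasets testpool /root/keys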

2.3. Reading the encryption properties

Once you’ve created a dataset or volume, you can query its encryption properties like you normally would:
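For example, a tiny helper along these lines (the function name is hypothetical; the property names are the ones listed in the introduction):

```shell
# Sketch: show the encryption-related properties of a dataset or volume.
show_encryption_props() {
    zfs get encryption,keysource,keystatus,pbkdf2iters "$1"
}
```

Usage: show_encryption_props testpool/enc1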

2.4. Importing a pool, mounting datasets and loading keys

If a dataset is encrypted, the read-only property keystatus represents the status of the key, and thereby also whether the dataset can be used (mounted, written to, read from, …). It is one of none (unencrypted dataset), available (the key is loaded) or unavailable (the key is not loaded).

When a pool is imported using zpool import, encrypted datasets are left unmounted, because their keys are not automatically loaded. Only if the -l option is passed will the keys of encrypted datasets be loaded (if they can be):

Instead of using zpool import -l ..., you can manually load the keys for individual datasets and volumes using zfs key -l:

If zfs mount is called on an encrypted dataset with unavailable key, it will prompt you:

Unloading a key of a mounted dataset won’t work, because it’s still in use. The dataset has to be unmounted first:
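The import, load, mount and unload steps from this section can be sketched as three small helpers. The function names are hypothetical. Also note that zfs key -u for unloading a key is an assumption on my part, since only zfs key -l is named above; check man zfs on your build. Run as root.

```shell
# Sketch: import a pool with keys, and lock/unlock a single dataset.
# Assumption: "zfs key -u" unloads a key (only "zfs key -l" is named in the text).

import_with_keys() {
    zpool import -l "$1"    # -l: also load the keys of encrypted datasets
}

unlock_dataset() {
    zfs key -l "$1" &&      # prompts if the key source is "prompt"
        zfs mount "$1"
}

lock_dataset() {
    zfs unmount "$1" &&     # a key cannot be unloaded while the dataset is mounted
        zfs key -u "$1"
}
```

Usage: import_with_keys testpool, then unlock_dataset testpool/enc1 and later lock_dataset testpool/enc1. Because lock_dataset chains the two commands with &&, the key is only unloaded if the unmount succeeded.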

That’s essentially all the magic. If you want to know more, I suggest reading the ZFS man page (man zfs).

3. Compile and install

If you want to try the current implementation (before it is released), here are a few steps to compile and install it yourself on an Ubuntu 16.04-based system. Other systems will be very similar, but not identical. You can consult the Building ZFS wiki page for details.

Warning: Please be sure to only perform these steps on a test machine or a throw-away VM, because this will replace the ZFS kernel modules.

First, install all the build dependencies:

Once that’s done, compile and install SPL using the steps below. All SPL pull requests relevant to ZFS encryption have been merged upstream (as of January 2017), so this should “just work”. If it doesn’t (e.g. because the code has changed; you may be reading this in the future …), you may want to use my forked version instead (see SPL and ZFS, use branch “blogpost” for both):

Next, compile and install ZFS using Tom’s fork. If the ZFS encryption pull request has been merged, you may just want to use the upstream master branch:
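Both SPL and ZFS use the usual autotools sequence, so the build can be sketched as one helper that you call once per source tree (SPL first, then ZFS). This assumes you have already cloned the repositories; the function name and the directory names in the usage line are examples, and depending on your setup ./configure may need extra options.

```shell
# Sketch: build and install a checked-out SPL or ZFS source tree.
# Assumes the build dependencies are installed and sudo is available.
build_and_install() {
    srcdir="$1"
    (
        cd "$srcdir" || exit 1
        ./autogen.sh &&
            ./configure &&
            make -j"$(nproc)" &&    # build first, do not skip this step
            sudo make install
    )
}
```

Usage: build_and_install ./spl && build_and_install ./zfs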

Now the ZFS modules should be built in /lib/modules/$(uname -r)/extra, so all you have to do is load them. Be sure to remove the old modules first:

If that succeeded, be sure to yell “hooray” before you move on, because these steps took me a while to get right when I did it for the first time.

You can now use the feature (as described above).

4. FAQ

4.1. Is deduplication supported?

Yes, it is supported. However, there is some information that is leaked due to the nature of deduplication. See more in Tom’s talk.

4.2. Can I change the password? Will data be re-encrypted?

Yes, the password can be changed. The data does not get re-encrypted, because the password is only used to derive the wrapping key that protects the master key. Changing it re-encrypts the master key, not the data.

4.3. Does it work with TPM?

No, not yet.

One Comment

  1. pp

    1. Thanks for the guide – however you forgot a ‘make’ in the installation instructions for ZFS, right before ‘make install’
    2. Have you already tried out booting from an encryption-enabled pool with grub? Not sure if I’m doing anything wrong, but grub refuses to detect ZFS as my root filesystem :(

