Encrypt Your Data

The revelation of widespread government snooping has sparked a renewed interest in data storage security via encryption. In this article, we review some options for encrypting files, directories, and filesystems on Linux.

When I have written about encrypting data in the past, the topic didn’t generate a great deal of long-term interest. Rightly or wrongly, Edward Snowden has brought the issue of data and data transmission security to the forefront. I argue that this topic should always be under discussion because data security, including encryption, is important for everyone, even if the only system involved is your laptop. Now, many people are concerned about the US government getting access to their data and want to prevent this, or at the very least slow it down.

Personally, I don’t want the US government or anyone else learning that I like funny videos on YouTube and really old and really bad science fiction movies, or that my musical tastes run all over the map, or that I do all sorts of technical writing and have some definite opinions about things. No one should have access to that information unless I want to give it to them. I do not have a “cabin-in-the-woods” type of paranoia, but I see no reason for anyone to have access to this information unless I control when and to whom it is given.

Regardless of why you don’t want your digital life violated, I think that paying attention to your data and even encrypting it is important. Therefore, in this article, I want to review and update some ways to encrypt data. I’ll primarily be sticking to Linux filesystems, but some of these techniques can be used on Windows systems as well.

Encryption/Decryption

The whole concept of cryptology (hiding information or encrypting and decrypting it) is ancient. The battle continues between people who send data or information securely to the intended recipient and people who gain access to that information and try to break the encryption (decrypt it). There are literally hundreds of books on the subject (a quick Amazon search turned up 3,998 results), and it is under constant research. Although not an authoritative source of information, one possible place to start reading is Wikipedia’s overview of cryptography and its very brief article on encryption. Otherwise, Google is your friend.

If you want to learn a little more about encryption, there is a reasonable introduction that talks at a very high level about how encryption works. As a very simple example, it shows how to do what is called a substitution cipher. The classic example, for readers of this article who remember Usenet, is ROT13, which replaces each letter with the letter 13 places further along the alphabet. This is a really simple example of encryption (but it is not strong encryption).
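
To make the substitution idea concrete, here is a minimal sketch of ROT13 using the standard tr utility. The function name rot13 is my own choice for illustration. Because the alphabet has 26 letters, applying the function twice returns the original text, which is also why ROT13 offers no real protection.

```shell
# ROT13: map each letter to the one 13 places later, wrapping around.
# Non-letters (spaces, punctuation, digits) pass through unchanged.
rot13() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

# Encode once, then encode the result again to get the original back.
printf '%s\n' 'Hello, World' | rot13          # prints: Uryyb, Jbeyq
printf '%s\n' 'Hello, World' | rot13 | rot13  # prints: Hello, World
```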

Software, Hardware, or Both?

Fundamentally, you really have two options for encrypting data: (1) hardware based and (2) software based. Although you could use both options in combination, that might be considered overkill (then again, in the current climate, maybe not).

Among the hardware options available, the one I want to mention is the Self-Encrypting Drive (SED). The concept is simple: Take an ordinary drive, add an encryption/decryption processor to it, add authentication to the firmware, and you have a SED. This approach has several benefits:

  • Encryption is always on, so it will affect data at rest (i.e., data stored on the drive).
  • Authentication is independent of the operating system (OS).
  • There are no encryption keys to manage (vendors use standard interfaces such as the BIOS or a software-based component that happens before the OS boots).
  • The encryption keys never have to leave the drive.
  • Relative to a non-SED, you will see no loss in performance with a SED.

Typically the encryption keys are 128- or 256-bit Advanced Encryption Standard (AES) keys, which evidently is fairly strong encryption. (My apologies, but I’m not in a position to judge the quality of an encryption algorithm.)

Managing SEDs can be a little more difficult than non-SEDs because, when the system boots, you need to authenticate so the drives can then be used. In the case of a large distributed system, this can be a little cumbersome if the systems restart or boot regularly. Some vendors keep the keys on an out-of-band device so the drives can contact the device for them. Then, however, you have to ensure the keys on the device are encrypted as well.

SEDs have some vulnerabilities, but most involve having physical access to the drives. Again, I’m not a security person, so I cannot judge the level of protection – I can only state what technologies are available.

Software-based approaches basically have three options for encrypting your data on a Linux system. The options are (1) encrypting a single file, (2) encrypting a directory (with or without a virtual disk), or (3) encrypting a physical block device.

Encrypting files is fairly straightforward and various tools are available to do this. For example, bcrypt, NCrypt, and 7-Zip compress and encrypt files using 256-bit AES. The most popular tool is probably GnuPG, which comes with just about every Linux distribution. Note that all of these tools encrypt data once the system has booted and the OS is operating. If the system has been compromised, then your encryption may be pointless because the attacker can “sniff,” or log, your passcodes and decrypt your files.
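
As a rough sketch of single-file encryption with GnuPG, the two helper functions below encrypt and decrypt a file with a passphrase using AES-256. The function names and the idea of reading the passphrase from a file are my own assumptions for illustration, and the --pinentry-mode option assumes GnuPG 2.1 or later. Keeping a passphrase in a file is only reasonable for scripted use on a system you trust.

```shell
# Encrypt one file symmetrically (passphrase-based) with AES-256.
# $1 = input file, $2 = output file, $3 = file holding the passphrase
gpg_encrypt_file() {
    gpg --batch --yes --pinentry-mode loopback \
        --passphrase-file "$3" \
        --cipher-algo AES256 \
        --output "$2" --symmetric "$1"
}

# Decrypt a file produced by gpg_encrypt_file.
# $1 = encrypted file, $2 = output file, $3 = file holding the passphrase
gpg_decrypt_file() {
    gpg --batch --yes --pinentry-mode loopback \
        --passphrase-file "$3" \
        --output "$2" --decrypt "$1"
}

# Hypothetical usage:
#   gpg_encrypt_file notes.txt notes.txt.gpg ~/.notes-passphrase
#   gpg_decrypt_file notes.txt.gpg notes.txt ~/.notes-passphrase
```

For interactive use, running gpg --symmetric --cipher-algo AES256 notes.txt prompts for the passphrase and writes notes.txt.gpg next to the original.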

In this article, I focus on encrypting directories and filesystems. Several ways to encrypt filesystems or partitions are at hand; consequently, this article isn’t intended to be an exhaustive listing of options or a how-to on the various options. Rather, it’s intended to whet your appetite, so you can explore the details of the various options yourself. As with all new topics around data and storage, be sure to back up your data before trying something new.

Encrypting Directories or Filesystems

The process of encrypting a directory tree or a filesystem does not focus on the underlying block device(s). This approach is good if you only want to encrypt certain portions of your tree, such as everything in /home or /home/laytonjb/Music (so no one can see my David Hasselhoff music files). This means you also don’t encrypt the OS filesystem, which is a reasonable place to start with data encryption.

Wikipedia is perhaps not the most authoritative source of information about cryptography or encrypted filesystems, but you can find a simple list of encryption filesystems there that, although incomplete, provides a starting point.

I divide filesystem encryption options into two parts. First, I discuss “stacked” filesystems, sometimes also called meta-filesystems. Stacked filesystems are the typical target for software encryption of directories or filesystems.

Stacked Encrypted Filesystems

A stacked filesystem comprises filesystems built on top of other filesystems. Gluster and Lustre are two examples of stacked filesystems that quickly spring to mind. In the case of data encryption, the encrypted filesystem uses a “lower” type of filesystem (FS), such as XFS or ext4, to store encrypted data. To a user, the data appears to be gibberish, meaning it is encrypted. You have to “mount” the encrypted filesystem to be able to decrypt the data so that it makes sense.

Two prominent stackable Linux filesystems are EncFS and eCryptfs. I’ll start with EncFS because it can be mounted by a user, giving it enormous flexibility.

EncFS

EncFS is an encrypted virtual filesystem (VFS). It has been around for a while, but its last update appears to have been in 2010. I’m not sure if it’s still under active development, but I know people still use it. EncFS encrypts the data a file at a time and then stores it on a typical Linux filesystem such as XFS or ext4. The author of EncFS refers to it as a “pass-through” filesystem rather than an encrypted block device (such as a SED). I just think of it as a stackable filesystem or a meta-filesystem.

EncFS is a FUSE-based userspace filesystem. The FUSE kernel module allows access to the VFS in the kernel. Consequently, you can create a filesystem entirely in user space using the FUSE API.

Some advantages of using the stackable filesystem approach include the ability to encrypt a directory tree or even an entire filesystem. Plus, you can just keep adding data to the encrypted filesystem until you run out of space on the underlying FS. Moreover, because EncFS deals with files and directories, you can keep using your normal backup tools. If the files change, the backup tool detects that and does the appropriate backup, but the backup just shows encrypted files (gibberish).

The requirements for EncFS are pretty simple:

  • FUSE version 2.6 or newer.
  • rlog – C++ logging library.
  • OpenSSL versions 0.9.6 through 0.9.8 (other versions are untested).
  • boost – a C++ utility library, version 1.34 or later.

After ensuring your system fulfills these requirements, just download and build EncFS, or you can use your package management tool to install these dependencies and EncFS.

Once everything is installed, you should run two quick commands as root so you can use EncFS as a user:

# usermod -a -G fuse laytonjb
# chmod +x /usr/bin/fusermount
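
If you want to confirm that these two steps took effect, a small check such as the following can be run as the user. This is only a sketch: the function names are hypothetical, and it assumes your distribution uses a fuse group and installs fusermount in /usr/bin, as in the commands above. Remember that a new group membership normally applies only after the user logs in again.

```shell
# Return success if user $1 is a member of group $2.
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Return success if the fusermount helper exists and is executable.
# $1 = path to check (defaults to /usr/bin/fusermount)
fusermount_ready() {
    [ -x "${1:-/usr/bin/fusermount}" ]
}

if user_in_group "$(id -un)" fuse && fusermount_ready; then
    echo "EncFS prerequisites look fine for $(id -un)"
else
    echo "Ask root to run the usermod and chmod commands first"
fi
```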

The next step is to create two directories in your account to contain the encrypted and decrypted data.

$ mkdir -p encrypted
$ mkdir -p decrypted

The first directory will contain the data in encrypted form, and the second directory is where you put data to be encrypted (i.e., non-encrypted data). With these two directories, you can run the encfs command to “mount” the EncFS filesystem:

$ encfs ~/encrypted ~/decrypted
Creating new encrypted volume.
Please choose from one of the following options:
 enter "x" for expert configuration mode,
 enter "p" for pre-configured paranoia mode,
 anything else, or an empty line will select standard mode.
?> 
 
Standard configuration selected.
 
Configuration finished.  The filesystem to be created has
the following properties:
Filesystem cipher: "ssl/aes", version 3:0:2
Filename encoding: "nameio/block", version 3:0:1
Key Size: 192 bits
Block Size: 1024 bytes
Each file contains 8 byte header with unique IV data.
Filenames encoded using IV chaining mode.
File holes passed through to ciphertext.
 
Now you will need to enter a password for your filesystem.
You will need to remember this password, as there is absolutely
no recovery mechanism.  However, the password can be changed
later using encfsctl.
 
New Encfs Password: 
Verify Encfs Password: 

I used the default options (by pressing Enter). Notice that I also entered a passphrase, which I will need whenever I mount the encrypted filesystem. Next, I can check to see whether the filesystem is mounted.

$ mount
/dev/sda2 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext2 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
encfs on /home/laytonjb/decrypted type fuse.encfs (rw,nosuid,nodev,default_permissions,user=laytonjb)
 
$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              53G   35G   15G  70% /
tmpfs                 3.6G  536K  3.6G   1% /dev/shm
/dev/sda1             485M   73M  387M  16% /boot
encfs                  53G   35G   15G  70% /home/laytonjb/decrypted

The EncFS filesystem is there. Notice that I did all of this as a user – no root access is required. This means that as a user I can control what data is encrypted.

I don’t want to get into a tutorial on EncFS, but I’ll show how it works because other encrypted filesystems are similar. First, I’ll create a couple of files and a symlink in the decrypted filesystem (~/decrypted).

[laytonjb@test1 ~]$ cd decrypted
[laytonjb@test1 decrypted]$ echo "hello foo" > foo
[laytonjb@test1 decrypted]$ echo "hello bar" > bar
[laytonjb@test1 decrypted]$ ln -s foo foo2
[laytonjb@test1 decrypted]$ ls -l
total 8
-rw-rw-r-- 1 laytonjb laytonjb 10 Sep  4 15:12 bar
-rw-rw-r-- 1 laytonjb laytonjb 10 Sep  4 15:12 foo
lrwxrwxrwx 1 laytonjb laytonjb  3 Sep  4 15:12 foo2 -> foo

The results of ls -l tell me that I have access to the decrypted data. If I change directories to the encrypted directory and try a similar command, I see the following:

[laytonjb@test1 decrypted]$ cd ~/encrypted
[laytonjb@test1 encrypted]$ ls -l
total 8
lrwxrwxrwx 1 laytonjb laytonjb 24 Sep  4 15:12 acS5u3K9TJ,9FWTDUq0yWqx6 -> XuD50Mah2kp2vukDeo04cOv,
-rw-rw-r-- 1 laytonjb laytonjb 18 Sep  4 15:12 WvPjlWtCaq5g9hE1NHMI3lfi
-rw-rw-r-- 1 laytonjb laytonjb 18 Sep  4 15:12 XuD50Mah2kp2vukDeo04cOv,

Although the files are there, the names and data are encrypted and consequently unintelligible.

Next I’ll unmount EncFS and look at the files in ~/encrypted and ~/decrypted:

[laytonjb@test1 ~]$ fusermount -u ~/decrypted
[laytonjb@test1 ~]$ mount
/dev/sda2 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext2 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
[laytonjb@test1 ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              53G   35G   15G  70% /
tmpfs                 3.6G  536K  3.6G   1% /dev/shm
/dev/sda1             485M   73M  387M  16% /boot
[laytonjb@test1 ~]$ cd decrypted
[laytonjb@test1 decrypted]$ ls -l
total 0
[laytonjb@test1 decrypted]$ cd ../encrypted
[laytonjb@test1 encrypted]$ ls -l
total 8
lrwxrwxrwx 1 laytonjb laytonjb 24 Sep  4 15:12 acS5u3K9TJ,9FWTDUq0yWqx6 -> XuD50Mah2kp2vukDeo04cOv,
-rw-rw-r-- 1 laytonjb laytonjb 18 Sep  4 15:12 WvPjlWtCaq5g9hE1NHMI3lfi
-rw-rw-r-- 1 laytonjb laytonjb 18 Sep  4 15:12 XuD50Mah2kp2vukDeo04cOv,

In the first line I used the command fusermount -u to unmount the filesystem because it’s a userspace filesystem. Also notice that the decrypted part of the filesystem is not visible anymore (i.e., the files aren’t there), but the encrypted directory still has the encrypted files. When EncFS is used again with the same decrypted and encrypted directories, it will automatically decrypt the files and allow you to access them in the ~/decrypted directory.
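
One practical consequence is that, once EncFS is unmounted, ~/decrypted is just an ordinary empty directory, so anything written there is stored in plain text on the lower filesystem. A small guard like the following sketch can prevent that mistake. The function names are hypothetical, and it assumes the mountpoint utility from util-linux is installed.

```shell
# Return success only if directory $1 is currently a mountpoint.
is_mounted() {
    mountpoint -q "$1"
}

# Copy file $1 into directory $2, but refuse if $2 is not a mountpoint,
# so data never lands unencrypted in the empty directory.
safe_copy() {
    if is_mounted "$2"; then
        cp -- "$1" "$2"/
    else
        echo "refusing to write: $2 is not mounted" >&2
        return 1
    fi
}

# Hypothetical usage:
#   safe_copy report.txt ~/decrypted
```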

The first key point about using EncFS is that a user can use it without administrator intervention. I believe this is a strength of EncFS, because it allows each user to control what they do and don’t want to encrypt. Theoretically this reduces the number of people who are in the middle of encrypting data.

The second thing to notice is that when EncFS is mounted, you can view the decrypted data in the decrypted directory. If the system is compromised while EncFS is mounted, an intruder can access your data simply by going to the ~/decrypted directory. EncFS does not protect mounted data from someone who already has access to the running system, so you should be aware that the data can be “seen” whenever the filesystem is mounted.

The third thing to notice is that although the file name and data are encrypted, some of the basic metadata information is not. Someone can still see that you have a certain number of files and that you own them. They can also see file permissions. This information might not seem valuable, but it could be useful in some situations. This is one of the limitations of EncFS.
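
File sizes leak as well. In the listings above, the 10-byte files foo and bar appear as 18-byte files in ~/encrypted, which is consistent with the 8-byte per-file header that encfs reported when the volume was created. The sketch below recovers the plaintext size from the size of an encrypted file. It assumes the standard-mode configuration shown earlier, and the function name is my own.

```shell
# Per-file header size reported by encfs for the configuration shown earlier.
ENCFS_HEADER_BYTES=8

# Print the plaintext size implied by the size of encrypted file $1.
implied_plain_size() {
    enc_size=$(wc -c < "$1")
    echo $((enc_size - ENCFS_HEADER_BYTES))
}

# Hypothetical usage, with a file name taken from the listing above:
#   implied_plain_size ~/encrypted/WvPjlWtCaq5g9hE1NHMI3lfi
```

Other EncFS configurations (expert or paranoia mode, for example) may add further per-block overhead, so the constant would need adjusting there.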

eCryptfs

One of the best known Linux filesystem encryption solutions is called eCryptfs (Enterprise Cryptographic Filesystem). It has been in the kernel since version 2.6.19 and is fully configured in some distributions. The eCryptfs filesystem stacks on top of other filesystems (called the lower filesystem), such as ext2/3/4, JFS, XFS, and others (i.e., any filesystem that has extended attributes). It encrypts and decrypts the files as they are being written to or read from the lower filesystem, operating on the files one at a time instead of at a block device or partition level. The metadata associated with the file is stored with the file itself on the lower filesystem. This can make the encrypted files a little larger than the decrypted version, but there are some clear advantages to this approach:

  • It allows files from different users to use different encryption keys, controlling access to the data.
  • You can move or copy the files to a different location, and they can be decrypted with the correct key (encryption of whole partitions or devices requires a different process before the files can be accessed).
  • You can give the file to other users, and they can decrypt it as long as the correct key is supplied as well.
  • You can use typical incremental backup processes because they can easily detect differences in files in the lower filesystem. This is almost impossible with encrypted partitions or block devices.
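
As an illustration of the incremental backup point, the following sketch lists the files in a lower directory that have changed since the last backup, using a timestamp file as the marker. The function name and the marker file path are hypothetical. The idea is to run it against the lower (encrypted) directory, so the backup only ever sees encrypted files.

```shell
# List regular files under directory $1 that were modified more recently
# than marker file $2.
changed_since() {
    find "$1" -type f -newer "$2"
}

# Hypothetical usage against the lower (encrypted) directory:
#   changed_since /home/laytonjb/private /var/backups/last-run.stamp
#   touch /var/backups/last-run.stamp    # after a successful backup
```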

A few quick disadvantages:

  • The CPU cycles required to perform the encryption are not small.
  • Because of the time involved to encrypt the filesystem, the speed (performance) of the encrypted filesystem is less than the lower filesystem. In fact, it can be quite a bit less in some cases, so only use an encrypted filesystem if performance is not a key consideration and security (encryption) is an extremely important requirement.
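
To get a rough feel for the performance cost on your own system, you can time the same write into an encrypted mountpoint and into an ordinary directory. The function below is only a crude sketch with one-second resolution. Its name and the test file name ddtest.bin are my own, and a real benchmark would use a dedicated tool and several runs.

```shell
# Write $2 MiB of zeros into directory $1 and print the elapsed seconds.
# conv=fsync makes dd flush the data to disk before it exits, so the page
# cache does not hide the cost of encryption.
time_write() {
    start=$(date +%s)
    dd if=/dev/zero of="$1/ddtest.bin" bs=1048576 count="$2" conv=fsync 2>/dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Hypothetical usage (remove ddtest.bin from each directory afterward):
#   time_write /home/laytonjb/private 512    # encrypted mountpoint
#   time_write /home/laytonjb/public 512     # ordinary directory
```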

The kernel component of eCryptfs has been in the kernel since 2.6.19. The userspace tools for eCryptfs can be obtained from the eCryptfs site. Follow the directions from the website that explain how to build and install the tools.

To use eCryptfs, it must be active either in the kernel or as a module. A simple way to get started is just to mount an eCryptfs filesystem in your account (i.e., a single directory). This has to be done as root because you are mounting a filesystem:

[root@test1 laytonjb]# mount -t ecryptfs /home/laytonjb/private /home/laytonjb/private
Select key type to use for newly created files: 
 1) tspi
 2) passphrase
 3) openssl
Selection: 2
Passphrase: 
Select cipher: 
 1) aes: blocksize = 16; min keysize = 16; max keysize = 32 (not loaded)
 2) blowfish: blocksize = 16; min keysize = 16; max keysize = 56 (not loaded)
 3) des3_ede: blocksize = 8; min keysize = 24; max keysize = 24 (not loaded)
 4) cast6: blocksize = 16; min keysize = 16; max keysize = 32 (not loaded)
 5) cast5: blocksize = 8; min keysize = 5; max keysize = 16 (not loaded)
Selection [aes]: 1
Select key bytes: 
 1) 16
 2) 32
 3) 24
Selection [16]: 2
Enable plaintext passthrough (y/n) [n]: n
Enable filename encryption (y/n) [n]: n
Attempting to mount with the following options:
  ecryptfs_unlink_sigs
  ecryptfs_key_bytes=32
  ecryptfs_cipher=aes
  ecryptfs_sig=f1c2c0b669730359
WARNING: Based on the contents of [/root/.ecryptfs/sig-cache.txt],
it looks like you have never mounted with this key 
before. This could mean that you have typed your 
passphrase wrong.
 
Would you like to proceed with the mount (yes/no)? : yes
Would you like to append sig [f1c2c0b669730359] to
[/root/.ecryptfs/sig-cache.txt] 
in order to avoid this warning in the future (yes/no)? : yes
Successfully appended new sig to user sig cache file
Mounted eCryptfs

When you mount for the first time, eCryptfs asks you a number of questions about security and encryption. In the beginning, you can easily accept the defaults. Notice that I have used eCryptfs before with a different passphrase, so it warned me that the signature for this key was not in the signature cache and asked whether I wanted to proceed.

In the mount command, the first path is the lower directory (the lower filesystem where the data is actually stored). In this case, the full path is used. The second path is the eCryptfs mountpoint. For this particular example, the lower filesystem and the mountpoint are the same, although they don’t have to be. Any file that is written to /home/laytonjb/private is encrypted and written to /home/laytonjb/private on the lower filesystem. Effectively, it looks like the directory /home/laytonjb/private is encrypted. Note that if I already had files in the directory /home/laytonjb/private, I wouldn’t be able to access them because eCryptfs was mounted over it. This is the same behavior when mounting a filesystem for any mountpoint – any files in the mountpoint are no longer accessible.
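
Some of the interactive answers can also be passed on the command line with -o. The sketch below builds an option string from the cipher and key size that appear in the mount output above. The function name is hypothetical, the key=passphrase option is taken from the eCryptfs mount helper documentation as I understand it, and the helper should still prompt for anything not supplied, so check the man page on your system before relying on this.

```shell
# Build a mount option string for the eCryptfs mount helper.
# $1 = cipher name, $2 = key size in bytes
ecryptfs_opts() {
    printf 'key=passphrase,ecryptfs_cipher=%s,ecryptfs_key_bytes=%s\n' "$1" "$2"
}

# Show the option string for the choices made in the example above.
ecryptfs_opts aes 32

# As root, the string could then be used like this:
#   mount -t ecryptfs -o "$(ecryptfs_opts aes 32)" \
#       /home/laytonjb/private /home/laytonjb/private
```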

You can check that the filesystem is mounted with the mount command:

[root@test1 laytonjb]# mount
/dev/sda2 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext2 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
gvfs-fuse-daemon on /home/laytonjb/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=laytonjb)
/home/laytonjb/private on /home/laytonjb/private type ecryptfs (rw,ecryptfs_sig=f1c2c0b669730359,ecryptfs_cipher=aes,ecryptfs_key_bytes=32,ecryptfs_unlink_sigs)

Notice the eCryptfs filesystem in the last line. If I do an ls on the directory, you can see that it looks just like any other directory:

[laytonjb@test1 ~]$ ls -lstar ~/private
total 8
4 drwxrwxr-x   2 laytonjb laytonjb 4096 Sep  6 13:10 .
4 drwx------. 46 laytonjb laytonjb 4096 Sep  6 13:19 ..

Now I’ll create a couple of files and a symlink in the directory:

[laytonjb@test1 ~]$ cd private
[laytonjb@test1 private]$ echo "hello foo" > foo
[laytonjb@test1 private]$ echo "hello bar" > bar
[laytonjb@test1 private]$ ln -s foo foo2
[laytonjb@test1 private]$ ls -l
total 24
-rw-rw-r-- 1 laytonjb laytonjb 10 Sep  6 13:23 bar
-rw-rw-r-- 1 laytonjb laytonjb 10 Sep  6 13:23 foo
lrwxrwxrwx 1 laytonjb laytonjb  3 Sep  6 13:23 foo2 -> foo
[laytonjb@test1 private]$ cat bar
hello bar

The encrypted filesystem is mounted, so I can easily create files that appear to be decrypted in that directory. However, if I umount the filesystem, you will see something different (note: root has to umount the filesystem):

[root@test1 laytonjb]# umount /home/laytonjb/private
[root@test1 laytonjb]# mount
/dev/sda2 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext2 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
gvfs-fuse-daemon on /home/laytonjb/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=laytonjb)
[laytonjb@test1 ~]$ cd private
[laytonjb@test1 private]$ ls -lstar
total 32
12 -rw-rw-r--   1 laytonjb laytonjb 12288 Sep  6 13:23 foo
12 -rw-rw-r--   1 laytonjb laytonjb 12288 Sep  6 13:23 bar
 0 lrwxrwxrwx   1 laytonjb laytonjb     3 Sep  6 13:23 foo2 -> foo
 4 drwxrwxr-x   2 laytonjb laytonjb  4096 Sep  6 13:23 .
 4 drwx------. 46 laytonjb laytonjb  4096 Sep  6 13:26 ..
[laytonjb@test1 private]$ cat foo
 
�
 ��ՍW0�-   "3DUfw`�.�}�M�>

The directory private is where the encrypted files are stored; notice the file output is gibberish, indicating encryption.
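
The listing also shows how much larger the lower files are: the 10-byte files occupy 12,288 bytes each. That number is consistent with an 8,192-byte header followed by the data encrypted in 4,096-byte extents. Those two constants are my assumption about the default eCryptfs on-disk layout, not something stated in the output above, so treat the following as a sketch.

```shell
# Assumed eCryptfs on-disk layout: an 8192-byte header followed by the
# file data encrypted in 4096-byte extents.
ECRYPTFS_HEADER_BYTES=8192
ECRYPTFS_EXTENT_BYTES=4096

# Print the expected lower-file size for a plaintext of $1 bytes ($1 > 0).
ecryptfs_lower_size() {
    extents=$(( ($1 + ECRYPTFS_EXTENT_BYTES - 1) / ECRYPTFS_EXTENT_BYTES ))
    echo $(( ECRYPTFS_HEADER_BYTES + extents * ECRYPTFS_EXTENT_BYTES ))
}

ecryptfs_lower_size 10    # prints: 12288
```

Under these assumptions, a directory full of very small files takes far more space on the lower filesystem than the plaintext would suggest.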

To decrypt a file in this example, mount the eCryptfs filesystem again and then copy the file from the mountpoint to a non-encrypted directory:

[laytonjb@test1 private]$ cp /home/laytonjb/private/foo /home/laytonjb/public/foo.decrypted

eCryptfs decrypts the file foo as it is read, and the copy is written as foo.decrypted to the directory /home/laytonjb/public, which is not an encrypted filesystem.

This quick example shows how to create an encrypted directory in your /home account. It is also possible for root or for the user to encrypt a user’s entire /home directory (as long as the user has permission for the mountpoint, which they should). Finally, a number of blogs and tutorials provide more detail on how to use eCryptfs.

Summary

As with everything else, encrypting your data has its pluses and minuses. On the plus side are the obvious benefits of having your data encrypted so prying eyes can’t make heads or tails of it. They may be able to copy it, but either they will have to know how to crack the encryption or they will have to “brute force” the decryption using lots of computational power. On the minus side, encryption will slow down the system I/O because of the computational load. Also, if you happen to forget your passphrase, you won’t be able to access your data again (unless you crack the algorithm or the passphrase). Encryption also makes data portability a bit more difficult. However, given the overall scrutiny of data security, these minuses might not be such a bad thing.

In this article, I presented two options for encrypting files, directories, or filesystems. The ability to encrypt these structures, in my opinion, provides a great deal of flexibility by allowing you to keep some data encrypted and some decrypted. For example, I might not want people to know my taste in music, but I might not mind if they see email, some documents, or my fantasy football picks. As a user, being able to control what is encrypted also helps me if I forget my passphrase, because in that case, I will not have lost everything.

In the first option, EncFS, encryption is in user space under the control of the user – bypassing requests, approval, and administration scheduling – so data can be encrypted whenever needed. One benefit of this is that fewer people are involved in the data encryption process, which can help with security in some cases.

The second option, eCryptfs, also allows you to encrypt files, directories, and filesystems, but requires administrator intervention for the creation and mounting of an encrypted directory. Many people use eCryptfs for encrypting their entire /home directory because it is so easy to use.