On-line Encrypted Backups for Your Laptop
Information on a filesystem can be encrypted to protect against unintended disclosure when a laptop is stolen; however, doing so doesn't allow you to access the files you've been working on if someone steals your laptop. If you have been traveling for a few weeks, making modifications to source code or office documents with your laptop, and it is lost or stolen, you still need to be able to access those updated files when you return home. On the other hand, if your laptop isn't stolen, you probably would like the peace of mind knowing the hard disk in the laptop is not the single point of failure for your important changes.
This article describes how to set up a system allowing backups to one or more on-line storage providers. You can choose either a free on-line storage provider or a paid service, depending on the consequences of losing your data or not having guaranteed immediate access to your backups.
You might find that many Internet connections made available to you when traveling have a very “protective” packet filtering system. For example, some hotels will filter all traffic that is not HTTP or HTTPS. Many on-line storage systems are made accessible over HTTP using the same HTTP operations performed by Web browsers. So, you still can upload your changes even when using very restrictive Internet connections. In this situation, other solutions, such as direct use of rsync over SSH, most likely will be filtered out.
One of the combinations described here should work with even the most restrictive Internet connections. Two applications of on-line backups come to mind. If you are working on some documents or a smallish code tree, using Omnidrive for storage is a good free backup solution. If you have a nice digital camera and a larger on-line storage space, you can back up digital pictures incrementally as you travel. So, if the external 80GB drive to which you transfer your digital pictures goes missing, you won't lose your memories. The latter requires a reasonably fast (and free-of-charge) Internet connection at your hotel and that you leave the laptop on to upload overnight.
The key to making access to storage easy is using FUSE to mount the on-line storage service. Using FUSE makes all storage services look the same (or similar, to be more accurate) to the higher-level encryption and synchronization software. However, some FUSE filesystems for mounting on-line storage offer slightly different implementations, which might require some working around at higher levels.
Because you are backing up important data to a server you don't control or perhaps fully trust, the next layer should provide security to your precious data. The eCryptfs filesystem was described in the April 2007 issue of Linux Journal. EncFS is a FUSE filesystem offering filesystem encryption. Both eCryptfs and EncFS take an existing filesystem (the base) and offer a new filesystem (the encrypted filesystem). Any data that is written to the encrypted filesystem is encrypted transparently and stored onto the base filesystem. Reading data will decrypt the information transparently from the base filesystem.
So, you can have storage mounted as FUSE (call this ~/rawfs) and then remounted with EncFS (at another mountpoint, ~/backupfs). Files copied to ~/backupfs are encrypted and saved to ~/rawfs, which then saves them to the on-line storage (Omnidrive, GMailFS, sshfs, Openomy or Amazon S3, whichever you mount using FUSE at ~/rawfs).
The simplest way to keep your backup fresh is to use rsync(1) from your local data (perhaps in ~/documents) to your encrypted on-line filesystem.
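The order of the three layers can be sketched as a small shell function. This is only an illustration under stated assumptions: it uses sshfs (covered later in this article) as the storage layer, the server name and paths are examples, and the run helper, the backup_cycle function and the DRY_RUN, RAW, ENC and SRC variables are names made up for this sketch. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Illustrative defaults; adjust to your own mountpoints and data directory.
RAW="${RAW:-$HOME/rawfs}"        # where the on-line storage is mounted
ENC="${ENC:-$HOME/backupfs}"     # encrypted view provided by EncFS
SRC="${SRC:-$HOME/documents}"    # local data to back up

# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_cycle: mount storage, mount encryption on top, sync, then unmount.
backup_cycle() {
    run sshfs "ben@myserv.example.com:/home/ben/online-storage" "$RAW" -o idmap=user &&
    run encfs "$RAW" "$ENC" &&
    run rsync -a "$SRC/" "$ENC/"
    # Unmount in reverse order: the encrypted view first, then the raw storage.
    run fusermount -u "$ENC"
    run fusermount -u "$RAW"
}
```

Calling backup_cycle prints the five commands in order; setting DRY_RUN=0 would execute them instead. Swapping the sshfs line for another FUSE mount command is the only change needed to use a different storage provider.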
Testing for this article was performed on a Fedora 7 machine. Some of the commands shown here, such as package installation commands, may be specific to the Fedora distribution.
Depending on your Linux distribution, you may need to add your user to the fuse group to be able to mount FUSE filesystems as a nonroot user. On Fedora 7, you would run the following command to enable the user ben to mount FUSE filesystems:
usermod -a -G fuse ben
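Whether a user already belongs to the fuse group can be checked before running usermod. The following is a minimal sketch; in_group is a helper name invented here, and it assumes group names contain no spaces. A new group membership normally takes effect only after the user logs in again.

```shell
# in_group USER GROUP: succeed if USER is listed as a member of GROUP.
in_group() {
    for g in $(id -nG "$1" 2>/dev/null); do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Report whether the current user may still need adding to the fuse group.
if in_group "$(id -un)" fuse; then
    echo "already in the fuse group"
else
    echo "not in the fuse group; run usermod as root, then log in again"
fi
```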
Next, let's examine some different on-line storage providers and how to mount them with FUSE.
OmniFS allows you to mount the Omnidrive storage provider as a FUSE filesystem. Installation and use of OmniFS goes like this:
$ tar xjvf omnifs-0.3.0.tar.bz2
$ cd ./omnifs-0.3.0
$ ln -s /usr/include/fuse /usr/local/include/fuse
$ ./configure
$ make
$ su -l
# make install
# ldconfig
# cp sample.cfg ~ben/my-omnifs.cfg
# chown ben.ben ~ben/my-omnifs.cfg
# exit
$ id -u -n ben
$ cd ~
$ edit my-omnifs.cfg
... change login, password, api-key and api-private-key
    set omnifs-log-file = /home/ben/omnifs.log
    either comment out the proxy setting
    or set proxy settings to be valid
...
$ mkdir ~/rawfs
$ omnifs -c my-omnifs.cfg ~/rawfs
When building omnifs, the configure step fails to find the FUSE headers unless I first create the link under /usr/local/include.
To configure the FUSE filesystem, first log in to Omnidrive's Web interface (web.omnidrive.com), and note the API and API-private keys for use in the configuration file. After logging in, the keys are available by clicking the Settings button in the top right of the browser and then the API tab in the center of screen.
By default, the omnifs command runs in the foreground, so it blocks the terminal as long as the FUSE mountpoint is valid. After running the omnifs executable to mount the FUSE filesystem, the remote storage appears just like any filesystem:
$ cd ~/rawfs
$ date >| foo.txt
$ cat foo.txt
Thu Aug 23 17:50:23 EDT 2007
$ ls -l
total 0
drwx------ 0 ben ben  0 2007-08-31 03:15 Downloads
-rwx------ 0 ben ben 29 2007-08-31 08:50 foo.txt
I found that omnifs occasionally can hang at “DEBUG: OMNI_ReadDir Called” in its log file. Restarting the omnifs executable usually helps get things going again.
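A hung mount can be spotted from a script by putting a time limit on a directory listing. This is a rough sketch rather than anything provided by omnifs: mount_responds is a made-up helper, it relies on the timeout command from GNU coreutils, and the ten-second default is an arbitrary choice. A process that is stuck hard inside the filesystem may not be interruptible, in which case the check itself could stall.

```shell
# mount_responds DIR [SECONDS]: succeed if listing DIR finishes in time.
mount_responds() {
    timeout "${2:-10}" ls "$1" >/dev/null 2>&1
}

# Example check against the raw mountpoint used in this article.
RAW="${RAW:-$HOME/rawfs}"
if ! mount_responds "$RAW" 10; then
    echo "$RAW is missing or not answering; consider restarting omnifs" >&2
fi
```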
Using SSH as the underlying transport for the FUSE filesystem limits usage to Internet connections that do not filter out non-Web traffic.
Given that you can use SSH directly with rsync, you might be wondering why bother with FUSE at all. Using SSH protects the transport of your information to the SSH server. Note that once the files you rsync to the server have been sent, they are not encrypted on the server's filesystem. If you don't have complete faith in the security of the SSH server, using sshfs to provide FUSE access lets you use the same cryptography discussed in the next section to protect your backups on the SSH server. Also, having all of your on-line storage accessible through FUSE lets you quickly change where you are storing an on-line backup without affecting the rest of the system.
In Fedora, sshfs already is packaged and can be installed with yum. Installation from source follows the standard configure path:
# yum install fuse-sshfs
Or:
$ ./configure && make
$ su -l
# make install
Assuming you are using public keys on the server into which you are ssh-ing, starting to use sshfs is easy. As shown in Listing 1, I first add the server's key to my SSH agent before ssh-ing into the server and creating a directory to use for my on-line storage. I exit the connection, mount that directory at ~/rawfs with sshfs, write the current date to a file there and unmount it again. The last commands ssh into the server once more to verify that the file holding the date has arrived in the on-line storage directory.
The mounting of sshfs can be tucked away into a script file, as shown in Listing 2. This can be convenient if you do not have a passphrase on the SSH key or if you do not always add (or want to add) that SSH key to your SSH agent.
Listing 1. Using sshfs to Mount an SSH Server
local$ ssh-agent bash
local$ ssh-add .ssh/myserv
...
local$ ssh myserv.example.com
ex.com$ mkdir online-storage
ex.com$ exit
local$ sshfs \
  ben@myserv.example.com:/home/ben/online-storage \
  ~/rawfs -o idmap=user
local$ date >| ~/rawfs/datefile1.txt
local$ fusermount -u ~/rawfs
local$ ssh myserv.example.com
ex.com$ cat online-storage/*txt
Fri Aug 24 17:16:40 EDT 2007
Listing 2. A Little Script to Mount Your sshfs
$ cat ~/bin/mount-sshfs-example.sh
#!/usr/bin/ssh-agent bash
ssh-add .ssh/myserv
sshfs \
  ben@myserv.example.com:/home/ben/online-storage \
  ~/rawfs -o idmap=user
If you are running a 2.6.20 kernel or later, eCryptfs should be ready for use without any setup work. Running a 2.6.22 Fedora 7 updated kernel, I had major problems getting eCryptfs to work properly where the base filesystem was stored on a FUSE filesystem. When I did get eCryptfs to mount, there were errors with trying to use rsync to the eCryptfs filesystem, which finally resulted in a kernel oops. I have eCryptfs working fine using a local ext3 filesystem to store its encrypted data, so I suspect it is an issue with eCryptfs and FUSE interaction. Depending on which distribution you are running, setting up eCryptfs to allow nonroot users to mount an encrypted filesystem also can require some tinkering with PAM.
EncFS is a FUSE filesystem that takes a “raw” filesystem and presents a new filesystem. Any files created on the new filesystem will be encrypted and stored to the raw filesystem. EncFS requires FUSE, OpenSSL and rlog. The FUSE EncFS filesystem can be installed either from your distribution's package repository or manually, like this:
yum install fuse-encfs
Or:
tar xzvf rlog-1.3.7.tgz
cd rlog-1.3.7
./configure && make
make install
cd ..
tar xzvf encfs-1.3.2-1.tgz
cd encfs-1.3.2
./configure && make
make install
The first time you attempt to mount a raw filesystem to an encrypted filesystem, EncFS will ask you what level of cryptography you desire and what passphrase to use. The same command is used to create an encrypted filesystem and to mount one. Subsequent mounts of the raw filesystem with EncFS will prompt you only for the passphrase. Initial mounting and remounting of EncFS on a rawfs (backed at the time by sshfs) is shown here:
$ encfs ~/rawfs ~/backupfs
Creating new encrypted volume.
Please choose from one of the following options:
 enter "x" for expert configuration mode,
 enter "p" for pre-configured paranoia mode,
 anything else... will select standard mode.
?>
Standard configuration selected.
Configuration finished. The filesystem ...
has the following properties:
Filesystem cipher: "ssl/blowfish", version 2:1:1
Filename encoding: "nameio/block", version 3:0:1
Key Size: 160 bits
Block Size: 512 bytes
Each file contains 8 byte header with unique IV data
Filenames encoded using IV chaining mode.
Now you will need to enter a password ...
You will need to remember this password, ...
no recovery mechanism. However, the password can
be changed later using encfsctl.
New Encfs Password:
Verify Encfs Password:
$ date > backupfs/datetest.txt
$ cat backupfs/datetest.txt
Fri Aug 24 20:44:33 EDT 2007
$ ls -l rawfs
total 4
-rw-rw---- 1 ben 505 37 2007-08-24 06:27 K9dmA...
$ fusermount -u backupfs
$ encfs ~/rawfs ~/backupfs
EncFS Password:
$ ls -l ~/backupfs
-rw-rw---- 1 ben 505 29 2007-08-24 06:27 datetest.txt
We now have a ~/backupfs filesystem that encrypts anything written to it and stores it on an on-line storage system somewhere. A great tool for keeping your on-line backup up to date is rsync(1).
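One thing worth guarding against is running rsync while ~/backupfs is not mounted, because the files would then land unencrypted in an ordinary local directory. A minimal sketch of such a guard follows; is_mounted is a helper name invented here, it assumes a Linux /proc/mounts table, and it expects an absolute path with no trailing slash and no spaces (spaces are escaped in /proc/mounts).

```shell
# is_mounted DIR [TABLE]: succeed if DIR is listed as a mountpoint in TABLE.
# TABLE defaults to /proc/mounts, whose second field is the mountpoint.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Refuse to continue unless the encrypted filesystem is really mounted.
ENC="${ENC:-$HOME/backupfs}"
if is_mounted "$ENC"; then
    echo "encrypted filesystem is mounted at $ENC"
else
    echo "refusing to back up: $ENC is not mounted" >&2
fi
```

The optional second argument exists only so the function can be tried against a saved copy of a mount table.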
The rsync manual page states: “The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network connection.”
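In our case, both the data to be backed up and the place to which we are backing up appear to rsync as local filesystems presented by the Linux kernel. With two local paths, rsync copies whole files by default instead of using its delta-transfer algorithm, so the saving comes from skipping files whose size and modification time already match. Because everything written to ~/backupfs has to travel over the Internet, we very much would like to limit the amount of data that is written to it.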
Some differences between a normal Linux kernel filesystem like ext3 and our layered setup might have to be worked around with command-line options to rsync. Listing 3 shows an rsync on an EncFS, which is using sshfs to provide the on-line storage. The first time rsync is run, the whole file is uploaded to the on-line storage. The second time, only some metadata is sent and received.
The -a option to rsync is similar to the -a option to the cp command; it attempts to preserve everything in the source filesystem at the destination. The --no-g command-line option to rsync tells it not to try to sync the destination file's group to the source file's group. In this case, sshfs does not allow me to change the group of the destination file, so rsync would generate a warning when it failed to set the remote file's group. The --delete-after option removes, once the transfer has finished, any files that exist only in the on-line storage filesystem. In this case, I also use --include to pick out the plain-text files. Note that an --include rule on its own does not stop rsync from copying other files; to sync only the matching files, it has to be followed by a catch-all --exclude="*" rule. Filtering like this can be quite handy for keeping backups of only OpenOffice.org documents in a larger filesystem.
Listing 3. Using rsync to Back Up Data to an Encrypted On-line Filesystem
$ rsync -av --delete-after \
    --include="*.txt" --no-g \
    small/ ~/backupfs
...
boysw10.txt
sent 49056 bytes  received 48 bytes
total size is 48923
$ rsync -av ...
sent 83 bytes  received 26 bytes
total size is 48923
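To restrict a sync to one kind of file, rsync's include and exclude rules have to be combined, because rules are checked in order and anything that matches no rule is copied. The sketch below tries this on a throwaway directory tree; sync_text_only and the file names are made up for the illustration, and nothing here touches the on-line storage.

```shell
# Build a small throwaway tree to demonstrate the filter rules.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/notes"
echo "text"  > "$src/a.txt"
echo "text"  > "$src/notes/b.txt"
echo "image" > "$src/photo.jpg"

# sync_text_only SRC DST: copy directories and *.txt files, skip the rest.
# The catch-all exclude must come last, after the include rules.
sync_text_only() {
    rsync -a --prune-empty-dirs \
        --include='*/' --include='*.txt' --exclude='*' \
        "$1/" "$2/"
}

if command -v rsync >/dev/null 2>&1; then
    sync_text_only "$src" "$dst"
    (cd "$dst" && find . -type f | sort)
fi
```

Only a.txt and notes/b.txt should arrive in the destination. The --include='*/' rule lets rsync descend into subdirectories, and --prune-empty-dirs drops directories that end up with nothing in them.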
Another rsync option that can be invaluable is --modify-window=n, where the parameter n is the number of seconds that the two timestamps can differ between the local and remote files and still be considered the same. When using a filesystem showing on-line storage, the modification time might range from not being perfectly accurate to being a few days off. Setting the --modify-window correctly can hide these slight timestamp drifts or large fixed timestamp offsets and allow rsync to continue to work efficiently.
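The comparison that --modify-window performs can be pictured with a few lines of shell arithmetic. This is only a model of the rule described above, not rsync's own code; same_mtime is a name invented for the sketch, and the timestamps are seconds since the epoch.

```shell
# same_mtime A B WINDOW: succeed if timestamps A and B differ by no more
# than WINDOW seconds, which is when rsync would treat them as equal.
same_mtime() {
    diff=$(( $1 - $2 ))
    [ "$diff" -lt 0 ] && diff=$(( -diff ))
    [ "$diff" -le "$3" ]
}

# A stored copy stamped 3598 seconds earlier than the local file,
# compared with a one-hour window.
if same_mtime 1187990200 1187986602 3600; then
    echo "treated as unchanged"
fi
```

The trade-off is that a wide window also hides genuine edits: a file changed within the window whose size stays the same would be skipped, so it seems sensible to choose the smallest value that covers the offset you actually observe.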
Running EncFS on top of OmniFS requires some special parameters when first mounting the EncFS. The main issue I found with using the default settings for EncFS was that file contents, when read back, would sometimes have trailing garbage. When using OmniFS and first creating the EncFS, choose expert mode, cipher=AES, keysize=256, blocksize=4096, filename encoding=Stream, filename IV chaining=Yes, per-file IV=no and block authentication code headers=no. The main issues seem to stem from the per-file IV setting and something going missing with the round-trip latency of OmniFS. Listing 4 shows some combinations of expert mode settings to EncFS when using OmniFS as the base filesystem and the resulting filesystem interaction.
Listing 4. Some EncFS Options and Their Results When Using OmniFS to Mount the On-line Storage
x, 1, 256, 4096, 2, R, n, R == OK
x, 1, 256, 4096, 1, n, R, R == BAD
x, 1, 256, 4096, 3, n, n, R == OK
x, 1, 256, 4096, 3, R, n, R == OK
Some filesystem people dislike FUSE because of the extra context switches it can introduce. The use of two FUSE filesystems layered on top of each other, as shown in this article, means there is quite a bit of context switching going on before data actually reaches the network. For the purposes of this article, the overhead of these context switches is irrelevant when compared to Internet connection speed.
Encrypting your home directory can give peace of mind in the event that your laptop is stolen. With on-line backups, you also are protected against losing your important changes along with your laptop or its crashing hard disk.
By using FUSE to expose the on-line storage as a filesystem, the encryption and synchronization can be left intact when you decide to change your on-line storage provider. The OmniFS filesystem uses HTTP to communicate with the on-line storage provider, so it should work even when your Internet connection has aggressive packet filtering.
Resources
“eCryptfs: a Stacked Cryptographic Filesystem” by Mike Halcrow, LJ, April 2007: www.linuxjournal.com/article/9400
Mounting eCryptfs as a Nonroot User: ecryptfs.sourceforge.net/ecryptfs-faq.html#nonroot
Openomy Storage Service: www.openomy.com
OpenomyFS: FUSE Filesystem for Openomy: mauricecodik.com/projects/ofs
GMailFS, Mount Your Gmail Account: richard.jones.name/google-hacks/gmail-filesystem/gmail-filesystem.html
FUSE: Filesystem in Userspace: fuse.sourceforge.net
Ruby FUSE Bindings: rubyforge.org/projects/fusefs
Create a Filesystem inside a Berkeley DB File: www.kernel.org/pub/linux/kernel/people/jgarzik/fs
Omnidrive, Free On-line Storage: www.omnidrive.com
OmniFS Home: users.tpg.com.au/panyam/omnifs.html
FUSE Filesystem for Mounting SSH: fuse.sourceforge.net/sshfs.html
EncFS FUSE Filesystem Home Page: arg0.net/wiki/encfs
Ben Martin has been working on filesystems for more than ten years. He is currently working toward a PhD combining Semantic Filesystems with Formal Concept Analysis to improve human-filesystem interaction.