Early LUKS: Linux suspend to disk (suspend2) and LUKS partitions

© 2006, Jens Gustedt, INRIA, France $Revision: 1962 $


This page contains information that proposes to change the way your system handles your most sensitive belongings, your data. Please keep the following phrase from the GPL in mind:

This software is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

In particular, what is proposed here, applied right or wrong, may end up eating your data.
You have been warned.


This describes a simple and robust setup to get encryption with LUKS and suspend2 going. Simple and robust in the sense that we leave as much as possible to the kernel. Our boot script will deviate from your usual boot procedure as little as possible. When you boot normally and control passes to your init process, it should find the system as it expects it. In addition to the usual physical disk partitions that your computer has, it should just see some other filesystems that in fact live on encrypted physical partitions. Otherwise nothing fancy, just doing it.

Please, don't think that encrypting your disk partitions will make your system safe. It will just make it safer than before, in particular when the system is off. If somebody manages to break into your running system, it is as vulnerable as before. If somebody has large amounts of computing power to break codes (or is perhaps just clever enough) he might be able to decrypt your partitions.

Why?

A laptop is a very volatile device: it easily gets stolen, you might simply lose it, or somebody might sneak into it when you are not aware of it. Any such case easily compromises the private or professional data on your hard disk. So you need some sort of encryption of all sensitive data on your disk.

Laptops are working tools that you carry around. Currently, they have one fundamental technological weakness, which is their power supply. To save battery power, you want to switch your laptop off when you are not using it, and then you want your laptop to be immediately responsive when you come back to it. This is what suspend to disk is for.

Unfortunately these two software components do not work peacefully side-by-side unless you set them up carefully. If you suspend to disk into an unencrypted partition or file, all the main memory of your computer is exposed, including passwords that some applications might cache somewhere. If you suspend to something encrypted, your system somehow has to get the keys to read that data when resuming.

How?

There are several solutions around in the linux world for both of these tasks. If you are not using LUKS for your encrypted partitions, or if you use a technique other than suspend2 for suspending, what is said here is probably not for you.

For our implementation of early LUKS with suspend to disk we create a so-called ramdisk that is statically linked to the linux kernel. For this we need five different components:

A computer
Sure. Everything here is tested on a machine with the `i386' architecture, that is a conventional PC. But there should be nothing special concerning the architecture; others for which there is a stable port of GNU/Linux (including suspend2) should work as well. Drop me a line if you succeed with such a thing. The computer should be installed with a GNU/Linux system; any decent and recent distribution should do. One important feature that it should have and use: udev, to create device nodes on the fly.
linux
I don't have to introduce this, but you need recent kernel sources. At the time of writing this, I am using 2.6.17.4.
suspend2
Nigel Cunningham's performant and stable implementation of suspend to disk for linux, my version is 2.2.7.3. To start, choose a version that is labeled as `stable'. You will also need the hibernate scripts available via the suspend2 site.
LUKS
Clemens Fruhwirth's nice implementation of hard-disk encryption, I have version 1.0.3.
busybox
The Swiss Army Knife of Embedded Linux, version 1.1.3.

Such a ramdisk is some sort of virtual disk that the kernel sees in an early booting phase. Think of it as an egg: your kernel executable is the embryo, the ramdisk is the yolk. All of this is protected by the eggshell from real life. The break through to reality is only possible when the chick, our kernel, is strong enough to peck a hole in the shell. In terms of the kernel this means: when the kernel knows how to access peripheral devices, in particular the hard disk.

Under usual circumstances the kernel should have enough knowledge from the start that it can access all important devices that are physically linked to your machine: keyboard, screen and disks. But when doing encryption, you need to provide the passphrase for the encrypted devices before the kernel can start to do anything sensible with them. The ramdisk that is proposed here serves just that purpose: to get the kernel over that critical moment.

Preparing your system

If you have a statically linked kernel (with at least all drivers for your hard disk) and do no encryption, you would (and should) not need a ramdisk. A good test of your setting, and of the knowledge that this setup requires, is that you are able to create a kernel for your system that needs neither ramdisk nor its old uncle initrd. So before you start tampering with encryption:

Backup your system.
Keep a working kernel configuration in a safe place. That is, some kernel that is renamed to something unique, e.g. /boot/vmlinuzForEver. Convince your bootloader to boot that rescue system. If you have a decent linux distribution, there is perhaps already such a thing, namely the standard kernel that came with it.
Backup your system.
Install the kernel sources and configure a static kernel, that is a kernel where all that is needed in early boot phases is statically linked. Especially important are your hardware busses (pci, scsi, usb,...), your disk drivers and file systems. Install it such that it uses neither ramdisk nor initrd. On some distributions `make install' automatically creates an initrd for the new kernel. Your kernel must be able to boot without it.
Here you should also include dm-crypt statically into your kernel, together with all parts that it needs, such as the crypto algorithm that you will use, most probably AES. You will need that later on. But don't install a crypto partition yet.
Backup your system.
Apply the suspend2 patches to your kernel source tree. Configure it. Compile. Install. Get it going such that it suspends to your swap partition and hibernate several times. In this step you should also take care that everything suspend2 needs is linked statically into the kernel, in particular the compression modules.
Backup your system.
Now you might also try to separate boot and root, that is put /boot on a partition of its own. With all you know now, this should not be too difficult, but you'd better not forget to tell your boot loader (lilo, grub) that you did so. This step is only necessary if you want to have an encrypted root partition later on. I personally don't have this, since I don't think that my root partition contains really sensitive data. In particular my /home, /usr and /var are different partitions. Please, drop me a line if you succeed with installing an encrypted root partition.
Backup your system.
In any case you will need a `spare' partition later on where you will put your first encrypted device, let us call it /dev/hdaXX. If you don't have any, you could use your swap partition. To do so, switch off swap
swapoff -a
and comment the corresponding line in /etc/fstab. Be careful with that partition. Remember its name well. All data on that partition will be lost.
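
Several of the steps above depend on options being compiled in statically (=y) rather than as modules (=m). The following is a minimal sketch of a check against a kernel .config. The function name not_static and the short list of option names in the example call are illustrative assumptions; extend the list with the bus, disk and file system drivers of your own machine.

```shell
#!/bin/sh
# Sketch: report kernel options that are not built in statically (=y).
# not_static and the option names in the example call are assumptions
# made for illustration, not part of the original setup.

not_static() {
    # $1: path to a kernel .config; remaining arguments: option names
    config=$1
    shift
    for opt in "$@"; do
        # "=y" means built in; "=m" (module) or unset is not good enough here
        grep -q "^${opt}=y\$" "$config" || echo "$opt"
    done
}

# Example call, from inside the kernel source tree:
#   not_static .config CONFIG_BLK_DEV_DM CONFIG_DM_CRYPT CONFIG_CRYPTO_AES
```

Every name that is printed still has to be switched to `y' before you go on; no output means that all listed options are static.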

Do not go beyond this line
if you did not succeed in these steps or if you did not really understand what all this was talking about.


Busybox

This is the executable that plays the role of basically the whole system tools that you are used to under a GNU/Linux system. Its primary feature for us is that it is a compact standalone tool that provides all features that an early system needs.

Most probably your distribution comes with a version of busybox, but also most probably this one will not exactly be what you need. So you will have to compile your own one. Think of this as compiling a second kernel part, the yolk to the egg. It is as simple as kernel compilation: ``make menuconfig'', so nothing to be scared of.

The general features that you want to have are, of course, STATIC, but also ``--install'' support: we will need that to install links dynamically within the ramdisk.

Get an early shell

The first task for your busybox will be to be a shell, called via the name /bin/sh. This shell should be sufficiently rich, close to what you are used to. To be able to use our initscript you must use ash. So this is the default shell of your choice.

For this ash we need some special features: alias and the ``-s'' option to the read-builtin.

Other executables

Once the initscript is launched we will need the executables that it launches. These are also mimicked by busybox, you just have to compile them in. The following are mandatory, without them your system will get stuck:

(u)mount
Used to mount /proc, /sys and eventually your root partition. Used to unmount /proc and /sys from the ramdisk when we are done.
mdev
Trigger the installation of device nodes in /dev. Uses udev to do so.
ln
Used to link the control device /dev/mapper/control.
echo
We need to be able to do simple output. Also give it the ``-n'' option, we need it.
sed
Do some simple command-line parsing.
test
Perform file system inquiries.
logger
We will log all early events as soon as possible.
rm
Do some cleanup.
switch_root
Change `/' to the newly mounted root partition. switch_root is very picky. It only works as init process (process number 1), so we have to exec into it.

In addition you certainly want to have other executables inside busybox, such as ls, mkdir or other system tools that will help you to rescue your system if something goes wrong. Read the documentation of busybox if you want to know more. As a last resort you could just try to compile everything in. Better check for the tools that are listed here.
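
To check a freshly compiled busybox against the mandatory list above, something like the following sketch may help. It assumes that your busybox prints the list of compiled-in applets when it is called without arguments; missing_applets is a hypothetical helper name, not a busybox feature.

```shell
#!/bin/sh
# Sketch: list the applets that initscript needs but a busybox binary lacks.
# missing_applets is a hypothetical helper; the applet names are the
# mandatory ones listed above.

missing_applets() {
    # stdin: the applet list that busybox prints when run without arguments
    # (one word per line after the tr call)
    have=$(tr -s ', \t' '\n\n\n')
    for applet in mount umount mdev ln echo sed test logger rm switch_root; do
        echo "$have" | grep -qx "$applet" || echo "$applet"
    done
}

# Example call:
#   /etc/early/busybox 2>&1 | missing_applets
```

This only tells you that an applet is present. Options of an applet, such as ``-n'' for echo or ``-s'' for read, still have to be checked by hand.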

LUKS

LUKS uses the kernel's device mapper features to map a decrypted picture of a physical block device to a virtual block device. Every tool that knows how to handle block devices then will know how to use the decrypted device, once this mapping is established.

So first of all, you need the static binary that maps your physical (encrypted) disk partitions to virtual (un-encrypted) block devices: cryptsetup. Get it fresh from the LUKS site. The one that comes with your distro might not yet have LUKS support in it.

There is also a tool named dmsetup which may come in handy if you want information on the mapped devices. But usually you would not need it.

For simplicity we will always assume that a `real' device node /dev/XXX will be mapped to the `virtual' device /dev/mapper/XXX, for example my swap partition corresponds to device /dev/hda5 and is visible as a block device with a living swap file system on /dev/mapper/hda5.
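
Written down as a tiny shell function, the naming convention looks like the following sketch. luks_paths is a hypothetical helper for illustration, it is not part of cryptsetup.

```shell
#!/bin/sh
# Sketch of the naming convention: /dev/XXX is mapped to /dev/mapper/XXX.
# luks_paths is a hypothetical helper, not part of cryptsetup.

luks_paths() {
    # $1: partition name, with or without the /dev/ prefix
    name=${1#/dev/}
    echo "/dev/$name /dev/mapper/$name"
}

# luks_paths hda5        prints: /dev/hda5 /dev/mapper/hda5
# luks_paths /dev/hda5   prints the same line
```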

Since you compiled dm-crypt and all the encryption code that it needs statically into the kernel, the static executable that you find on the LUKS page is all you need.

Remember the spare partition /dev/hdaXX that you freed above. We will now encrypt it, and for all of the following you will need to have root privileges.

cryptsetup luksFormat /dev/hdaXX
cryptsetup luksOpen /dev/hdaXX hdaXX
cryptsetup luksDump /dev/hdaXX
ls -la /dev/mapper
    

`luksFormat' asked you for a passphrase for that partition. LUKS can use several such passphrases for a partition. I think it is a good idea to use at least two of them, one your root password and one your personal passphrase. Add a second passphrase to the partition:

cryptsetup luksAddKey /dev/hdaXX
    

You will use several LUKS partitions later; it will be convenient if they all have one passphrase in common.

Now you may create a swap file system on that virtual partition:

[root@annot early]# mkswap -L SWAP /dev/mapper/hdaXX
Setting up swapspace version 1, size = 4293042 KBytes
LABEL=SWAP, UUID=3aa99677-5166-4f4f-9fd2-f7def9b379b8
[root@annot /]# /etc/early/busybox findfs LABEL=SWAP
/dev/mapper/hdaXX
[root@annot /]# swapon /dev/mapper/hdaXX
[root@annot /]# swapon -s
Filename                    Type      Size    Used Priority
/dev/mapper/hdaXX            partition 4192420 0    -1
    

If all of this worked flawlessly, change your /etc/fstab to reflect that change, instead of putting /dev/mapper/hdaXX you could put LABEL=SWAP. Try again.

[root@annot early]# swapon -s
Filename                                Type            Size    Used    Priority
/dev/mapper/hdaXX                       partition       4192420 0       -2
[root@annot early]# swapoff -a
[root@annot early]# swapon -s
[root@annot early]# swapon -a
[root@annot early]# swapon -s
Filename                                Type            Size    Used    Priority
/dev/dm-0                               partition       4192420 0       -3
    

You see, since I used the LABEL=SWAP the system chooses another device node for that same block device. This depends a lot on your udev settings, using a label just has the advantage to make you independent from that.

Note: For partition labels to work properly, you may have to ensure that not only the `named' devices in /dev/mapper are present. The utilities mount and swapon may also need device nodes of the form /dev/dm-N, where N is the minor device number of the device. Some distributions' udev may not create these devices.
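
If your udev does not create the /dev/dm-N nodes, you can find out which ones are missing from sysfs. The following is a sketch only: dm_mknod_commands is a hypothetical helper that prints mknod commands and changes nothing by itself. It assumes that sysfs is mounted on /sys and that each mapped device has a directory /sys/block/dm-N with a file `dev' containing major:minor.

```shell
#!/bin/sh
# Sketch: print the mknod commands that would create missing /dev/dm-N nodes.
# dm_mknod_commands is a hypothetical helper; it only prints, it changes nothing.

dm_mknod_commands() {
    # $1: sysfs block directory (default /sys/block)
    # $2: device directory (default /dev)
    sysblock=${1:-/sys/block}
    devdir=${2:-/dev}
    for entry in "$sysblock"/dm-*; do
        [ -r "$entry/dev" ] || continue      # no mapped device here
        node=$devdir/${entry##*/}
        [ -e "$node" ] && continue           # node exists already
        # the sysfs file "dev" holds major:minor
        IFS=: read -r major minor < "$entry/dev"
        echo "mknod $node b $major $minor"
    done
}

# Example call:
#   dm_mknod_commands
```

Look at the output first. If it is what you expect, run the printed commands as root.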

Don't create an encrypted partition with sensitive data yet. If anything goes wrong during boot with a swap partition, the system will simply not use it. If something goes wrong with a data partition, you will lose it.

Create the ramdisk

A ramdisk for booting is simply created during the kernel installation phase, there is not much more to it than to configure the location where everything is found. I am using a directory /etc/early for that. You should copy some files in that directory:

filelist.txt
Copy the file that is given here. Read it. Adapt it if necessary.
initscript
Copy the file that is given here. Read it. Adapt it if necessary.
busybox
This is the statically linked executable that you created above and that has all the features that initscript needs built into it.

Then you should give the ramdisk an /etc/fstab with the virtual filesystems that you want to mount early. This could be just a copy of the file as it is for your system, but then you risk having some of these mount points hidden behind your root device. For a first test just put the following two lines into /etc/early/fstab:

proc /proc proc defaults 0 0
sysfs /sys sysfs rw 0 0
    

Optionally you may also put the static version of cryptsetup here and/or adapt your filelist to reflect that choice. As it is, this executable would just be copied from its usual location into the ramdisk.
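
To give an idea of what such a file list looks like, here is a sketch that writes a minimal one. It is not the filelist.txt offered on this page. It assumes the text format understood by the kernel's gen_init_cpio tool (lines starting with dir, nod, file or slink), and the entries, modes and the name /init for the start script are assumptions chosen for illustration. write_filelist is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: write a minimal initramfs file list in gen_init_cpio format.
# This is NOT the filelist.txt from this page; entries and paths are assumptions.

write_filelist() {
    # $1: output file; $2: directory holding busybox, initscript and fstab
    src=${2:-/etc/early}
    cat > "$1" <<EOF
# dir   <name> <mode> <uid> <gid>
dir /dev 0755 0 0
dir /bin 0755 0 0
dir /proc 0755 0 0
dir /sys 0755 0 0
dir /etc 0755 0 0
# nod   <name> <mode> <uid> <gid> <type> <major> <minor>
nod /dev/console 0600 0 0 c 5 1
# file  <name> <source> <mode> <uid> <gid>
file /bin/busybox $src/busybox 0755 0 0
file /init $src/initscript 0755 0 0
file /etc/fstab $src/fstab 0644 0 0
# slink <name> <target> <mode> <uid> <gid>
slink /bin/sh busybox 0777 0 0
EOF
}

# Example call:
#   write_filelist /tmp/filelist.example /etc/early
```

In the 2.6 kernels that I know of, the kernel option that takes the path of such a list is called CONFIG_INITRAMFS_SOURCE; check the name in your own kernel version.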

Tell the kernel configuration where your filelist is found. There is a configuration option for that. Then:

make bzImage
    

Copy the resulting kernel to your boot directory, usually /boot, under a unique name, /boot/vmlinuzLUKS, say. Create an entry in your /etc/lilo.conf (or equivalent), name it LUKS. The command line should not contain a suspend2= argument for the moment.

It may contain an argument root=/dev/yourrootpartition. This may be different from the corresponding lilo assignment. Put something additionally in the append= section if the initscript tells you that it can't find the root partition.

Rerun lilo (or equivalent).

Boot your new configuration

Enable your ramdisk

Best is that you start slowly. Boot into the new kernel in single user mode. Give

LUKS 1
    

to the lilo boot prompt. If something goes wrong here, reboot into your normal system, repair the fault and retry. Wait, there is one thing that must go wrong here, there will be no swap, since /dev/mapper/hdaXX simply does not exist.

Enable LUKS

Tell initscript about your LUKS partition. Re-boot with

LUKS luks=hdaXX 1
    

Now you should be asked for a passphrase and the device should be mapped. If something goes wrong here, maybe you have the wrong cryptsetup or you forgot to add the `-s' option for read in the busybox shell executable.

If you successfully booted into that you should have your swap partition back.
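
Parameters such as luks= reach the early script through the kernel command line, which can be read from /proc/cmdline once /proc is mounted. The following sketch shows how a single parameter can be extracted with sed. cmdline_value is a hypothetical helper; the initscript offered on this page may well do this differently.

```shell
#!/bin/sh
# Sketch: pull one key=value parameter out of the kernel command line with sed.
# cmdline_value is a hypothetical helper, not the code used by initscript.

cmdline_value() {
    # $1: parameter name; $2: command line (default: contents of /proc/cmdline)
    line=${2-$(cat /proc/cmdline)}
    # pad with blanks so the parameter matches at either end of the line
    echo " $line " | sed -n "s/.* $1=\([^ ]*\) .*/\1/p"
}

# A colon separated luks= list can then be walked like this:
#   for name in $(cmdline_value luks | tr ':' ' '); do
#       echo "would map /dev/${name#/dev/} to /dev/mapper/${name#/dev/}"
#   done
```

If the parameter is not on the command line, the function prints nothing, so the caller can test for an empty result.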

Enable suspend2

Tell initscript about your suspend2 device. Reboot with

LUKS luks=hdaXX resume2=swap:/dev/mapper/hdaXX 1
    

Now during boot you should see a message from suspend2 that it found a valid swap signature and is ready to use it.

Hibernate several times. If something goes wrong when resuming you may use

LUKS luks=hdaXX noresume2
    

to reboot normally. But be sure to run fsck on all the filesystems that you previously had open when suspending.

If everything works fine, add the options luks=hdaXX resume2=swap:/dev/mapper/hdaXX to your LUKS command line in /etc/lilo.conf or similar and rerun lilo.

Boot normally into your LUKS kernel.

Tune your system

Congratulations, you have a working system with encrypted swap and suspend2. Now with all the experience that you have you will easily set up other LUKS partitions. There are a lot of web pages and wikis around that can help you on that. Keep in mind that setting up an encrypted partition destroys all the data on that partition, so be careful.

Partition layout

Whenever you create a new LUKS partition, just add the name to the luks command-line parameter for your kernel. It is a colon separated list of such names, with or without the /dev/ prefix. You might also put these block devices /dev/mapper/something into your /etc/fstab. If you are using filesystem labels this should also work. To mark a mapped LUKS partition that has an ext2 or ext3 filesystem on it with some label (here `TMP') do something like

e2label /dev/mapper/hdaYY TMP
    

Similar things should work for vfat partitions and swap (type 2).

Using partition labels (or UUIDs) helps to avoid errors that may occur when devices are attached in some random order. This may e.g. be the case for usb disks, other hotpluggable hardware or when you invert the listing of the LUKS partitions in the luks= command-line option. Nowadays device numbers are not so much fixed any more; you should try to identify your devices with something that is unique and reproducible in your setting. My /etc/fstab looks similar to the following:

proc /proc proc defaults 0 0
usbfs /proc/bus/usb usbfs rw 0 0
sysfs /sys sysfs rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
devpts /dev/pts devpts mode=620 0 0
LABEL=SWAP swap swap defaults 0 0
LABEL=ROOT /mnt/part/ROOT auto noauto 0 0
LABEL=USR /usr auto noatime 1 2
LABEL=VAR /var auto noatime 1 2
LABEL=SRC /mnt/part/SRC auto noatime 1 2
LABEL=HOME /home auto defaults 1 2
LABEL=TMP /tmp auto noatime 1 2
    

To use this feature you have to enable findfs in your busybox.

Also keep in mind that there is only relative security, not absolute. If you just want to be sure that nobody with a normal knowledge about linux will be able to get sensitive information out of your hard disk you need three different LUKS devices:

swap
suspend2 uses swap (or maybe a file) to dump all the kernel data before suspending. This space is only read on resume, not rewritten: reading is usually much faster and you want to have your system back ASAP. So the information remains on the hard disk until it is eventually overwritten.
tmp
Many tools write sensitive data, hopefully with restricted access rights, to /tmp. This is perhaps ok as long as your OS is protecting it from unauthorized access. But when the machine is off and your hard disk is only a few turns of a screwdriver away, you'd better have that information encrypted. Another option would be to have a large swap and mount some tmpfs over /tmp.
user files
You may just encrypt /home or give some users an extra encrypted partition to save their sensitive data.

initramfs or initrd

The way we described the installation above was with the so-called initramfs, that is a virtual filesystem that is statically linked to the kernel. This is probably the easiest way to get things going.

This approach has the disadvantage that with every new kernel you have a new copy of this whole filesystem hidden inside your vmlinuz executable sitting on your hard disk. initrd is an alternative that keeps the two separate. Here you find a small Makefile that creates earlyrd.cpio (in the local directory) and /boot/earlyrd.img. Your bootloader (lilo, grub or equivalent) will then be able to pass this image file to the kernel as the contents of the first rootfs. You have to compile your kernel with static ramdisk and initrd support enabled.
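
If you prefer to see the principle without the Makefile, the following sketch packs a prepared directory tree into a compressed cpio image. It is not the Makefile mentioned above. build_earlyrd is a hypothetical helper, and the sketch assumes that find, cpio and gzip are installed and that the tree already contains everything the ramdisk needs.

```shell
#!/bin/sh
# Sketch: pack a directory tree into a gzipped cpio image for use as initrd.
# build_earlyrd is a hypothetical helper, not the Makefile mentioned above.

build_earlyrd() {
    # $1: directory with the ramdisk contents; $2: image file to create
    # The kernel unpacks cpio archives in the "newc" format.
    # The redirection is done before the cd, so $2 may be a relative name.
    ( cd "$1" && find . | cpio -o -H newc ) | gzip -9 > "$2"
}

# Example call (as root, so that the files in the image belong to root):
#   build_earlyrd /tmp/earlytree /boot/earlyrd.img
```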

If you like such things, you could include some kernel modules in that initrd, such as usb or bootsplash. You should put here what you need temporarily during boot and that you typically want to unload afterwards. Usually there is no point in having a module here that stays; it should be static anyhow.

Passphrase on a memory stick

The script now supports an option to provide the LUKS passphrase via a memory stick, floppy or similar. You have to give an option of the form lukskey=partition:keyfile. Here partition can again be anything that identifies a block device: the device itself, a combined major/minor device number, a filesystem label or a UUID. This partition is only mounted very briefly, read-only, to fetch the passphrase from `keyfile'. `Keyfile' must be a filename relative to the indicated partition.

Obviously, at that point you need all the kernel drivers for the device on which this partition lives. This includes any usb busses, the usbfs filesystem, and probably also scsi. Typically such devices are slow; they need some seconds to be fully attached and to be visible to the system. Add an option rootdelay=N for some number N to the boot parameters if you see the device detection pop up just after the script decided to ask you for the passphrase instead.
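
The two halves of a lukskey= value can be separated with plain shell parameter expansion, as in the following sketch. Splitting at the last colon is an assumption about the syntax, and the two function names are hypothetical; the initscript itself may parse the option differently.

```shell
#!/bin/sh
# Sketch: split a lukskey=partition:keyfile value into its two parts.
# Splitting at the LAST colon is an assumption about the syntax, chosen so
# that a partition such as LABEL=KEYS or /dev/sda1 passes through unchanged.

lukskey_partition() { echo "${1%:*}"; }
lukskey_keyfile()   { echo "${1##*:}"; }

# lukskey_partition LABEL=KEYS:secret/laptop.key   prints: LABEL=KEYS
# lukskey_keyfile   LABEL=KEYS:secret/laptop.key   prints: secret/laptop.key
```

The label KEYS and the file name secret/laptop.key are made up for the example.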

Be careful when using such an external volatile device for your passphrase. First of all it is probably not a good idea to lose it... But you should also observe some technical considerations:

Loading kernel modules

If you like to have some more eyecandy and for some reason or another you cannot build the corresponding features statically into your kernel, you can try to get them from your hard disk early. As soon as you fetch such a thing from outside of your real kernel, to my taste it does not make much sense to have complicated mechanisms to load it into your initrd. Usually an unencrypted hard disk partition is present then, so better put the modules you need there. With the parameter modules=DEVICE[:path] you may have that partition mounted briefly, quite early, to fetch some modules from path, which is of course relative to that partition.

The mount is read-only, so it shouldn't do harm to a potential suspend2 image, provided the partition was mounted read-only when you suspended, too. So be sure that it is not mounted read-write before suspending.

Remounting filesystems read-only

All filesystems that are mounted read-write while suspending are in danger as soon as you try to mount them even read-only before resuming. This concerns in particular devices that contain encryption keys, modules or even your boot partition if you use the lilo trick that is offered by the hibernate script.

You could try to achieve a read-only remount with the hibernate script by adding something like

OnSuspend 97 mount /MNTPNT -n -o remount,ro
OnResume 97 mount /MNTPNT -n -o remount,rw

to common.conf. Here /MNTPNT stands for the directory path on which your device is usually mounted. The 97 means to do this very late in the suspend process and very early in the resume process. So all processes or services that rely on writing to that device can proceed as usual. The -n option prevents writing to /etc/mtab.
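
To verify that such a remount really took place, you can look at the mount options that the kernel reports. The following sketch does this for one mount point. mount_mode is a hypothetical helper; it assumes the usual /proc/mounts layout with the mount point in the second field and the options in the fourth.

```shell
#!/bin/sh
# Sketch: report whether a mount point is currently mounted ro or rw.
# mount_mode is a hypothetical helper; it reads the /proc/mounts format
# (device, mount point, type, options, two numbers).

mount_mode() {
    # $1: mount point; $2: mounts table (default /proc/mounts)
    mode=
    while read -r dev dir type opts rest; do
        [ "$dir" = "$1" ] || continue
        # a later line for the same mount point overrides an earlier one
        case ",$opts," in
            *,ro,*) mode=ro ;;
            *)      mode=rw ;;
        esac
    done < "${2:-/proc/mounts}"
    echo "${mode:-unmounted}"
}

# Example call:
#   mount_mode /boot
```

Anything other than `ro' or `unmounted' for a partition that is touched before resuming means that you should not suspend in that state.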

No root partition?

The initscript as it is given here allows you to be flexible and to have as many encrypted partitions as you like. In particular, it allows you to have no root partition at all, at least in the classical sense: there is no partition that is backed by some permanent physical storage device. With the early option norootdev, it builds a virtual filesystem of type tmpfs, which is nothing but a clever mapping of kernel pages, on top of the builtin root device of type rootfs. All other block devices are mounted `equally', none of them is a distinguished root device. If you want to use this feature, it is very likely that you have to adapt the boot scripts of your distro a bit. Be careful.

All encrypted?

I have no personal experience with that, but if you want more, you may have all your partitions encrypted. But wait, you have to boot from somewhere. So either you leave a small /boot partition unencrypted or you put your kernel with early LUKS on a floppy, usb-stick or similar. In the first case you are not protected against a `man-in-the-middle' attack: someone might be modifying your /boot partition while you are not looking and sniffing your LUKS passphrase for example. The second case is safer if you always keep that boot floppy safely with you and you never sleep, or if you implant that thing in one of your teeth or under your skull. In either case, I will certainly neither be able nor willing to help you to recover if you screw your system up.