Lead Image © Valentin Volkov, 123RF.com

Save and Restore Linux Processes with CRIU

On Ice

Article from ADMIN 22/2014
By
With CRIU, you can freeze the current state of a process and save it, then bring it back to life and continue from the point at which it was frozen.

Some maintenance work can only be done when no production software is running, but admins cannot terminate processes at will; otherwise, they risk losing data and time-consuming computations that will have to restart from scratch. The remedy on Linux systems is a small tool named "Checkpoint/Restore In Userspace," or CRIU.

CRIU freezes the current state of a process and saves it on the hard disk. Later, you can bring the process back to life; it then continues working at the point where CRIU froze it. For virtual machines, this is known as creating snapshots.

Freezing is not useful just to allow maintenance. Suspended processes can be moved to other computers and continue to run there. This live migration helps, for example, in load balancing scenarios: If a computer is just twiddling its thumbs, you can use a script to transfer a process to it. You can also freeze suspicious processes and analyze them on another system at your leisure. If you integrate CRIU in your system's startup scripts, it backs up processes automatically when you shut down and brings them back up on the next boot. In this way, you not only save the state before switching off but also shorten the boot process.

Numbers

Development is proceeding quickly. The first CRIU version appeared less than two years ago. When I started writing this article, only the first release candidate of version 1.1 was available; v1.1-rc2 and v1.1 followed soon after. Two more versions followed quickly, and when this article went to press, v1.3-rc2 was the most recent release. The following comments are based on the v1.1 release candidate. Other versions are available in the release archive [1].

CRIU does work entirely in userspace, but it imposes several requirements on the running system. First, the program only works on ARM or x86_64 systems; in the latter case, you must be running 64-bit Linux. Version 1.3-rc1 added AArch64 support. Furthermore, CRIU requires Linux kernel version 3.11 or later. Current desktop distributions satisfy this condition, but popular server distributions such as Debian 7 and CentOS 6.5 do not. Using CRIU on these systems thus requires a kernel upgrade.
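
The kernel requirement can be tested from a script before you go any further. The following is a minimal sketch and not part of CRIU: the function name kernel_at_least is made up for this example, and it compares only the major and minor numbers of a release string such as the one printed by uname -r.

```sh
# Hypothetical helper (not part of CRIU): succeed if a kernel release
# string is at least the given major.minor version.
# The body runs in a subshell, so its variables stay local.
kernel_at_least() (
    required=$1
    release=${2:-$(uname -r)}       # default: the running kernel
    release=${release%%-*}          # "3.13.0-24-generic" -> "3.13.0"

    req_major=${required%%.*}
    req_minor=${required#*.}
    req_minor=${req_minor%%.*}
    cur_major=${release%%.*}
    cur_minor=${release#*.}
    cur_minor=${cur_minor%%[!0-9]*} # keep only the leading digits

    if [ "$cur_major" -ne "$req_major" ]; then
        [ "$cur_major" -gt "$req_major" ]
    else
        [ "${cur_minor:-0}" -ge "$req_minor" ]
    fi
)

# Usage: kernel_at_least 3.11 || echo "kernel too old for CRIU" >&2
```

Passing a release string as the second argument makes it possible to check the kernel of another machine without logging in to it.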

The running kernel must also provide the information required by the CRIU functions. Table 1 lists the settings you need to enable when you compile the kernel. CRIU checks the kernel's compatibility with criu check.

Table 1: Required Kernel Functions

Variable                   Enable in the Configuration Menu
CONFIG_EMBEDDED            General setup | Embedded system
CONFIG_EXPERT              General setup | Configure standard kernel features (expert users)
CONFIG_EVENTFD             General setup | Configure standard kernel features (expert users) | Enable eventfd() system call
CONFIG_EPOLL               General setup | Configure standard kernel features (expert users) | Enable eventpoll support
CONFIG_CHECKPOINT_RESTORE  General setup | Checkpoint/restore support
CONFIG_NAMESPACES          General setup | Namespaces support
CONFIG_PID_NS              General setup | Namespaces support | PID Namespaces
CONFIG_FHANDLE             General setup | Open by fhandle syscalls
CONFIG_INOTIFY_USER        File systems | Inotify support for userspace
CONFIG_IA32_EMULATION      Executable file formats | Emulations | IA32 Emulation
CONFIG_UNIX_DIAG           Networking support | Networking options | Unix domain sockets | UNIX: socket monitoring interface
CONFIG_INET_DIAG           Networking support | Networking options | TCP/IP networking | INET: socket monitoring interface
CONFIG_INET_UDP_DIAG       Networking support | Networking options | TCP/IP networking | INET: socket monitoring interface | UDP: socket monitoring interface
CONFIG_PACKET_DIAG         Networking support | Networking options | Packet socket | Packet: sockets monitoring interface
CONFIG_NETLINK_DIAG        Networking support | Networking options | NETLINK: socket monitoring interface
CONFIG_MEM_SOFT_DIRTY      Processor type and features | Track memory changes
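
If you have the configuration file of a kernel at hand, you can compare it against Table 1 before compiling or booting. The sketch below is an illustration only: the function name missing_criu_options is hypothetical, the location of the config file is an assumption that varies by distribution, and the script treats an option as enabled if it is set to y or m.

```sh
# Hypothetical helper: print every option from Table 1 that the given
# kernel config file does not set to "y" or "m".
# The body runs in a subshell, so its variables stay local.
missing_criu_options() (
    config=$1
    for opt in EMBEDDED EXPERT EVENTFD EPOLL CHECKPOINT_RESTORE \
               NAMESPACES PID_NS FHANDLE INOTIFY_USER IA32_EMULATION \
               UNIX_DIAG INET_DIAG INET_UDP_DIAG PACKET_DIAG \
               NETLINK_DIAG MEM_SOFT_DIRTY; do
        grep -Eq "^CONFIG_${opt}=[ym]\$" "$config" || echo "CONFIG_${opt}"
    done
)

# Usage (the path is an assumption; many distributions store it there):
#   missing_criu_options "/boot/config-$(uname -r)"
```

An empty result means all listed options are present in the file; criu check remains the authoritative test on the running system.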

Because of the detailed requirements for the operating system kernel, the CRIU developers previously provided a suitable kernel. Since kernel 3.11, however, Linux possesses all the features necessary, so the old CRIU kernel is no longer needed and no longer recommended.

If the kernel meets all the requirements, you also need Google's Protocol Buffers library [2] [3], which is available in the repositories of most distributions. Besides the library itself, you need the corresponding development packages, the C bindings and the Protobuf C compiler. On Ubuntu and Debian the appropriate packages go by the names libprotobuf-c0-dev and protobuf-c-compiler; look out for similar names on other distributions.

CRIU also relies on iproute2, version 3.5.0 (August 2012) or later. Most recent distributions include this tool. If yours does not, as is the case for Debian 7, you will find the source code online [4].

Test Run

To build CRIU, you need the sources [5], the Make tool, and a C compiler. Unpack the archive and compile with make; a system-wide installation of the tool is neither intended nor necessary.

Before you deep freeze the first processes, a CRIU test is recommended. To do this, run this command as the root user:

criu check --ms

When done, CRIU should output Looks good (see Figure 1). Otherwise, the tool tells you which function is missing. Older versions of the tool still went by the name of crtools. Therefore, some instructions still circulating on the Internet refer to this command name.
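
Because both command names are in circulation, a script can look for whichever one is present. This is a minimal sketch with a hypothetical helper named first_available; whether an older crtools binary accepts the same check action and options is an assumption you should verify for your version.

```sh
# Hypothetical helper: print the first command name from the argument
# list that is available on this system; fail if none is found.
first_available() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Usage sketch (run as root); assumes crtools behaves like criu here:
#   tool=$(first_available criu crtools) &&
#       "$tool" check --ms 2>&1 | grep -q 'Looks good'
```

If you did not install CRIU system-wide, pass the path to the binary in the build directory instead of the bare command name.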

Figure 1: If CRIU reports "Looks good," the kernel provides all the features required by the tool.

The next test step takes place in the CRIU test subdirectory. Call the zdtm.sh script as root to start a test suite that starts multiple processes and freezes them for test purposes. A complete cycle takes a few minutes, during which the system can freeze repeatedly. If a problem occurs, the test suite aborts and tells you the root cause. After a successful run, you will only see the results of the last test (Figure 2).

Figure 2: The test suite checks to see whether the system meets all requirements for running CRIU.

Sandman

To freeze a process after successful tests, CRIU requires only the process ID and a storage location. The following command backs up the process with the PID 2238 to the checkpoint subdirectory below the user's home folder:

criu dump --images-dir ~/checkpoint --tree 2238

The criu command is always followed by the action to be executed; in this case, dump creates a backup (or dump image, if you prefer). The --images-dir (or -D) parameter specifies the target directory, and --tree (or -t) specifies the process ID. CRIU requires root privileges for all actions.

Freezing fails, however, if the process to be stored shares resources with the parent or any other process. In this case, CRIU cancels the action just to be on the safe side. For processes started from a shell, such resource sharing often cannot be avoided. You have three options here: Move a process into the background, start it in a separate session, or pass CRIU the additional --shell-job parameter (Figure 3):

criu dump -D ~/checkpoint -t 2238 --shell-job

Figure 3: The Top process, with ID 2238, can only be dumped with the --shell-job parameter.
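
The second option, starting the process in a separate session, can be sketched with the setsid tool from util-linux. The helper name start_in_new_session and the job name in the usage comment are hypothetical; the important points are the new session and the redirection of all three standard streams away from the terminal.

```sh
# Hypothetical helper: start a command in its own session, with stdin,
# stdout, and stderr detached from the terminal, in the background.
start_in_new_session() {
    setsid "$@" </dev/null >/dev/null 2>&1 &
}

# Usage: start_in_new_session ./long-running-job
# Then look up the PID (e.g., with pgrep) and dump it as root
# without the --shell-job parameter.
```

A process started this way no longer shares the terminal with the shell, which is the resource sharing that otherwise causes CRIU to cancel the dump.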

In the target directory (e.g., ~/checkpoint) CRIU creates several files for a backed-up process. Each file contains the state of a resource used by the process (Figure 4). CRIU overwrites existing files without warning.
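
Because existing files are overwritten, it can help to give every dump its own directory. The following sketch assumes a hypothetical helper named new_images_dir that creates a fresh subdirectory named after the PID and the current time; it is not a CRIU feature.

```sh
# Hypothetical helper: create and print a new, empty images directory
# below the given base directory, named after the PID and current time.
# mkdir without -p fails if the directory already exists, so an earlier
# dump is never reused. The body runs in a subshell to keep variables local.
new_images_dir() (
    base=$1
    pid=$2
    dir="$base/$pid-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$base" && mkdir "$dir" && echo "$dir"
)

# Usage (as root):
#   dir=$(new_images_dir ~/checkpoint 2238) &&
#       criu dump -D "$dir" -t 2238 --shell-job
```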

Figure 4: CRIU provides insight into the contents of a stored process.

After backing up the process, CRIU terminates it. The process's last output remains in the terminal, possibly unformatted, as shown in Figure 5. If you want the process to keep running after the dump, pass CRIU the --leave-running parameter.
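
A dump that leaves the process running can serve as a snapshot of a long computation. The wrapper below is a sketch under that assumption; the name snapshot_running is hypothetical, and it adds only a sanity check of the PID before calling criu with the parameters described above.

```sh
# Hypothetical wrapper: dump a process tree but keep it running.
# Returns 2 without calling criu if the first argument is not a PID.
snapshot_running() {
    pid=$1
    dir=$2
    case "$pid" in
        ''|*[!0-9]*) echo "not a PID: $pid" >&2; return 2 ;;
    esac
    criu dump --images-dir "$dir" --tree "$pid" --leave-running
}

# Usage (as root): snapshot_running 2238 ~/checkpoint
```

For a process started from a shell, you would additionally need the --shell-job parameter.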

Figure 5: The Top process backed up by CRIU leaves its last output in the terminal.
