Lead Image © Dmitry Pichugin, Fotolia.com

Backups using rdiff-backup and rsnapshot

Brothers

Article from ADMIN 31/2016
The easier you can back up and restore data, the better. Mature Linux tools show that performing regular, automated backups doesn't have to be a pain.

The first step in ensuring comprehensive backups is to decide where the backups should be stored. Often, a separate backup server connects to the other computers and initiates the backups. Alarm bells will be ringing for security-conscious administrators at this point: The backup server can connect to all the other machines! Safeguarding the backup server and its connection scheme is therefore extremely important, not least because the production data for all systems ends up on the backup server.

Automated backups in Linux usually require a user who connects to the system to be backed up using public key authentication. Two security aspects are critical: First, the user needs root rights on the system being backed up to read all the data, and, second, the private SSH key used for automation is not protected by a passphrase. In this article, I provide detailed instructions for counteracting these weak points with the following simple restrictions:

  • Create a separate key pair for the backup user and use authorized_keys to limit the commands this key may run on the systems to be backed up.
  • Create a sudo configuration for the backup user that allows only the backup program (rdiff-backup or rsnapshot) to run without a password prompt.
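
The two restrictions above could look roughly like the following sketch. It only prints the two configuration lines so that you can review them before installing anything. The account name, IP address, binary path, and public key are placeholders, and the forced command assumes rdiff-backup's --server mode; the configuration your own setup needs may differ.

```shell
#!/bin/sh
# Sketch only: print a restricted authorized_keys entry and a sudoers rule
# for a hypothetical backup user. All names and paths below are assumptions.
set -eu

BACKUP_USER="backup"                  # assumed account on the system being backed up
BACKUP_SERVER_IP="192.0.2.10"         # placeholder address of the backup server
RDIFF_BIN="/usr/bin/rdiff-backup"     # assumed install path
PUBKEY="ssh-ed25519 AAAAplaceholder backup@backupserver"   # placeholder public key

# authorized_keys: whatever command the key holder requests, sshd runs only
# the forced command, and only for connections from the backup server.
authorized_keys_line() {
    printf 'from="%s",command="sudo %s --server",no-pty,no-port-forwarding,no-agent-forwarding,no-X11-forwarding %s\n' \
        "$BACKUP_SERVER_IP" "$RDIFF_BIN" "$PUBKEY"
}

# sudoers: exactly one command (with exactly these arguments) may run as
# root without a password. Install it with visudo, e.g. in /etc/sudoers.d/.
sudoers_line() {
    printf '%s ALL=(root) NOPASSWD: %s --server\n' "$BACKUP_USER" "$RDIFF_BIN"
}

authorized_keys_line
sudoers_line
```

The first line belongs in the backup user's ~/.ssh/authorized_keys on each system to be backed up; the second is the matching sudo rule. Pinning the arguments in the sudo rule keeps the key from being used to run rdiff-backup in any other mode as root.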

rdiff-backup vs. rsnapshot

The two command-line tools rdiff-backup and rsnapshot are well-known backup programs in Linux. After initial configuration, their simplicity and reliability are very impressive. Table 1 shows the most important functions for both tools and provides some initial information about backup concepts.

Table 1

rdiff-backup and rsnapshot Differences

Feature | rdiff-backup | rsnapshot
--------|--------------|----------
Programming language | Mainly Python | Completely in Perl
Data transfer | Uses librsync | Uses rsync
Data storage | Old versions are saved as increments or deltas to the current version. | Files that don't change are stored as hard links using snapshots.
Data access | The last data version (mirror) can be accessed immediately; older versions can be restored via increments. | All snapshot data can be accessed immediately.
Removing backups | Backups can be removed using --remove-older-than (i.e., versions that are older than a certain time). | Backups run at certain intervals (e.g., daily or weekly); retain controls which type of snapshot is retained for how long.

Rdiff-backup, as the name suggests, saves the delta between current data and an old version as a reverse diff. If a file changes, only the changes to the previous version are stored in a backup. The current data version or mirror can then be used straightaway. Older versions are computed from the diffs.

Rsnapshot takes another path: If a file doesn't change between two snapshots, it simply creates another hard link to the file. Identical files then don't take up any more space than needed. Unlike rdiff-backup, there is no diff calculation: If a file changes, it is stored in full in the next snapshot.
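
The hard link idea can be tried out without rsnapshot. The following sketch emulates two snapshot directories by hand in a temporary directory; the directory and file names are made up for illustration, and rsnapshot itself does this work through rsync rather than with ln.

```shell
#!/bin/sh
# Illustration only: emulate two snapshots that share an unchanged file
# through a hard link. Directory and file names are hypothetical.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/daily.1" "$work/daily.0"

# Older snapshot
echo "127.0.0.1 localhost" > "$work/daily.1/hosts"
echo "old" > "$work/daily.1/motd"

# Unchanged file: add a second directory entry for the same inode, no copy.
ln "$work/daily.1/hosts" "$work/daily.0/hosts"
# Changed file: written out in full as a new, independent file.
echo "new" > "$work/daily.0/motd"

inode_old=$(ls -i "$work/daily.1/hosts" | awk '{print $1}')
inode_new=$(ls -i "$work/daily.0/hosts" | awk '{print $1}')
links=$(ls -l "$work/daily.0/hosts" | awk '{print $2}')

echo "hosts inode in daily.1: $inode_old"
echo "hosts inode in daily.0: $inode_new"
echo "link count for hosts:   $links"
```

Both hosts entries report the same inode, so the file occupies disk space only once, whereas motd exists as two complete files.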

Data Backup Using rdiff-backup

Backups using rdiff-backup are created from a source directory and a target directory. The following example backs up the /etc directory to /mnt/backup:

# rdiff-backup /etc /mnt/backup
# ls /mnt/backup/hosts
/mnt/backup/hosts

Forward slashes at the end of directory names (trailing slashes) are ignored, so it doesn't matter whether you use them here or not. However, in rsnapshot, you have to use trailing slashes in the rsnapshot.conf file. The example above also shows that the files are located directly below /mnt/backup/: /etc/hosts was backed up to /mnt/backup/hosts. If you want a subdirectory per source (e.g., /mnt/backup/etc), you need to specify it yourself.
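
One way to handle the subdirectories is a small wrapper that derives the target from the source path. The following is a minimal sketch: backup_target is a hypothetical helper, /var/www/ is just an example source, and the script prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: derive a per-directory target below the backup root so that
# /etc lands in /mnt/backup/etc instead of directly in /mnt/backup.
set -eu

BACKUP_ROOT="/mnt/backup"   # backup root used in the examples above

# Print the target directory for a source directory (hypothetical helper).
backup_target() {
    src=${1%/}              # drop one trailing slash, if any
    printf '%s/%s\n' "${BACKUP_ROOT%/}" "${src#/}"
}

# Dry run: print the commands; remove "echo" to execute them.
for dir in /etc /var/www/; do
    target=$(backup_target "$dir")
    echo mkdir -p "$target"
    echo rdiff-backup "$dir" "$target"
done
```

Mirroring the source path below the backup root keeps several sources apart and makes it obvious where a restored file belongs.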

Rdiff-backup does not provide a progress bar during the backup, but verbosity levels are available for anyone who wants to know what is being backed up at the time. Level 5 reports each changed file, whereas level 6 lists every file processed:

# rdiff-backup -v5 /etc/ /mnt/backup
[...]
Incrementing mirror file /mnt/backup
Processing changed file X11
Incrementing mirror file /mnt/backup/X11
Processing changed file X11/Xreset
[...]

The --compare function is also very useful; it performs a kind of trial run and lists the files that have changed. In this way, you know in advance which data would be backed up:

# rdiff-backup --compare /etc/ /mnt/backup
changed: .
 changed: hosts
 changed: mtab

To perform another backup, you just need to execute the same command again. Continuous backups have the advantage of allowing you to access different versions of data by backup time (Listing 1). The --list-increments option displays how many backups are available and when they were made. The current version is listed in the Current mirror line; its data can be accessed as normal files, whereas older versions are reconstructed from the increments.

Listing 1

Incremental Backups

# rdiff-backup /etc /mnt/backup
# rdiff-backup --list-increments /mnt/backup/
Found 2 increments:
   increments.2015-03-15T09:15:19+01:00.dir Sun Mar 15 09:15:19 2015
   increments.2015-03-19T20:15:46+01:00.dir Thu Mar 19 20:15:46 2015
Current mirror: Sat Mar 21 08:43:49 2015
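
The timestamps in Listing 1 can be fed back to rdiff-backup to restore an older version of a file. The following sketch only prints the restore commands. It assumes the classic command-line syntax with --restore-as-of (short form -r); restore_cmd and the destination paths are hypothetical, and newer rdiff-backup releases may prefer a different invocation, so check the man page for your version.

```shell
#!/bin/sh
# Sketch: build restore commands for a single file from an rdiff-backup
# repository. Prints the commands rather than executing them.
set -eu

BACKUP_DIR="/mnt/backup"    # repository from the examples above

# $1 = time spec, $2 = path relative to the repository, $3 = destination
restore_cmd() {
    printf 'rdiff-backup --restore-as-of %s %s/%s %s\n' \
        "$1" "${BACKUP_DIR%/}" "$2" "$3"
}

# Exact timestamp taken from the --list-increments output
restore_cmd 2015-03-15T09:15:19+01:00 hosts /tmp/hosts.restored
# Relative time spec: the state as of three days ago
restore_cmd 3D hosts /tmp/hosts.3-days-ago
```

Restoring to a scratch location such as /tmp first lets you inspect the old version before overwriting anything on the live system.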

Metadata and increments or diffs are in the rdiff-backup-data directory. It is at least as important as the remaining backup data. After all, the increments are responsible for letting you restore data from previous backups. You thus also need to think about backing up your backups. If the backup system bites the dust, or something goes wrong with rdiff-backup, this mustn't become a huge problem.

Excluding files

Being able to exclude files from a backup is just as useful as being able to include them. The simplest way is to pass the files to be excluded to rdiff-backup using --exclude:

# rdiff-backup --exclude /etc/ld.so.cache /etc /mnt/backup
# ls /mnt/backup/ld.so.cache
ls: cannot access /mnt/backup/ld.so.cache: No such file or directory

This is just the easiest approach to excluding files. Shell patterns, regular expressions, and exclude lists are also supported. In its simplest form, an exclude list consists of the paths of the files to be excluded:

# cat exclude-list
/etc/wpa_supplicant
/etc/dump

This list then serves as a parameter for the --exclude-filelist,

# rdiff-backup --exclude-filelist exclude-list /etc/ /mnt/backup

or --exclude-globbing-filelist options. Globbing lists allow the use of patterns [1].
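
A globbing list might look like the following sketch. The patterns are hypothetical examples and rely on rdiff-backup's documented globbing rules, in which * does not match across a slash but ** does. The script writes the list to a temporary file and prints the resulting command instead of running it.

```shell
#!/bin/sh
# Sketch: create a globbing exclude list and show the matching command.
# The patterns below are made-up examples, not a recommended set.
set -eu

list=$(mktemp)
trap 'rm -f "$list"' EXIT

cat > "$list" <<'EOF'
/etc/wpa_supplicant
/etc/*.old
/etc/**.bak
EOF

# /etc/*.old   matches only entries directly below /etc
# /etc/**.bak  matches .bak files at any depth below /etc
cat "$list"
echo rdiff-backup --exclude-globbing-filelist "$list" /etc/ /mnt/backup
```

Running rdiff-backup --compare against the same source and target is a cheap way to see which files a backup would pick up before the real run.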
