New features in the Bareos Bacula fork
Better Backups
For years, an open source version of Bacula has been a popular solution for managing "backup, recovery, and verification of computer data" [1] on a network of diverse computers, operating systems, and storage media. Using the client-server model, Bacula scales from single computers to enterprise installations of hundreds of entities.
The open source version of Bacula was first published in 2002 and quickly found support in the community. Recently, less and less work has been put into the free Bacula, and new commits into the public Git project now occur only once every few months, with the developers seemingly focusing on the commercial Bacula Enterprise Edition, which is not publicly developed.
In 2010, long-standing Bacula developer Marco van Wieringen therefore started to maintain, in a separate Git repository, enhancements and code cleanups that either were not accepted or were proposed only for integration into the commercial version. From this seed grew the decision by some former members of the Bacula community to continue development of an independent fork named Bareos.
The first stable release was Bareos 12.4 in April 2013 (the version number stands for the year and the quarter of the feature freeze). The current beta is version 13.2. On September 25, 2013, at the Open Source Backup Conference, formerly known as the Bacula Conference [2], the Bareos project was introduced to an interested audience.
Before you start working with Bacula or Bareos or start planning a test installation, you should take a look at how the tools function (Figure 1). The basic structure always consists of a control unit, the Backup Director, one or more Storage daemons, and the File daemons on the clients to be backed up.
The File daemon is responsible for backing up data from the client and restoring data to it. It runs permanently on each client and carries out the Director's instructions.
The Director is the controller: It contains all the logic and accounts for most of the settings. Its configuration file describes the following:
- The database configuration
- All client systems and how they are addressed
- Which files should be backed up (a FileSet)
- The plugin configuration
- The before and after jobs (i.e., programs that are started before or after a backup job, e.g., to start and stop services)
- The storage and media pool with its properties and retention times
- The backup schedules
- Addresses for messages
- Jobs and JobDefs (job defaults)
Defining storage, a FileSet, and a client is not enough. These components are brought together by jobs, which define what is backed up, where it is stored, and when the backup runs.
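As a rough sketch of how a job ties these pieces together, the following Director fragment references a client, a FileSet, a schedule, storage, and a pool by name. The resource names (client2-fd, HomeDirs, Nightly, File) and the script paths are hypothetical, and the referenced resources are assumed to be defined elsewhere in the same configuration file.

```
# Sketch only: resource names and script paths are assumptions.
Job {
  Name = "BackupClient2"
  Type = Backup
  Client = client2-fd        # which machine to back up
  FileSet = "HomeDirs"       # what to back up
  Schedule = "Nightly"       # when to back up
  Storage = File             # where the data goes
  Pool = File
  Messages = Standard
  # Programs run on the client before and after the backup
  Client Run Before Job = "/usr/local/bin/stop-service.sh"
  Client Run After Job = "/usr/local/bin/start-service.sh"
}

Schedule {
  Name = "Nightly"
  Run = Full 1st sat at 21:00
  Run = Incremental mon-fri at 23:05
}
```

Settings shared by many jobs can instead be collected in a JobDefs resource, so that each Job only overrides what differs.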
The retention period for the backup data is controlled by File Retention, Job Retention, and Volume Retention periods. It makes sense to use only Volume Retention to control the retention times, because if several retention options overlap, you might experience surprising effects.
Volume Retention is defined per pool. By defining several pools, you can also work with different retention periods, such as for different systems or different backup types (e.g., full, differential, or incremental). The specified periods are the minimum retention periods.
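A minimal sketch of this approach, assuming one pool per backup level, might look as follows. The pool names and the retention periods are illustrative values, not recommendations.

```
# Sketch only: pool names and retention periods are illustrative.
Pool {
  Name = Full
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 365 days   # full backups are kept at least one year
}

Pool {
  Name = Incremental
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 30 days    # incrementals are kept at least 30 days
}

# In the Job or JobDefs resource, select a pool per backup level:
#   Pool = Incremental
#   Full Backup Pool = Full
#   Incremental Backup Pool = Incremental
```

Because the retention period is a minimum, a volume is only recycled after the period has expired and the volume is actually needed again.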
Improved Usability
One focus in Bareos's development is keeping the obstacles for newcomers as low as possible. Because newcomers are usually overwhelmed by configuration options, the Bareos project offers package repositories for popular Linux distributions and Windows [3]. For Windows, additional packages for the OPSI [4] software management solution are also offered. All versions are built automatically by the project's own instance of the Open Build Service (OBS) [5]. In comparison, Bacula.org offers only the source code, and Windows binaries are only available for cash.
On Linux, you just need to add the appropriate repository to install a Bareos server and then install the Bareos packages. Bareos supports three database back ends: MySQL, PostgreSQL, and SQLite. SQLite should only be used for test installations.
Most optimization effort in the future will flow into the PostgreSQL connection. To ensure that the desired back end really is installed, you need to select the packages bareos and bareos-database-postgresql (or bareos-database-mysql, if you prefer).
The database must be installed separately; Bareos only contains dependencies on database clients. This makes it possible for the database to run on a computer other than the Bareos server itself.
Unlike Bacula, Bareos defines the database to be used in the configuration file. In Bacula, you must build a version specifically for the respective database.
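For illustration, the database back end is selected in the Catalog resource of the Director configuration. The fragment below is a sketch; the catalog name, database name, user, and the commented-out host name are placeholders and should be checked against the file created by your packages.

```
# Sketch only: names and credentials are placeholders.
Catalog {
  Name = MyCatalog
  dbdriver = "postgresql"        # selects the back end at run time
  dbname = "bareos"
  dbuser = "bareos"
  dbpassword = ""
  # dbaddress = db.example.com   # only if the database runs on another host
}
```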
When you first install Bareos, it populates the configuration files in the /etc/bareos directory with meaningful values. After the installation, the admin needs to initialize the database and start the services (Listing 1).
Listing 1
Starting Services
su postgres -c /usr/lib/bareos/scripts/create_bareos_database
su postgres -c /usr/lib/bareos/scripts/make_bareos_tables
su postgres -c /usr/lib/bareos/scripts/grant_bareos_privileges
service bareos-dir start
service bareos-sd start
service bareos-fd start
In the automatic configuration, the backup is to disk by default (in /var/lib/bareos/storage). Bareos backs up to disk in exactly the same way as it backs up to a tape library. That is, files are created below /var/lib/bareos/storage, each corresponding to a tape. The advantage of this method is that uniform rules apply and retention times are handled in the same way for tapes and disks. The maximum file size and the maximum number of files are defined in the pool resource of the Director daemon's configuration (i.e., the /etc/bareos/bareos-dir.conf file).
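As a sketch of such limits, the pool resource below caps both the size of each volume file and the number of volumes. The values are illustrative assumptions; with these numbers, the pool could grow to roughly 100 x 50 GB on disk.

```
# Sketch only: the size and count limits are illustrative.
Pool {
  Name = File
  Pool Type = Backup
  Maximum Volume Bytes = 50G   # upper size limit for each volume file
  Maximum Volumes = 100        # upper limit on the number of volumes
}
```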
To create a virtual tape, you need to start the bconsole program, which welcomes you with an asterisk prompt. After running label and assigning a name (in this example, file1), press 2 for the defined File pool (Listing 2). With status director, you can view the next scheduled jobs (Listing 3).
Listing 2
Labeling the Virtual Tape
*label
Automatically selected Storage: File
Enter new Volume name: file1
Defined Pools:
     1: Default
     2: File
     3: Scratch
Select the Pool (1-3): 2
Connecting to Storage daemon File at bareos:9103 ...
Sending label command for Volume "file1" Slot 0 ...
3000 OK label. VolBytes=186 Volume="file1" Device="FileStorage" (/var/lib/bareos/storage)
Catalog record for Volume "file1", Slot 0 successfully created.
Requesting to mount FileStorage ...
3001 OK mount requested. Device="FileStorage" (/var/lib/bareos/storage)
*
Listing 3
Status Display
*status director
Scheduled Jobs:
Level        Type    Pri  Scheduled        Name           Volume
=====================================================
Incremental  Backup  10   18-Jul-13 23:05  BackupClient1  file1
Full         Backup  11   18-Jul-13 23:10  BackupCatalog  file1
...
The backups are scheduled in the configuration file for 23:05 (BackupClient1: filesystem) and 23:10 (BackupCatalog: backup of the database itself). To perform a test backup, you can launch it with the run command, specifying only which client you want to back up. The results are displayed by calling the status director command (Listing 4).
Listing 4
Status Director
*status director
...
Terminated Jobs:
JobId  Level  Files  Bytes    Status  Finished         Name
=====================================================
1      Full   135    6.679 M  OK      18-Jul-13 16:00  BackupClient1
2      Incr   0      0        OK      18-Jul-13 16:01  BackupClient1
...
The status scheduler command shows when jobs are scheduled, and status scheduler days = 365 does this for an entire year in advance.
Improvements
Beyond the installation, a number of improvements make life easier for the Bareos administrator: Anyone who has ever worked with Bacula configuration files will be glad that, with Bareos, almost everything is predefined with sensible default values. In contrast to Bacula, Bareos also supports presets for string values, which means no more worrying about entering the Pid Directory and Working Directory directives in the File daemon configuration on the client. Bareos sets meaningful values for the appropriate platform when it creates the packages.
On Windows systems, you can now easily back up not just one, but all connected drives (Windows Drive Discovery). Bacula only supports this in the commercial version. The Volume Shadow Copy Service (VSS) call now discovers Windows drives automatically.
The use of tape libraries has been simplified. Tapes can now be moved from one slot to another within bconsole. Also, any existing Import/Export slots can be addressed conveniently using the import or export commands.
The tray monitor (a small icon in the system tray of the taskbar) runs on Windows and on Linux systems. The icon flashes to indicate that a backup is currently running on the system.
If a backup job fails, you can easily restart it with exactly the same parameters:
*rerun jobid=id
The backup administrator must ensure that all relevant data are retained for a specific period of time. For example, tax-related data might require a retention period of up to 10 years; you must plan carefully.
If you want to separate the data according to various properties, you can use pools in Bareos to do so. Sizes and retention times can be defined for the pools.
Complex Environments
Sometimes, calculating how big a backup will be is difficult. A first approach is to exclude certain directories and data types in the file lists that describe the backup. Alternatively, you can exclude files above a certain size. However, exclusion does not guarantee that a client will not accumulate large amounts of data that need to be backed up.
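Excluding directories and file types might be sketched in a FileSet like the one below. The FileSet name, the paths, and the wildcard patterns are hypothetical examples. The excluding Options block comes first because, as far as I am aware, the first Options block that matches a file is the one applied to it.

```
# Sketch only: the name, paths, and patterns are assumptions.
FileSet {
  Name = "HomeDirs"
  Include {
    Options {
      # Skip files by type
      WildFile = "*.iso"
      WildFile = "*.tmp"
      Exclude = yes
    }
    Options {
      # Applies to everything not excluded above
      Signature = MD5
    }
    File = /home
  }
  Exclude {
    # Skip an entire directory
    File = /home/scratch
  }
}
```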
Bareos has a client quota that lets you determine the total amount of data to back up for a client. Additionally, you can use soft quotas and grace periods to learn at an early stage when a quota is nearly exhausted.
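A client entry with quotas could look roughly like the following sketch. The directive names reflect my reading of the Bareos Director documentation and might differ between versions; the client name, password, and all limits are illustrative assumptions.

```
# Sketch only: verify directive names against your Bareos version.
Client {
  Name = client2-fd
  Address = client2
  Password = "secret"
  Soft Quota = 400G                  # a warning is issued above this amount
  Soft Quota Grace Period = 7 days   # time allowed above the soft quota
  Hard Quota = 500G                  # absolute limit for this client
}
```

In this setup, the gap between the soft and the hard quota, together with the grace period, is what gives the administrator early notice before backups for the client are refused.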
Keep in mind that large amounts of data might be transported across the network, especially during a full backup. Therefore, Bareos's ability to limit the maximum network bandwidth used per client is useful. The directive Maximum Bandwidth Per Job needs to be added to the corresponding client entry in /etc/bareos/bareos-dir.conf:
Client {
  Name = client2-fd
  Address = client2
  Password = "secret"
  Maximum Bandwidth Per Job = 512 k/s
}
A key innovation is direct support for NDMP (Network Data Management Protocol), the native backup protocol of large NAS devices such as NetApp. Bareos version 12.4 supports full backup and restore, although restoring individual files is still in the testing phase.
A new plugin for backing up Microsoft SQL Server databases has been written that supports full, incremental, and differential backups; it also is in the testing phase.
The next project in the pipeline is backing up virtual machines via the VMware vStorage API. The first steps have already been taken.