Flexible backup for large-scale environments

Bats in the Data Center

Restore

The very purpose of backups is to restore data – be it individual files that have been accidentally deleted or entire filesystems that have been lost because of a system failure or a ransomware attack.

To initiate a restore in the Bacula system, you must have access to the Director. Either an administrator starts the restore after a user has posted a corresponding request to the helpdesk, or the client offers the option of accessing the Director with the Bacula Console bconsole utility and running the required commands there. Access control lists (ACLs) can be used to define which commands are accessible from the Console.

In this context, it is advisable to allow clients to initiate restores only, not backups. This restriction prevents, for example, a malware-infected system from starting uncontrolled backups that overwrite the backups stored on the server.

Reporting

One of Bacula's great strengths is its reporting. In addition to predefined reports, further options are available thanks to documented database structures. You can also use bdirjson for read access to the active configuration or run scripts from bconsole. The SQL query from Listing 3, embedded in your choice of scripting language, provides an overview of the clients that use the most space in total and, optionally, match a name pattern.

Listing 3

SQL Query

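-- $pattern and $number are filled in by the calling script
SELECT
  Client.Name,
  sum(Job.JobFiles) AS Files,
  sum(Job.JobBytes) AS Bytes
FROM
  Job, Client, Pool
WHERE
      Client.Name ~* '$pattern'         -- case-insensitive regex match (PostgreSQL)
  AND Client.ClientId = Job.ClientId
  AND Job.PoolId = Pool.PoolId
  AND Job.Type = 'B'                    -- backup jobs only
  AND Job.JobStatus IN ('T', 'W')       -- completed, with or without warnings
GROUP BY Client.Name
ORDER BY Bytes DESC
LIMIT $number;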
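
If the calling script pastes $pattern and $number straight into the query string, a stray quote in the pattern is enough to break the statement. The following Python sketch shows one way to avoid that risk by passing both values as bound parameters. It assumes a PostgreSQL driver that uses the %s placeholder style (psycopg, for example); the function name and its defaults are hypothetical and do not appear in Bacula.

```python
# Listing 3 with driver placeholders (%s) instead of $pattern and $number.
TOP_CLIENTS_SQL = """
SELECT
  Client.Name,
  sum(Job.JobFiles) AS Files,
  sum(Job.JobBytes) AS Bytes
FROM
  Job, Client, Pool
WHERE
      Client.Name ~* %s
  AND Client.ClientId = Job.ClientId
  AND Job.PoolId = Pool.PoolId
  AND Job.Type = 'B'
  AND Job.JobStatus IN ('T', 'W')
GROUP BY Client.Name
ORDER BY Bytes DESC
LIMIT %s
"""


def top_clients_query(pattern=".*", number=10):
    """Return the query text and its parameter tuple.

    pattern: regular expression for the client name (".*" matches all).
    number:  maximum number of rows to return.
    """
    number = int(number)
    if number < 1:
        raise ValueError("number must be at least 1")
    return TOP_CLIENTS_SQL, (pattern, number)
```

With an open database cursor, the call then looks like cursor.execute(*top_clients_query("^db-", 5)), and the driver takes care of quoting.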

A self-developed tool uses queries to detect anomalies. One of the most annoying problems occurs when external devices or filesystems are to be backed up on the client but are not mounted at the time of the backup. Bacula interprets this as a deletion of the data and marks the data as deleted in the database at this point in time. If the data can be accessed again during the next backup on the client, they are all backed up again. This operation can take a long time and generates a noticeable load on the client.

You can counteract such problems with appropriate plausibility checks. For example, if more than 5,000 files have been deleted and less than 500KB of data has been backed up, something is probably wrong. If the problem is detected early, it is easy to fix: You delete the backup job that marked the files as deleted, which restores the previous state.
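
A plausibility check of this kind can be sketched in a few lines of Python. The two thresholds are the ones named above; everything else is an assumption made for illustration: The JobSummary class and its field names are hypothetical, the per-job counts are expected to come from your own database queries, and 500KB is read as 500 x 1,024 bytes.

```python
from dataclasses import dataclass

DELETED_FILES_LIMIT = 5_000      # more deleted files than this is suspicious ...
MIN_BACKUP_BYTES = 500 * 1024    # ... if less than this was backed up (assumed KB = 1,024 bytes)


@dataclass
class JobSummary:
    """Hypothetical per-job figures gathered from the catalog."""
    job_id: int
    deleted_files: int   # files the job marked as deleted
    job_bytes: int       # bytes the job actually backed up


def is_suspicious(job):
    """True if many files vanished while almost nothing was backed up."""
    return (job.deleted_files > DELETED_FILES_LIMIT
            and job.job_bytes < MIN_BACKUP_BYTES)


def suspicious_jobs(jobs):
    """Return the IDs of all jobs that fail the plausibility check."""
    return [job.job_id for job in jobs if is_suspicious(job)]
```

Both conditions must hold at the same time: A large cleanup on the client that is followed by a normal-sized backup does not trigger the check.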

This check protects the clients; others prevent unnecessary resource consumption and potential problems at restore time. Occasionally, the files belonging to a database or the file a virtual machine (VM) uses as its virtual hard drive are backed up inadvertently. Depending on the load on the database management system (DBMS) or VM, such files can change considerably while the backup job runs, which leaves the backed-up copy inconsistent. It is also important for system operations to detect at an early stage whether systems exceed certain thresholds. Listing 4 shows an example of the output from this kind of check.

Listing 4

Early Detection

Job   Level Jobs    Avg.Time   Avg.Files  Avg.Bytes
===================================================
job-1     I   32    02:40:51     343,351   351.6 GB
job-1     F    1  4-18:50:41  30,734,592    34.2 TB
job-1    VF    1  1-05:54:54  28,946,254    35.2 TB
(C001) latest/average full-backup size is above acceptable limit of 25TB
(C101) latest/average VFull-backup size is above acceptable limit of 25TB
(W102) last VFull runtime is longer than acceptable limit of 22h
(C301) average incremental-backup size is above acceptable limit of 200GB
(W302) at least one incremental-backup size is above threshold of 200GB
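
The logic behind such a report can be approximated with the following Python sketch. It is not the tool that produced Listing 4, only a simplified illustration: It takes the limits from the listing (25TB for full and virtual full backups, 200GB for incrementals, 22 hours for the virtual full runtime), looks at averages only, and does not reproduce the message codes. The class and function names are hypothetical, and decimal units (1TB = 10^12 bytes) are an assumption.

```python
from dataclasses import dataclass

GB = 10**9    # assumption: decimal units
TB = 10**12


def parse_duration(text):
    """Convert '[D-]HH:MM:SS', as in the Avg.Time column, into seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


@dataclass
class LevelStats:
    level: str         # "F" (full), "VF" (virtual full), or "I" (incremental)
    avg_seconds: int   # average runtime in seconds
    avg_bytes: float   # average backup size in bytes


def check_limits(stats, full_limit=25 * TB, incr_limit=200 * GB,
                 vfull_runtime_limit=22 * 3600):
    """Return one finding per limit that the averages exceed."""
    findings = []
    for s in stats:
        if s.level in ("F", "VF") and s.avg_bytes > full_limit:
            findings.append(f"{s.level}: average size above limit")
        if s.level == "VF" and s.avg_seconds > vfull_runtime_limit:
            findings.append("VF: average runtime above limit")
        if s.level == "I" and s.avg_bytes > incr_limit:
            findings.append("I: average size above limit")
    return findings
```

Fed with the three rows from Listing 4, the function reports the incremental size, both full backup sizes, and the virtual full runtime.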

As already mentioned, further information can be obtained with the Bacula text-based console (bconsole). The little-known .status storage running command reveals the data throughput of individual jobs, which is particularly useful because throughput can otherwise only be measured in aggregate at the system level. The data can be combined and visualized with the other mechanisms (Figures 2 and 3).
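
To collect this output regularly, a script can feed the command to bconsole on standard input. The Python sketch below assumes that bconsole is in the search path and can reach the Director with its default configuration; the function name is hypothetical. It returns the raw text only, because the output format is not described here and parsing it is left to the caller.

```python
import subprocess


def run_console_command(command, argv=("bconsole",), timeout=60):
    """Send one command to the console on stdin and return its stdout.

    argv is the program to start; the default assumes bconsole is in
    the search path and finds its configuration file on its own.
    """
    result = subprocess.run(
        list(argv),
        input=command + "\nquit\n",   # end the console session after the command
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,                   # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

A call such as run_console_command(".status storage running") then delivers the text for further processing, for example, once a minute from a monitoring job.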

Figure 2: The number of backup processes and their status over time.
Figure 3: A visualization of read and write throughput.

Conclusions

Bacula proves to be a powerful backup solution that can efficiently back up large environments. Because it is open source, the formats and interfaces are open. Among other things, this means that, unlike with proprietary backup software, the backup tapes can be read at any time, even if the license agreement is terminated at some point. Bacula also offers many options for automating processes and retrieving all kinds of useful information. When deciding on a new backup solution, you will definitely want to put Bacula on your shortlist.
