Automation with Ansible
IT as Planned
The objective of DevOps, the combination of software development and IT operations, is to allow close cooperation between software developers and operations staff so they can learn from one another. One central aspect is automatic provisioning of the infrastructure and general software configuration distribution. Well-known products that achieve this goal and have been around for some time include Chef and Puppet. Ansible is a relatively new contender in this field; it offers similar functions but tends to target software developers less and experienced administrators more.
The imperative approach that Ansible uses to formulate the task to be automated is, to my mind, more intuitive and involves fewer surprises in complex setups than the declarative approach of Puppet, for example. In a nutshell, Ansible focuses on "how," whereas Puppet concentrates on "what": It describes the desired target state of the system and relies on agents to create this state, with the current system state acting as a starting point. If undesirable effects or problems occur, however, this greater degree of abstraction can make debugging complex and difficult to understand.
The minimum requirement for Ansible is simply an installed operating system (Ubuntu Linux in my case) that provides SSH access for an administrative user. Because an operating system and SSH access are required, Ansible is explicitly not suitable as a bare metal provisioning tool. In my case, all the newly created machines boot from a lean base image that meets these requirements. As soon as logging in via SSH is possible, Ansible can take the helm.
At run time, all the required components are transferred to the target system over SSH, executed there, and removed on completion. Don't worry: This process is not as slow as it might initially seem. Dedicated agents, by contrast, increase the complexity of the overall system and need to be maintained and updated after installation; this overhead does not exist with Ansible. Updates for Ansible are limited to the local installation.
Flexible Inventorying
To be able to provision machines, Ansible needs to know how to reach them. This information is stored in inventories. An inventory is a text file that uses the INI format and collects DNS hostnames, IP addresses, and optionally variable values, as shown in Listing 1.
Listing 1
Sample Inventory
[controllers]
control01.baremetal
control02.baremetal

[nodes]
node01.baremetal machine_type=Dell_R510
node02.baremetal machine_type=Dell_R720

[baremetals:children]
controllers
nodes
In this example, two host groups named controllers and nodes are defined, each with two members. All of the elements in these two groups are in turn collected in the other group, baremetals, thanks to the children keyword. A group of machines like this offers a simple and extremely flexible way of organizing the infrastructure.
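To make the effect of the children keyword concrete, the following Python sketch models the sample inventory as a dictionary and resolves a group to its member hosts. The data layout and the function name expand_group are assumptions chosen for illustration; this is not how Ansible stores or parses inventories internally.

```python
# Simplified model of the sample inventory: each group lists its direct
# hosts and, optionally, its child groups (the ":children" section).
INVENTORY = {
    "controllers": {"hosts": ["control01.baremetal", "control02.baremetal"]},
    "nodes": {"hosts": ["node01.baremetal", "node02.baremetal"]},
    "baremetals": {"children": ["controllers", "nodes"]},
}


def expand_group(inventory, group):
    """Return all hosts of a group, including those of its child groups."""
    entry = inventory[group]
    hosts = list(entry.get("hosts", []))
    for child in entry.get("children", []):
        for host in expand_group(inventory, child):
            if host not in hosts:  # avoid duplicates across child groups
                hosts.append(host)
    return hosts


if __name__ == "__main__":
    print(expand_group(INVENTORY, "baremetals"))
```

Resolving baremetals in this model yields the two controllers followed by the two nodes, which is the set of machines a play against that group would address.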
However, as the infrastructure becomes increasingly dynamic and machines – virtual or physical – are added or removed, this approach reaches its limits. At some point, it becomes impossible to manage all the machines manually in the static text file. For situations like this, Ansible offers the option of running programs to generate inventories, both to supplement static entries and as the sole data source. Prebuilt modules exist for a variety of external products that can contain reusable information about the system landscape (e.g., Amazon EC2 or Zabbix). It is easy to add more integration features; after all, this is a simple question of generating the JSON data structure that Ansible expects.
For example, I wrote my own Python script to query information relating to virtual machines in OpenStack Nova from the machines' metafiles and output this information in a format suitable for Ansible. This capability means that you can create or delete machines as needed without having to change your Ansible inventories. Details of the available default modules and a link to the developer documentation are provided on the Ansible website [1].
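The following is a minimal sketch of what such an inventory script could look like; it is not the script mentioned above. It assumes the JSON layout that the Ansible developer documentation describes for inventory scripts (groups with a hosts list, plus a _meta section with hostvars) and the --list and --host arguments. The MACHINES list is a hypothetical stand-in for data that a real script would query from an external source such as OpenStack Nova.

```python
#!/usr/bin/env python
import json
import sys

# Hypothetical stand-in for machine data; a real script would query an
# external system here instead of using a hard-coded list.
MACHINES = [
    {"name": "control01.baremetal", "group": "controllers", "vars": {}},
    {"name": "node01.baremetal", "group": "nodes",
     "vars": {"machine_type": "Dell_R510"}},
]


def build_inventory(machines):
    """Build the group/hostvars structure from a list of machine records."""
    inventory = {"_meta": {"hostvars": {}}}
    for machine in machines:
        group = inventory.setdefault(machine["group"], {"hosts": []})
        group["hosts"].append(machine["name"])
        inventory["_meta"]["hostvars"][machine["name"]] = machine["vars"]
    return inventory


def main(args):
    """Handle '--list' and '--host <hostname>'; return 0 on success."""
    inventory = build_inventory(MACHINES)
    if args == ["--list"]:
        print(json.dumps(inventory, indent=2))
    elif len(args) == 2 and args[0] == "--host":
        # Variables of a single host; empty dict if the host is unknown.
        print(json.dumps(inventory["_meta"]["hostvars"].get(args[1], {})))
    else:
        sys.stderr.write("usage: --list | --host <hostname>\n")
        return 1
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

Made executable, a script like this would be passed to the -i option in place of a static inventory file.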
Controlling Ansible
The following examples assume that the computers you use are addressable by their DNS names and that a local user by the name of local-admin exists and can connect using SSH and a key file without a password. Being able to log in without a password is not a prerequisite for the use of Ansible, but it does make your daily work much easier.
After inventorying and grouping the target hardware, you can now move on to the actions to be performed against it. Ansible formulates these actions in Playbooks – text files in an easily understandable, structured YAML format.
My first example will automate the installation of an enterprise-wide root CA certificate to be able to validate all kinds of TLS certificates, software packages, and so on against it. In other words, this first playbook needs to ensure that the required files are transferred to each host and installed there in a suitable way. The following listing shows the playbook site.yml, which is normally the master playbook and uses include to integrate other playbooks. To keep things simple, I am not using this style in the example:
---
- hosts: baremetals
  roles:
    - base
The hosts line points to the host group defined in the inventory. In practical terms, this means that all controllers and nodes execute the tasks that follow. The interesting thing here is that, thus far, neither certificates nor any commands have been mentioned. The reason is that normally it makes sense to organize tasks in smaller, reusable units, which Ansible refers to as roles.
In my example, this means that all members of the baremetals group are currently assigned a single role, triggered by the roles keyword. This role goes by the name of base; it creates the basic preconditions for many other steps later on and looks like Listing 2.
Listing 2
Base Role
---
- name: Install CenterDevice Root CA Certificates
  sudo: true
  copy: src=usr/local/share/ca-certificates/{{ item }} dest=/usr/local/share/ca-certificates/{{ item }}
  with_items:
    - centerdevice-intermediate-ca.crt
    - centerdevice-root-ca.crt

- name: Update root certificate database
  sudo: true
  command: update-ca-certificates
After a brief learning curve, the YAML structure is easy to read. Two tasks are performed during provisioning. Each has a name that designates it in the logs and can be used to point explicitly to the task. Names are not mandatory, but intuitive names do make playbooks easier to understand and maintain.
The first task installs two CA certificate files in the target directory, /usr/local/share/ca-certificates/. Because the target path is not writable for standard users, sudo: true ensures that the remote command is run with root privileges. The copy: line transfers the local file to the remote computer. To make the whole thing more interesting, I am using a variable item here and processing a set of files in a loop. The variable is populated with each value from the with_items: list in the next line. This is all I need to do: Ansible checks whether the target path exists and copies the two files. If one of the files already exists (and has the same content), the copy command is ignored. Note that variables can also have more complex structures and are not restricted to strings only.
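The "copy only if the content differs" behavior can be illustrated with a short Python sketch. It compares content digests of the source and destination and reports whether anything was written. The function names and the choice of SHA-256 are assumptions for illustration; this mimics the observable behavior described above on a local filesystem and is not the code of Ansible's copy module.

```python
import hashlib
import os
import shutil


def file_digest(path):
    """Return the SHA-256 hex digest of a file's content."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_if_changed(src, dest):
    """Copy src to dest unless dest already has identical content.

    Returns True if the file was written ("changed") and False if the
    existing file was left untouched ("ok").
    """
    if os.path.exists(dest) and file_digest(src) == file_digest(dest):
        return False
    shutil.copyfile(src, dest)
    return True
```

Calling copy_if_changed twice in a row with the same arguments returns True the first time and False the second, which corresponds to the changed and ok states in the Ansible output.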
Most Ansible tasks come with a number of additional parameters that let you modify their behavior. For example, you can define the owner and access privileges for the file while copying. After the copying action, the new certificates are processed as the next task. Ubuntu needs to call update-ca-certificates on the remote computer to do this. Again sudo: true ensures that the required privileges are in place.
Using a Sample Playbook
With the inventory, the first playbook, and the role, you now have all the preconditions for watching Ansible work. Ansible consistently uses convention over configuration, so you need to store the previously mentioned files in a directory tree on the local workstation as shown in Listing 3.
Listing 3
Directory Tree
ansible-demo-scripts
|- inventories
|  \- hosts.baremetal
|- roles
|  \- base
|     |- files
|     |  \- usr
|     |     \- local
|     |        \- share
|     |           \- ca-certificates
|     |              |- centerdevice-intermediate-ca.crt
|     |              \- centerdevice-root-ca.crt
|     \- tasks
|        \- main.yml
\- site.yml
You will find the base role name in the directory tree below the roles folder, and the source files to be copied for this role below the files folder. The actual tasks for the role are stored in main.yml below tasks. For more details of the directory tree conventions and recommendations relating to them, check out the excellent Ansible documentation [2].
Listing 4 shows how I execute ansible-playbook in the ansible-demo-scripts directory. The --ask-sudo-pass parameter is only required if the local-admin user is not authorized to run sudo without a password. Additionally, if ~/.ssh/config already contains the name of the remote user, the -u parameter can be omitted. The -i option (or the longer form --inventory-file) defines the names of the desired inventories; I have just one inventory here. The final argument designates the playbook (site.yml).
Listing 4
Executing ansible-playbook
ansible-demo-scripts$ ansible-playbook -u local-admin --ask-sudo-pass -i inventories/hosts.baremetal site.yml
sudo password: ******

PLAY [baremetals] ***

GATHERING FACTS ***
ok: [control02.baremetal]
ok: [node02.baremetal]
ok: [control01.baremetal]
ok: [node01.baremetal]

TASK: [base | Install CenterDevice Root CA Certificates] ***
changed: [control01.baremetal] => (item=centerdevice-intermediate-ca.crt)
changed: [control02.baremetal] => (item=centerdevice-intermediate-ca.crt)
changed: [node01.baremetal] => (item=centerdevice-intermediate-ca.crt)
changed: [node02.baremetal] => (item=centerdevice-intermediate-ca.crt)
changed: [node02.baremetal] => (item=centerdevice-root-ca.crt)
changed: [control01.baremetal] => (item=centerdevice-root-ca.crt)
changed: [node01.baremetal] => (item=centerdevice-root-ca.crt)
changed: [control02.baremetal] => (item=centerdevice-root-ca.crt)

TASK: [base | Update root certificate database] ***
changed: [control02.baremetal]
changed: [node01.baremetal]
changed: [control01.baremetal]
changed: [node02.baremetal]

PLAY RECAP ***
control01.baremetal : ok=3 changed=2 unreachable=0 failed=0
control02.baremetal : ok=3 changed=2 unreachable=0 failed=0
node01.baremetal    : ok=3 changed=2 unreachable=0 failed=0
node02.baremetal    : ok=3 changed=2 unreachable=0 failed=0
The output PLAY [baremetals] shows the host group that was addressed at the outset as defined in site.yml. In more complex playbooks, you can use different groups, of course. This is followed by the GATHERING FACTS section. At the start of each run, Ansible collects a set of data for all computers with which it connects. This includes the host name, IP addresses, names of network interfaces, time zone, hardware information, and many other facts. The documentation contains a comprehensive list of the available facts [3]. Beyond this, you can add more facts through your own extensions or in playbooks. The information stored in this step is available downstream in the playbook execution flow for more complex behaviors.
The nonlinear sort order of the ok: ... lines during fact collection is due to Ansible opening connections to multiple hosts at the same time (five by default) to speed up execution. Depending on response times, the order can change from run to run. However, you can rely on all hosts completing a task before Ansible moves on to the next task in the playbook.
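The scheduling behavior described here (a limited number of hosts in parallel, results in completion order, and every host finishing a task before the next task starts) can be sketched in Python with a thread pool. The names HOSTS, FORKS, run_task_on_all, and run_play are assumptions for illustration, and the tasks are plain Python callables; Ansible's own implementation differs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

HOSTS = [
    "control01.baremetal",
    "control02.baremetal",
    "node01.baremetal",
    "node02.baremetal",
]
FORKS = 5  # number of hosts handled at the same time


def run_task_on_all(task, hosts, forks=FORKS):
    """Run task(host) on every host, at most `forks` at a time.

    Results are collected in completion order, so the order can vary
    from run to run. The function returns only after all hosts finished.
    """
    results = []
    with ThreadPoolExecutor(max_workers=forks) as pool:
        futures = {pool.submit(task, host): host for host in hosts}
        for future in as_completed(futures):
            results.append((futures[future], future.result()))
    return results


def run_play(tasks, hosts, forks=FORKS):
    """Run tasks one after another; each task acts as a barrier."""
    log = []
    for name, task in tasks:
        for host, status in run_task_on_all(task, hosts, forks):
            log.append((name, host, status))
    return log


if __name__ == "__main__":
    demo_tasks = [
        ("gather facts", lambda host: "ok"),
        ("copy certificates", lambda host: "changed"),
    ]
    for name, host, status in run_play(demo_tasks, HOSTS):
        print("%s: [%s] (%s)" % (status, host, name))
```

Within one task the host order is not deterministic, but no line for the second task can appear before the last line of the first.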
After acquiring the facts, the tasks that are prescribed by the base role are executed. Again, the order of feedback depends on the performance or the network connection to the remote computer. Because two files are being copied to four computers, there are a total of eight lines in the logfile. The changed: prefix means that all the computers have received the specified file and that it did not previously exist – or contained something different.
After this, Ansible proceeds to the next task and runs the command that updates the certificate database on each host. Because Ansible is unaware of the effect of running this external command on the remote system, it tags the task as changed:. The criteria for detecting a change – whether through the command's exit code or its output – can be modified as needed.
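In playbooks, Ansible's changed_when keyword serves this purpose. The underlying idea can be sketched in Python: Run a command, treat it as changed by default, and let an optional predicate decide based on the exit code or output. The function name run_command, the result layout, and the sample output string "0 added, 0 removed" are assumptions for illustration; the demo invokes the Python interpreter itself rather than update-ca-certificates so that it runs anywhere.

```python
import subprocess
import sys


def run_command(argv, changed_when=None):
    """Run a command and report whether it counts as a change.

    Without a predicate, every run is reported as changed, because the
    effect of an arbitrary external command is unknown. A predicate
    receives the result dict and returns True or False.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    result = {"rc": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}
    if changed_when is None:
        result["changed"] = True
    else:
        result["changed"] = bool(changed_when(result))
    return result


if __name__ == "__main__":
    # Stand-in command that prints a hypothetical "nothing to do" message.
    command = [sys.executable, "-c", "print('0 added, 0 removed; done.')"]
    default = run_command(command)
    custom = run_command(
        command,
        changed_when=lambda r: "0 added, 0 removed" not in r["stdout"],
    )
    print(default["changed"], custom["changed"])
```

With the predicate in place, a run that reports no added or removed certificates would be counted as ok rather than changed.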
At the end of the run, the PLAY RECAP section contains an overview of the program execution. In this case, the counters are identical for all the hosts because none of the tasks failed, all of the computers were reached, and they all performed the same task.