Multicloud management with Ansible


Defining VM Specifications

The EC2 module for Ansible can roll out multiple VMs at once with a single command, but they are then all the same size and based on the same template (Figure 1). To remain flexible, you will want to assign individual sizes and templates to the VMs and provide them with additional information for the application rollout that follows. In the example, I assign each machine a purpose (e.g., database or webserver). Listing 1 shows the continuation of ec2_vars.yml.

Listing 1

AWS VM Specs ec2_vars.yml

vms:
  one:
    type: t2.large
    image: ami-0b416942dd362c53f
    disksize: 40
    purpose: database
  two:
    type: t2.small
    image: ami-0b416942dd362c53f
    disksize: 20
    purpose: webserver

Figure 1: GUIs and functions of cloud providers differ greatly. AWS, for example, registers a domain name system (DNS) name for new VMs, but a customized, mnemonic name cannot be assigned.

Ansible refers to the hierarchical variable structure of vms as a dictionary, which can later be used to build a loop. The example creates two Fedora minimal VMs with different disk and VM sizes. Of course, the vms dictionary can list many more machines, and you can add more variables that will end up in the inventory later. In a loop over vms, the items end up in the following Ansible variables,

   one: -> {{ item.key }}
     type: t2.large -> {{ item.value.type }}

that tasks can access within the loop playbook.
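
As a minimal sketch of this mapping, the following standalone playbook only prints the loop variables for each entry and creates nothing. It assumes that the vms dictionary from Listing 1 lives in ec2_vars.yml; the file name show_vms.yml and the task name are hypothetical and not part of the original example.

```yaml
# show_vms.yml (hypothetical file name): print the loop variables per VM
- hosts: localhost
  gather_facts: false
  vars_files:
    - ec2_vars.yml   # assumed to contain the vms dictionary from Listing 1
  tasks:
    - name: Show key and values of each vms entry
      debug:
        # item.key is the dictionary key (one, two); item.value holds its settings
        msg: "{{ item.key }}: type={{ item.value.type }} disksize={{ item.value.disksize }} purpose={{ item.value.purpose }}"
      with_dict: "{{ vms }}"
```

Running it with ansible-playbook show_vms.yml should produce one message per dictionary entry, which is a quick way to check the variable file before any VM is created.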

Completing the Inventory

The rollout_ec2.yml playbook (Listing 2) creates the machines and adds all the necessary information to the static inventory. It runs on the local host and first creates the static inventory file with its header; then, it loops over the vms dictionary and, for each VM to be created, includes the tasks from ec2_loop.yml (Listing 3).

Listing 2

Playbook rollout_ec2.yml

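- hosts: localhost
  gather_facts: False
  vars_files:
    - ec2_vars.yml   # assumption: the variables are loaded from this file
  tasks:
    # Create an empty static inventory file
    - name: Create Hosts file
      file:
        path: "{{ ec2_hostfile }}"
        state: touch
    # Write the group header
    - name: hosts file Header
      lineinfile:
        path: "{{ ec2_hostfile }}"
        line: '[ec2]'
    # Run the loop tasks once per entry in vms
    - name: Loop VM Creation
      include_tasks: ec2_loop.yml
      with_dict: "{{ vms }}"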

Listing 3

Playbook ec2_loop.yml

- name: Launch EC2 Instances
  ec2:
    access_key: "{{ ec2_access_key }}"
    secret_key: "{{ ec2_secret_key }}"
    region: "{{ ec2_region }}"
    key_name: "{{ ec2_key_name }}"
    instance_type: "{{ item.value.type }}"
    image: "{{ item.value.image }}"
    vpc_subnet_id: "{{ ec2_vpc_subnet_id }}"
    group_id: "{{ ec2_security_group_id }}"
    count: 1
    assign_public_ip: yes
    wait: true
    volumes:
      - device_name: /dev/sda1
        volume_size: "{{ item.value.disksize }}"
        delete_on_termination: true
  register: ec2_return

The first task uses Amazon's EC2 module and creates the VM. Because you start the module once per loop iteration, you set the VM count statically to 1. The wait: true switch is, unfortunately, necessary. From time to time, the automation runs faster than AWS itself; without wait, the task then completes before Amazon has even assigned a public IP address to the machine, and the automation needs exactly this address. At the end, register saves the return values of the EC2 module in the variable ec2_return:

- name: Debug ec2_return
  debug:
    msg: "{{ ec2_return }}"
    verbosity: 2

The variable ec2_return is a JSON construct. The debug task prints it in full on the command line, which helps Ansible developers find the information they need in the JSON during an initial trial run and then use the correct variable structure for the rest of the process. You can, of course, remove the task from the finished playbook. Here, it simply stays in place with the verbosity: 2 flag, so the step is only executed if you start the playbook with -vv (i.e., debug level 2). Now complete the ec2_loop.yml playbook (Listing 3) with the following lines:

- name: Loop Output to file
  lineinfile:
    path: "{{ ec2_hostfile }}"
    line: '{{ ec2_return.instances[0].public_ip }} vmname={{ item.key }} privateip={{ ec2_return.instances[0].private_ip }} purpose={{ item.value.purpose }}'

After debugging the return array (a variable that stores multiple values with an index), you now know where information like the public and private IP addresses is located in the array; Ansible appends it to the static inventory with lineinfile. The ec2_return.instances[] variable is an array because the ec2 module can create multiple VMs in a single pass. For example, after a successful rollout of the two test VMs on AWS, the inventory created in this way will have the following information:

[ec2]
<public-ip> vmname=one privateip=<private-ip> purpose=database
<public-ip> vmname=two privateip=<private-ip> purpose=webserver

What ends up in the inventory is up to the user. You can, of course, add more information from the loop or the variable declaration to the inventory for the later application rollout to use. In the case of AWS, for example, it makes sense to write the ID of the machine from the ec2_return.instances[] array to the inventory, too. A rollout playbook doesn't need this parameter, but a playbook that deletes the VMs in the AWS cloud after using the service needs to know these IDs.
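
The following teardown playbook is a sketch of how such a deletion could look, not a tested part of the original example. It assumes that the lineinfile line in ec2_loop.yml has been extended by instanceid={{ ec2_return.instances[0].id }}, so that every host in the [ec2] group carries an instanceid variable. The variable name instanceid and the file name teardown_ec2.yml are hypothetical.

```yaml
# teardown_ec2.yml (hypothetical): terminate all VMs listed in the [ec2] group.
# Assumes each inventory line carries instanceid=<ID of the instance>.
- hosts: localhost
  gather_facts: false
  vars_files:
    - ec2_vars.yml   # assumed to provide the ec2_* credentials and region
  tasks:
    - name: Terminate the EC2 instances recorded in the inventory
      ec2:
        access_key: "{{ ec2_access_key }}"
        secret_key: "{{ ec2_secret_key }}"
        region: "{{ ec2_region }}"
        state: absent
        # Collect the instanceid host variable of every host in the ec2 group
        instance_ids: "{{ groups['ec2'] | map('extract', hostvars, 'instanceid') | list }}"
        wait: true
```

The playbook has to be started with the generated inventory (the file named by ec2_hostfile) passed through -i, because the ec2 group and its host variables only exist once that inventory is loaded.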

And with Google

Google Cloud Platform (GCP) organizes resources differently from AWS. It sorts VMs into projects and regulates networks and firewalls somewhat differently (Figure 2). The associated Ansible modules create the VMs in several steps. However, the end result is an inventory identical in structure to the one produced by the AWS rollout. The gcp_vars.yml variable declaration is shown in Listing 4.

Listing 4

GCP VM Spec gcp_vars.yml

gcp_credentials_file: credential-file.json
gcp_project_id: myproject-123456
gcp_zone: us-central1-a
gcp_hostfile: gcp_hosts
vms:
  one:
    type: e2-standard-4
    image: projects/centos-cloud/global/images/family/centos-8
    disksize: 40
    purpose: database
  two:
    type: e2-standard-2
    image: projects/centos-cloud/global/images/family/centos-8
    disksize: 20
    purpose: webserver

Figure 2: Google not only sorts VMs by region but also manages machines in projects. The admin gives each machine a name when it is created, but GCP does not automatically register a DNS resolution.

Parameters such as networks and firewall rules are optional; if you omit them, GCP applies the default values of the project. The vms dictionary looks almost identical to the AWS declaration; only the machine type names and the image paths use different values. The GCP rollout playbook, rollout_gcp.yml (Listing 5), contains basically the same statements as the AWS rollout, but the loop playbook for GCP differs significantly from its AWS counterpart, as Listing 6 shows.

Listing 5

Playbook rollout_gcp.yml

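- hosts: localhost
  gather_facts: False
  vars_files:
    - gcp_vars.yml   # assumption: the variables are loaded from this file
  tasks:
    # Create an empty static inventory file
    - name: Create Hosts file
      file:
        path: "{{ gcp_hostfile }}"
        state: touch
    # Write the group header
    - name: hosts file Header
      lineinfile:
        path: "{{ gcp_hostfile }}"
        line: '[gcp]'
    # Run the loop tasks once per entry in vms
    - name: Create GCE VMs
      include_tasks: gcp_loop.yml
      with_dict: "{{ vms }}"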

Listing 6

Loop Playbook for GCP

# Create the boot disk from the image with the requested size
- name: create a disk
  gcp_compute_disk:
    name: "{{ item.key }}-disk"
    size_gb: "{{ item.value.disksize }}"
    source_image: "{{ item.value.image }}"
    zone: "{{ gcp_zone }}"
    project: "{{ gcp_project_id }}"
    auth_kind: serviceaccount
    service_account_file: "{{ gcp_credentials_file }}"
    state: present
  register: disk
# Build the VM on top of the disk registered above
- name: create an instance
  gcp_compute_instance:
    name: "{{ item.key }}"
    machine_type: "{{ item.value.type }}"
    disks:
      - auto_delete: 'true'
        boot: 'true'
        source: "{{ disk }}"
    network_interfaces:
      - access_configs:
          - name: External NAT
            type: ONE_TO_ONE_NAT
    zone: "{{ gcp_zone }}"
    project: "{{ gcp_project_id }}"
    auth_kind: serviceaccount
    service_account_file: "{{ gcp_credentials_file }}"
    state: present
  register: gcp_return

If you want to determine the disk size of the VM yourself, you first need to create and adapt a disk from the template in a separate gcp_compute_disk task. Without this separate task, the following module would create a disk of the size stated in the OS template. In the second task, GCP then builds the VM with the previously created disk. In GCP, the administrator can specify the name of the machine. Again, this does not matter for the upcoming application rollout, but a later playbook uses the VM name to identify the VMs to be deleted. The loop playbook continues with a debug task:

- name: debug output
  debug:
    msg: "{{ gcp_return }}"
    verbosity: 2

As previously with AWS, you need to analyze the JSON structure of the return value in a first test run to extract the correct parameters and then add the following task to the loop playbook (Listing 6):

- name: Add Instance Data to Host File
  lineinfile:
    path: "{{ gcp_hostfile }}"
    line: '{{ gcp_return.networkInterfaces[0].accessConfigs[0].natIP }} vmname={{ item.key }} privateip={{ gcp_return.networkInterfaces[0].networkIP }} purpose={{ item.value.purpose }}'

The GCP return variable also contains arrays, because VMs with multiple network adapters can be built in an automated process. At the end of the GCP rollout, the cloud-neutral inventory file gcp_hosts,

[gcp]
<public-ip> vmname=one privateip=<private-ip> purpose=database
<public-ip> vmname=two privateip=<private-ip> purpose=webserver

is available in the same form as after the AWS rollout.
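
Because both rollouts write the same host variables, a later playbook does not have to care which cloud a machine runs in. The following sketch illustrates the idea under stated assumptions: the file name app_rollout.yml is hypothetical, and the debug tasks only stand in for real installation steps. The variables vmname, privateip, and purpose come straight from the inventory lines shown above.

```yaml
# app_rollout.yml (hypothetical): works with either generated inventory, e.g.
#   ansible-playbook -i gcp_hosts app_rollout.yml
- hosts: all
  gather_facts: false
  tasks:
    - name: Show the role each VM will get
      debug:
        # inventory_hostname is the public IP written at the start of the line
        msg: "{{ vmname }} ({{ inventory_hostname }}, internal {{ privateip }}) is a {{ purpose }} host"

    - name: Placeholder for steps that apply to database machines only
      debug:
        msg: "database setup would start here"
      when: purpose == 'database'
```

Since the example uses only debug tasks and skips fact gathering, it can serve as a dry run to confirm that the inventory variables are readable before real tasks replace the placeholders.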

