Many Clouds, One API

With the recent rise in cloud computing, most cloud providers have offered their own APIs, which means cloud users sign up for the services of individual providers at the expense of being able to migrate easily to other providers at a later stage. Apache Deltacloud addresses this issue by offering a standardized API definition for infrastructure as a service (IaaS) clouds with drivers for a range of different clouds. The Deltacloud API is designed as a RESTful web service and comes with client libraries for all major programming languages. Additional drivers for accessing further public or private clouds can be created with minimal effort.

Despite plenty of discussion to the contrary, cloud computing still has very few standards. Instead, the emphasis is on development. Many different public clouds have been created over the last few years, including Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and JiffyBox. In the US, they are joined by providers such as GoGrid, Rackspace, and Terremark (Figure 1). Clouds are still a relatively new method for offering applications and services.

Figure 1. Apache Deltacloud supports private clouds and a range of public clouds, including Amazon EC2 (Eucalyptus is an open source solution compatible with the Amazon cloud), Microsoft Windows Azure, and VMware vCloud. (Source: Red Hat)

Providers, as well as suppliers of proprietary technologies, like to follow their own ideas on how a cloud and the applications within it should be accessed, operated, and managed. Business customers, on the other hand, are often reluctant to be tied to a single cloud provider. Interoperability, or the ability to migrate smoothly from one provider to another, is therefore becoming more and more important.

Although an impressive number of public clouds are on offer, many businesses choose to provide applications and computing services in their own private clouds using their internal LAN/WANs. Once the internal clouds reach their limits, it would be highly advantageous if public clouds could then be accessed for additional resources. Linking these two spheres, however, has long been fraught with difficulties.

A consistent, standardized cloud computing API model would solve the problem of software developers having to do further programming whenever a new cloud with a proprietary API is introduced. Even with Amazon’s cloud projects such as EC2 and S3 providing fully functional APIs for their own activities, they are in no way a workable solution for other public clouds. In fall 2009, Red Hat decided to address this situation by creating Deltacloud, an open source project that defines a standardized interface and includes adapters for all the major public and private clouds. Support for new clouds does not need to be handled by the applications. Deltacloud takes care of it directly.

Although Deltacloud was originally a Red Hat project, in spring 2010 the interface and all related code were transferred to the Apache Software Foundation’s Incubator (http://incubator.apache.org/deltacloud/) for further development. This new basis ensures ongoing vendor independence. Like other Apache projects, Deltacloud is being developed jointly by many participants from different businesses and organizations that are committed to the principles of open software licensing and user-driven innovation.

REST-Based Interface

The Deltacloud API is implemented via HTTP as a service-based REST (representational state transfer) interface (Figure 2). Clients send all data through this REST interface to a Deltacloud server, which relays the requests to the cloud. To simplify operation of the REST interface, the Deltacloud project provides a CLI (command-line interface) tool, as well as client libraries in Ruby, Java, C, and Python.

Figure 2. Apache Deltacloud provides a REST-based API for communication between clients and the Deltacloud server (Deltacloud Core). (Source: Red Hat)

Deltacloud is not the only open source project to develop cloud abstraction APIs; other solutions include jclouds, libcloud, boto, and fog. However, all of these are libraries tied to specific programming languages: jclouds to Java, libcloud and boto to Python, and fog to Ruby. Deltacloud, on the other hand, is entirely independent of languages and is the only cloud abstraction API that can also be used as a web service. The advantage of this approach is that, through the use of widely accepted existing standards such as HTTP and XML, an open architecture is created that is independent of platforms and programming languages.

The conception of the Deltacloud API as a web service instead of a library makes it possible to operate the Deltacloud server in one of two base configurations – close to the user, such as on a local computer/LAN, or close to the cloud provider’s native API, which is particularly interesting for private clouds. Of course, providers can also use Deltacloud directly as their sole interface.

The Deltacloud server was written in Ruby or, more specifically, using the Ruby framework Sinatra. Internally, the server code comprises two main sections. The first, generic section is dedicated to typical web service tasks such as receiving and deserializing HTTP requests and formatting responses. The second section provides drivers for individual clouds such as Amazon EC2, vCloud, Azure, and so on. The two sections are linked through a simple internal interface. This makes it possible to create drivers for new cloud APIs without having to engage much with the server code. Experience has shown that a new driver can usually be implemented within a few days.

In simplified terms, a process flow typically begins with a client sending a request to the server via the REST interface. The Deltacloud server’s driver then relays the request to the dedicated cloud. The server itself remains stateless; it does not store any status or session details. Instead, the client sends the access data required for the selected cloud via the header for HTTP Basic authentication as part of every request. Whereas the cloud to be accessed and its API URL previously had to be specified during the Deltacloud server’s startup, the next version of Deltacloud will make it possible to select both via additional HTTP request headers. In this way, a single Deltacloud server can be used to address any number of clouds.
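
The stateless flow described above can be sketched with Ruby's standard Net::HTTP library. This is only an illustration, not the project's own client code: the helper names build_request and fetch, the port 3001, and the use of the mock driver credentials are assumptions made for the example.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a GET request for a Deltacloud collection.
# Because the server keeps no session, the credentials for the back-end
# cloud travel with every request in the HTTP Basic Authorization header.
def build_request(api_url, collection, user, password)
  uri = URI.parse("#{api_url}/#{collection}")
  request = Net::HTTP::Get.new(uri)
  request.basic_auth(user, password)
  [uri, request]
end

# Send the request. Each call is independent of any earlier call.
def fetch(api_url, collection, user, password)
  uri, request = build_request(api_url, collection, user, password)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

# Hypothetical usage against a local server running the mock driver:
#   response = fetch('http://localhost:3001/api', 'instances',
#                    'mockuser', 'mockpassword')
#   puts response.code
```

Because nothing is cached on the server side, two consecutive calls to fetch could carry different credentials without any logout or session handling in between.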

API Principles

The danger of defining abstraction APIs is that the abstract API only provides the relatively small number of functionalities shared by all the APIs it is based on. Deltacloud circumvents this issue by defining a basic interface that is supported by all the drivers and that also permits driver-specific extensions. The compatibility of these extensions with a specific driver can be detected via a range of simple mechanisms. These detection mechanisms are designed so that the client does not need to be aware of the cloud that it is connected to via Deltacloud; all questions pertaining to cloud-specific differences are resolved via detailed information in the responses from the Deltacloud server.

The API uses exactly one entry point, which is considered best practice in REST environments. The XML document accessed via the entry point’s URL contains detailed information about all the resources available through the Deltacloud server, including images, realms, hardware profiles, and so on. In addition to the URLs for each of these resource collections, the XML document includes information about driver-specific extensions. For example, clients are provided with information about whether the cloud permits user-defined data to be injected at startup into new VM instances, and if so, what mechanisms are used for this. This operation is crucial for personalizing new VM instances. Unfortunately, it is not supported by all cloud providers, and those that do support it usually have their own unique methods.
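
To make the idea of feature detection concrete, the following sketch parses a sample entry-point document with REXML. The sample XML is an assumption modeled on the description above (an api root element with link children that carry nested feature elements); the element names, attribute names, and the user_data feature name should be checked against the output of a real server.

```ruby
require 'rexml/document'

# Sample entry-point document. Its layout is an assumption made for
# this sketch, not a copy of real server output.
ENTRY_POINT_XML = <<~XML
  <api driver="mock">
    <link rel="images" href="http://localhost:3001/api/images"/>
    <link rel="realms" href="http://localhost:3001/api/realms"/>
    <link rel="instances" href="http://localhost:3001/api/instances">
      <feature name="user_data"/>
    </link>
  </api>
XML

# Map each collection name to its URL and the features it advertises.
def collections(xml)
  doc = REXML::Document.new(xml)
  result = {}
  doc.root.each_element('link') do |link|
    features = link.get_elements('feature').map { |f| f.attributes['name'] }
    result[link.attributes['rel']] = { href: link.attributes['href'],
                                       features: features }
  end
  result
end

# True if the named collection advertises the given feature.
def supports?(colls, collection, feature)
  colls.fetch(collection, { features: [] })[:features].include?(feature)
end

colls = collections(ENTRY_POINT_XML)
puts colls['instances'][:href]
puts supports?(colls, 'instances', 'user_data')
```

A client written this way never needs to ask which cloud it is talking to; it only checks whether the feature it wants is advertised.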

The different states through which a virtual machine runs over its lifecycle vary significantly from cloud to cloud. Differences exist not only in the naming of logically equivalent states and operations, but also in the quantity and sequence of the states (Figure 3). Because of this, Deltacloud provides clients with a standardized model of the cloud-specific lifecycle, formatted as a finite state machine. In addition to smoothing out naming differences, this model lets a client detect which operations are necessary to, say, pause or remove a currently active VM.
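
As a rough sketch of how a client might use such a state machine, the code below models a lifecycle as a hash and searches it for the shortest sequence of operations between two states. The states, the action names, and the helper actions_to are hypothetical and chosen only for illustration; a real client would build the table from the data the server returns.

```ruby
# Hypothetical lifecycle of one cloud: state => { action => next state }.
LIFECYCLE = {
  'START'   => { 'create' => 'RUNNING' },
  'RUNNING' => { 'stop' => 'STOPPED', 'reboot' => 'RUNNING' },
  'STOPPED' => { 'start' => 'RUNNING', 'destroy' => 'FINISH' },
  'FINISH'  => {}
}.freeze

# Breadth-first search for the shortest list of actions that leads
# from one state to another. Returns nil if the target is unreachable.
def actions_to(machine, from, to)
  queue = [[from, []]]
  seen = { from => true }
  until queue.empty?
    state, path = queue.shift
    return path if state == to

    machine.fetch(state, {}).each do |action, next_state|
      next if seen[next_state]

      seen[next_state] = true
      queue << [next_state, path + [action]]
    end
  end
  nil
end

p actions_to(LIFECYCLE, 'RUNNING', 'FINISH') # prints ["stop", "destroy"]
```

In this made-up cloud, a running VM cannot be removed directly; the search shows that it must be stopped first.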

Figure 3. Functions for monitoring the specific status details of a VM instance are particularly useful. (Source: Red Hat)

Even when the general lifecycle of a cloud is known, it can be difficult to identify exactly which operations (e.g., Pause, Stop) are permitted on a virtual machine. A large number of attributes need to be taken into consideration, many of which exist only inside the cloud, such as permissions. Deltacloud simplifies this situation for clients by indicating for each VM exactly which operations it permits – for example, whether the current user is permitted to pause the VM.

Within a cloud, a “realm” describes a specified area that can access selected resources. Every cloud provider defines such areas individually. A realm can represent different data centers, regions, or even just resource pools within a single data center. Cloud providers can also stipulate conditions for realms – for example, that all resources to be used together need to be located within the same realm. Among other things, this can apply to storage systems, which might only be linked to VM instances if they meet this condition.

Another area in which clouds tend to differ significantly is the size of the virtual machines made available. The options on offer vary not only in terms of number of virtual CPUs, memory size, and local storage, but also in terms of whether the user is restricted to fixed values and parameters when creating a virtual machine. For example, one provider might only offer virtual machines with a fixed 1GB of RAM, whereas another could allow the user to set up memory of between 1 and 8GB in increments of 512MB.

The Deltacloud API bundles all these different possibilities into hardware profiles, which means the clients are provided with a complete list of all possible VM values and user-definable parameters. In this way, clients do not need to know exactly how a specific cloud defines the size of a new VM; the only model they need to understand is the Deltacloud hardware profile.
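
The two provider styles mentioned above (a fixed 1GB versus a range of 1 to 8GB in 512MB steps) can be captured in a small data model. The sketch below is my own illustration of the idea, not the data structure Deltacloud uses; the names Fixed, Stepped, PROFILES, and valid_request? are hypothetical.

```ruby
# A property that permits exactly one value.
Fixed = Struct.new(:value) do
  def allows?(requested)
    requested == value
  end
end

# A property that permits values from low to high in increments of step.
Stepped = Struct.new(:low, :high, :step) do
  def allows?(requested)
    requested.between?(low, high) && ((requested - low) % step).zero?
  end
end

# Hypothetical hardware profiles; memory is given in megabytes.
PROFILES = {
  'small'  => { memory_mb: Fixed.new(1024) },
  'custom' => { memory_mb: Stepped.new(1024, 8192, 512) }
}.freeze

# A request is valid if every requested property is known to the
# profile and its value is permitted.
def valid_request?(profile, request)
  request.all? do |property, value|
    rule = profile[property]
    !rule.nil? && rule.allows?(value)
  end
end

puts valid_request?(PROFILES['small'], { memory_mb: 2048 })  # false
puts valid_request?(PROFILES['custom'], { memory_mb: 2560 }) # true
```

The client code that validates a request is the same for both providers; only the profile data differs.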

Outlook

Apache Deltacloud is continually being developed by an active community of programmers. A major aspect of development is compatibility with cloud-oriented storage systems such as S3 and Cloudfiles. A further priority is the creation of new drivers, particularly for vCloud, Red Hat Enterprise Virtualization Manager (RHEV-M), and Google Storage. These extensions are designed to be fully backward compatible, allowing older clients to communicate flawlessly with newer servers. Such API stability comprises not only the parameters of the HTTP requests and the XML format of the responses, but also data, such as error codes and messages reported by the API.

Figure 4. Cloud Engine and the Deltacloud API: Red Hat’s Cloud Engine is used for implementing and operating private clouds that communicate with public clouds. All incoming and outgoing communications are handled via the Deltacloud API. (Source: Red Hat)

Deltacloud currently is being used in several major projects, including Red Hat’s Cloud Engine, SteamCannon, and Eclipse. The goal of Cloud Engine is to implement an open source cloud broker (Figures 4 and 5). The Deltacloud API is used here to facilitate views on the virtual clouds of each user, as well as for communication with the clouds themselves. SteamCannon creates tools for operating user-specific Platform-as-a-Service (PaaS) clouds for Java and Ruby applications. The Eclipse project is used to develop plugins that aid the management of virtual machines in clouds from within Eclipse, enabling developers to implement even complex application scenarios and architectures with a few simple mouse clicks.

Figure 5. Red Hat’s cloud architecture makes it possible to integrate extremely different virtual systems and public clouds and to administer them together. (Source: Red Hat)

Installing Deltacloud

To install the Deltacloud server, use one of two methods. On Fedora and related Linux distributions, Deltacloud is available as RPMs and can be installed with a simple

yum install deltacloud-core-all

On other operating systems, you need to install Deltacloud as a Ruby gem with the command:

gem install deltacloud-core

The drawback of gem install is that a C compiler must be present because some of the gems Deltacloud depends on need to be compiled.

After installation, the Deltacloud server can be started with

deltacloudd -i mock -r HOSTNAME -p PORT

where HOSTNAME is the hostname or IP address on which the server should listen and PORT is the port. The Deltacloud RPM contains an init script to do this, which is configured by editing /etc/sysconfig/deltacloud-core, but for the sake of explanation, I’ll start the server directly. Under the covers, deltacloudd uses thin, which only supports HTTP, not HTTPS; to get secure connections to a Deltacloud server, those connections must be proxied through another web server like Apache or nginx.

When starting the server, deltacloudd expects a default driver. In this example, I use the mock driver, which just pretends to be a cloud and is useful for testing. However, any supported driver, like ec2 or rhevm, can be specified here.

Exploring the API with Your Browser

Once the server is running, point your browser at http://HOSTNAME:PORT/api. This returns an HTML version of the top-level entry point, listing all the collections the driver knows about. From here, the rest of the API can be explored: You can click on Images to list all the images the current back end knows about. At this point, you will have to enter the credentials for the back-end cloud; in the case of the mock driver, these are mockuser and mockpassword. The Deltacloud documentation lists where to find the credentials for each driver.

The top-level entry point also links to documentation of the API, which is generated from the internal API specification of each server and driver. An important piece of information in the documentation is what parameters each operation requires from a resource. For example, http://HOSTNAME:PORT/api/docs/instances/create describes the instance creation operation, including which feature is responsible for adding which parameter.

From the list of images, you can navigate to a specific image – for example, img1 at http://HOSTNAME:PORT/api/images/img1 – and click on Launch. After filling the resulting form and clicking Create, you are taken to the details of the newly created instance.

The HTML version of the API is really only meant for exploration and experimenting. More serious use will involve the XML or JSON versions of the API. To view the XML for a resource in your browser, simply append ?format=xml to any URL; for example, going to http://HOSTNAME:PORT/api/instances/inst3?format=xml will give you the XML version of the details about instance inst3.
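
In a script, the format parameter is best added with a URL library rather than by string concatenation, so that an existing query string is preserved. The helper below is a small sketch using Ruby's standard URI module; the name with_format and the port 3001 are my own choices, and the json value is an assumption based on the mention of a JSON version above.

```ruby
require 'uri'

# Return the URL with format=<format> in its query string, replacing
# an existing format parameter and keeping all other parameters.
def with_format(url, format = 'xml')
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query || '')
  params.reject! { |key, _| key == 'format' }
  params << ['format', format]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

puts with_format('http://localhost:3001/api/instances/inst3')
# prints http://localhost:3001/api/instances/inst3?format=xml
```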

Switching Drivers with Every Request

Because it is not really practical to run a different server for every cloud, including separate servers for different regions of the same cloud, the Deltacloud server makes it possible to switch drivers on the go, either by including the HTTP headers X-Deltacloud-Driver and X-Deltacloud-Provider in each request or by setting corresponding matrix parameters in the requested URL. The provider is used to talk to different endpoints of the same cloud (e.g., to different regions of Amazon’s EC2) or to specify the endpoint of a RHEV-M installation.

In this test setup, you can talk to Amazon’s EC2 us-west-1 region by going to the URL http://HOSTNAME:PORT/api;driver=ec2;provider=us-west-1. From the resulting page, you can now follow the exact same steps as above, but this time you will launch a real instance, costing real money, in EC2, rather than a fake one with the mock driver.
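
Both ways of selecting a driver can be expressed in a few lines of Ruby. The header names and the matrix parameter syntax are taken from the text above; the helper names request_with_headers and matrix_url, as well as the port 3001, are assumptions for this sketch. The code only builds the request and the URL and does not contact a server.

```ruby
require 'net/http'
require 'uri'

# Variant 1: select driver and provider through request headers.
def request_with_headers(url, driver, provider = nil)
  request = Net::HTTP::Get.new(URI.parse(url))
  request['X-Deltacloud-Driver'] = driver
  request['X-Deltacloud-Provider'] = provider if provider
  request
end

# Variant 2: encode the same selection as matrix parameters that are
# appended to the entry-point URL.
def matrix_url(api_url, driver, provider = nil)
  url = "#{api_url};driver=#{driver}"
  url += ";provider=#{provider}" if provider
  url
end

puts matrix_url('http://localhost:3001/api', 'ec2', 'us-west-1')
# prints http://localhost:3001/api;driver=ec2;provider=us-west-1
```

The matrix form is convenient in a browser, whereas headers keep the URLs identical across clouds, which can simplify client code.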

In moving from the mock driver to the EC2 driver, note that the list of available collections changes: The EC2 driver offers additional collections, like Addresses for public IP addresses, Firewalls to manage security groups, and Keys to handle SSH keys. Also, the instance creation form has additional fields for EC2. These are included because the EC2 driver supports certain features that the mock driver does not, like launching multiple instances or applying security groups at instance creation. It is worth studying how the XML for the top-level entry point advertises different features for the two drivers and how that information is used to provide additional fields in the instance creation form.

Using the API in Code

Deltacloud also provides a variety of clients. They all encapsulate the HTTP conversation with a Deltacloud server and make it easier to consume the REST-based API in your programs. As an example, I will show a small script that gets the list of all instances from a Deltacloud server and then polls the server for changes in the instances’ states.

Before I can do that, I need to install the Ruby Deltacloud client; with yum, run:

yum install rubygem-deltacloud-client

With gem, use

gem install deltacloud-client

The example script can be seen in Listing 1. After saving the script in example.rb, you can run it with

ruby example.rb URL USER PASSWORD

against your Deltacloud server. The output it produces will be something like this:

Found 4 instances in the following states:
        RUNNING 4
Polling for changes (Ctrl-C to end)
inst1 changed from RUNNING to STOPPED
inst1 changed from STOPPED to RUNNING
^C

The important parts are in line 4 of the listing, which initializes the Deltacloud client with the URL and credentials provided on the command line, and lines 6 and 19, which retrieve a list of all instances and then check each instance for changes. All the minutiae of making HTTP requests and deserializing the responses into objects are handled by the Deltacloud client library.

You can run this script against any cloud that Deltacloud supports simply by changing the URL you pass in (and the corresponding username and password) as described in the previous section. Running it against the URL http://HOSTNAME:PORT/api;driver=ec2;provider=us-west-1 will watch for state changes in EC2’s us-west-1 region, and running it against http://HOSTNAME:PORT/api;driver=gogrid will do the same for GoGrid. The script could be enhanced easily to watch state changes in all the clouds one has access to.

Listing 1. Get a List of All Instances from a Deltacloud Server and Poll the Server for Changes in State
01 require 'rubygems'
02 require 'deltacloud'

03 URL, USER, PASSWORD = ARGV

04 client = DeltaCloud.new(USER, PASSWORD, URL)

05 summary = Hash.new(0)
06 state = client.instances.inject({}) do |state, inst|
07   state[inst.id] = inst.state
08   summary[inst.state] += 1
09   state
10 end

11 total = summary.values.inject(0) { |sum, i| sum + i }
12 puts "Found #{total} instances in the following states:"
13 summary.keys.sort.each do |s|
14   printf "%20s %d\n", s, summary[s]
15 end

16 puts "Polling for changes (Ctrl-C to end)"
17 loop do
18   sleep 2
19   client.instances.each do |inst|
20     if state[inst.id] != inst.state
21       if state[inst.id]
22         puts "#{inst.id} changed from #{state[inst.id]} to #{inst.state}"
23       else
24         puts "#{inst.id} was created and is now #{inst.state}"
25       end
26       state[inst.id] = inst.state
27     end
28   end
29 end
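
One way to extend the script to several clouds is to separate the change detection from the polling loop. The sketch below is an illustration under stated assumptions: Instance and FakeClient are stand-ins so that the code runs without a server, and detect_changes and poll_once are hypothetical helper names. In real use, the clients hash would hold objects created with DeltaCloud.new as in Listing 1, one per URL.

```ruby
# Stand-ins for the objects a Deltacloud client would return.
Instance = Struct.new(:id, :state)
FakeClient = Struct.new(:instances)

# Compare the known states of one cloud with its current instances.
# Returns a list of messages and updates `known` in place.
def detect_changes(cloud, known, instances)
  instances.each_with_object([]) do |inst, messages|
    old = known[inst.id]
    next if old == inst.state

    message = if old
                "[#{cloud}] #{inst.id} changed from #{old} to #{inst.state}"
              else
                "[#{cloud}] #{inst.id} was created and is now #{inst.state}"
              end
    messages << message
    known[inst.id] = inst.state
  end
end

# One polling pass over all clouds. `clients` maps a label to any
# object that responds to #instances.
def poll_once(clients, known)
  clients.flat_map do |cloud, client|
    known[cloud] ||= {}
    detect_changes(cloud, known[cloud], client.instances)
  end
end

clients = { 'mock' => FakeClient.new([Instance.new('inst1', 'RUNNING')]) }
known = {}
poll_once(clients, known) # first pass records the initial states
clients['mock'].instances[0].state = 'STOPPED'
puts poll_once(clients, known)
# prints [mock] inst1 changed from RUNNING to STOPPED
```

Calling poll_once inside a loop with a sleep, as in lines 17 and 18 of Listing 1, would then report changes from every configured cloud.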