Many Clouds, One API
With the recent rise in cloud computing, most cloud providers have offered their own APIs, which means cloud users sign up for the services of individual providers at the expense of being able to migrate easily to other providers at a later stage. Apache Deltacloud addresses this issue by offering a standardized API definition for infrastructure as a service (IaaS) clouds with drivers for a range of different clouds. The Deltacloud API is designed as a RESTful web service and comes with client libraries for all major programming languages. Additional drivers for accessing further public or private clouds can be created with minimal effort.
Despite plenty of discussion to the contrary, cloud computing still has very few standards. Instead, the emphasis is on development. Many different public clouds have been created over the last few years, including Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and JiffyBox. In the US, they are joined by providers such as GoGrid, Rackspace, and Terremark (Figure 1). Clouds are still a relatively new method for offering applications and services.
Providers, as well as suppliers of proprietary technologies, like to follow their own ideas on how a cloud and the applications within it should be accessed, operated, and managed. Business customers, on the other hand, are often reluctant to be tied to a single cloud provider. Interoperability, or the ability to migrate smoothly from one provider to another, is therefore becoming more and more important.
Although an impressive number of public clouds are on offer, many businesses choose to provide applications and computing services in their own private clouds using their internal LAN/WANs. Once the internal clouds reach their limits, it would be highly advantageous if public clouds could then be accessed for additional resources. Linking these two spheres, however, has long been fraught with difficulties.
A consistent, standardized cloud computing API model would solve the problem of software developers having to do further programming whenever a new cloud with a proprietary API is introduced. Although Amazon’s cloud services such as EC2 and S3 provide fully functional APIs for their own purposes, these APIs are in no way a workable solution for other public clouds. In fall 2009, Red Hat decided to address this situation by creating Deltacloud, an open source project that defines a standardized interface and includes adapters for all the major public and private clouds. Applications do not need to handle support for new clouds themselves; Deltacloud takes care of it directly.
Although Deltacloud was originally a Red Hat project, in spring 2010 the interface and all related code were transferred to the Apache Software Foundation’s Incubator (http://incubator.apache.org/deltacloud/) for further development. This new basis ensures ongoing vendor independence. Like other Apache projects, Deltacloud is being developed jointly by many participants from different businesses and organizations that are committed to the principles of open software licensing and user-driven innovation.
REST-Based Interface
The Deltacloud API is implemented over HTTP as a REST (representational state transfer) web service (Figure 2). Clients send all requests through this REST interface to a Deltacloud server, which in turn talks to the cloud provider’s native API. To simplify working with the REST interface, the Deltacloud project provides a CLI (command-line interface) tool, as well as client libraries in Ruby, Java, C, and Python.
Deltacloud is not the only open source project to develop cloud abstraction APIs; other solutions include jclouds, libcloud, boto, and fog. However, all of these are libraries tied to specific programming languages – jclouds to Java, libcloud and boto to Python, and fog to Ruby. Deltacloud, on the other hand, is entirely independent of languages and is the only cloud abstraction API that can also be used as a web service. The advantage of this approach is that widely accepted, existing standards such as HTTP and XML yield an open architecture that is independent of platforms and programming languages.
The conception of the Deltacloud API as a web service instead of a library makes it possible to operate the Deltacloud server in one of two base configurations – close to the user, such as on a local computer/LAN, or close to the cloud provider’s native API, which is particularly interesting for private clouds. Of course, providers can also use Deltacloud directly as their sole interface.
The Deltacloud server was written in Ruby or, more specifically, using the Ruby framework Sinatra. Internally, the server code comprises two main sections. The first, generic section is dedicated to typical web service tasks such as receiving and deserializing HTTP requests and formatting responses. The second section provides drivers for individual clouds such as Amazon EC2, vCloud, Azure, and so on. The two sections are linked through a simple internal interface. This makes it possible to create drivers for new cloud APIs without having to engage much with the server code. Experience has shown that a new driver can usually be implemented within a few days.
In simplified terms, a process flow typically begins with a client sending a request to the server via the REST interface. The Deltacloud server’s driver then relays the request to the target cloud. The server itself remains stateless; it does not store any state or session details. Instead, the client sends the access data required for the selected cloud in the HTTP Basic authentication header as part of every request. Whereas the cloud to be accessed and its API URL previously had to be specified during the Deltacloud server’s startup, the next version of Deltacloud will make it possible to select both via additional HTTP request headers. In this way, a single Deltacloud server can be used to address any number of clouds.
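The stateless request model described above can be sketched in a few lines of Python. The sketch only assembles the URL and headers for a single request and does not contact a server. The base URL, the credentials, and the function name `build_request` are placeholders chosen for illustration. The header names `X-Deltacloud-Driver` and `X-Deltacloud-Provider` are an assumption about how the per-request cloud selection might be exposed; the text above does not name them.

```python
import base64


def build_request(base_url, collection, username, password,
                  driver=None, provider=None):
    """Return (url, headers) for one self-contained request.

    Nothing is remembered between calls: the credentials for the
    back-end cloud travel with every request via HTTP Basic auth.
    """
    url = f"{base_url.rstrip('/')}/{collection}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    headers = {
        "Authorization": "Basic " + token.decode("ascii"),
        "Accept": "application/xml",
    }
    # Assumed header names for choosing the cloud and its API URL per request.
    if driver is not None:
        headers["X-Deltacloud-Driver"] = driver
    if provider is not None:
        headers["X-Deltacloud-Provider"] = provider
    return url, headers


# Placeholder values, for illustration only.
url, headers = build_request("http://localhost:3001/api", "instances",
                             "example-user", "example-secret", driver="ec2")
print(url)
print(sorted(headers))
```

The resulting pair can be handed to any HTTP client, for example `urllib.request.Request(url, headers=headers)` from the Python standard library.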
API Principles
The danger of defining abstraction APIs is that the abstract API only provides the relatively small number of functionalities shared by all the APIs it is based on. Deltacloud circumvents this issue by defining a basic interface that is supported by all the drivers and that also permits driver-specific extensions. The compatibility of these extensions with a specific driver can be detected via a range of simple mechanisms. These detection mechanisms are designed so that the client does not need to be aware of the cloud that it is connected to via Deltacloud; all questions pertaining to cloud-specific differences are resolved via detailed information in the responses from the Deltacloud server.
The API uses exactly one entry point, which is considered best practice in REST environments. The XML document accessed via the entry point’s URL contains detailed information about all the resources available through the Deltacloud server, including images, realms, hardware profiles, and so on. In addition to the URLs for each of these resource collections, the XML document includes information about driver-specific extensions. For example, clients are provided with information about whether the cloud permits user-defined data to be injected at startup into new VM instances, and if so, what mechanisms are used for this. This operation is crucial for personalizing new VM instances. Unfortunately, it is not supported by all cloud providers, and those that do support it usually have their own methods.
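A client might process the entry point document along the following lines. The XML below is an illustrative stand-in, not output captured from a real server: the element names (`api`, `link`, `feature`), the attribute names, the feature name `user_data`, and the URLs are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Illustrative document only; element and attribute names are assumed.
SAMPLE_ENTRY_POINT = """
<api driver="example">
  <link rel="instances" href="http://localhost:3001/api/instances">
    <feature name="user_data"/>
  </link>
  <link rel="images" href="http://localhost:3001/api/images"/>
  <link rel="realms" href="http://localhost:3001/api/realms"/>
  <link rel="hardware_profiles"
        href="http://localhost:3001/api/hardware_profiles"/>
</api>
"""


def discover_collections(xml_text):
    """Map each collection name to its URL and advertised extensions."""
    root = ET.fromstring(xml_text.strip())
    collections = {}
    for link in root.findall("link"):
        collections[link.get("rel")] = {
            "href": link.get("href"),
            "features": [f.get("name") for f in link.findall("feature")],
        }
    return collections


def supports(collections, collection, feature):
    """True if the given collection advertises the given extension."""
    entry = collections.get(collection)
    return entry is not None and feature in entry["features"]


collections = discover_collections(SAMPLE_ENTRY_POINT)
if supports(collections, "instances", "user_data"):
    print("user data can be injected into new instances")
```

Because the client asks the entry point what is available, it never has to hard-code which back-end cloud it is talking to.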
The different states through which a virtual machine runs over its lifecycle vary significantly from cloud to cloud. Differences exist not only in the naming of logically equivalent states and operations, but also in the quantity and sequence of the states (Figure 3). Because of this, Deltacloud provides clients with a standardized model of the cloud-specific lifecycle, formatted as a finite state machine. In addition to smoothing out naming differences, this model lets a client detect which operations are necessary to, for example, pause or remove a currently active VM.
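To illustrate how a client could use such a state machine, the following sketch models two hypothetical lifecycles as transition tables and searches for the shortest sequence of operations between two states. The state names, operation names, and both tables are invented for this example and do not describe any particular provider.

```python
from collections import deque

# Hypothetical lifecycles: state -> {operation: resulting state}.
CLOUD_A = {
    "RUNNING": {"reboot": "RUNNING", "stop": "STOPPED"},
    "STOPPED": {"start": "RUNNING", "destroy": "FINISHED"},
    "FINISHED": {},
}
CLOUD_B = {
    "RUNNING": {"reboot": "RUNNING", "destroy": "FINISHED"},
    "FINISHED": {},
}


def operations_to_reach(lifecycle, current, target):
    """Return the shortest list of operations from current to target.

    Uses a breadth-first search over the transition table and returns
    None if the target state cannot be reached.
    """
    queue = deque([(current, [])])
    seen = {current}
    while queue:
        state, path = queue.popleft()
        if state == target:
            return path
        for operation, next_state in lifecycle.get(state, {}).items():
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, path + [operation]))
    return None


for name, lifecycle in (("A", CLOUD_A), ("B", CLOUD_B)):
    print(name, operations_to_reach(lifecycle, "RUNNING", "FINISHED"))
```

In this example, removing a running VM takes a stop followed by a destroy on the first cloud, whereas the second cloud allows a running VM to be destroyed directly. The client code is the same in both cases.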
Even when the general lifecycle of a cloud is known, it can be difficult to identify exactly which operations (e.g., Pause, Stop) are permitted on a virtual machine. A large number of attributes need to be taken into consideration, many of which exist only inside the cloud, such as permissions. Deltacloud simplifies this situation for clients by indicating for each VM exactly which operations it permits – for example, whether the current user is permitted to pause the VM.
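One way a server could convey this per-VM information is to embed one link for each currently permitted operation in the VM’s description. The sketch below assumes such a layout; the XML, including the `actions` and `link` element names, the instance ID, and the URLs, is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative instance description; the structure is assumed.
SAMPLE_INSTANCE = """
<instance id="inst1">
  <state>RUNNING</state>
  <actions>
    <link rel="reboot"
          href="http://localhost:3001/api/instances/inst1/reboot"/>
    <link rel="stop"
          href="http://localhost:3001/api/instances/inst1/stop"/>
  </actions>
</instance>
"""


def permitted_actions(instance_xml):
    """Map each operation currently allowed on the VM to its URL."""
    root = ET.fromstring(instance_xml.strip())
    return {link.get("rel"): link.get("href")
            for link in root.findall("actions/link")}


actions = permitted_actions(SAMPLE_INSTANCE)
for wanted in ("stop", "destroy"):
    status = "permitted" if wanted in actions else "not permitted"
    print(f"{wanted}: {status}")
```

The client simply checks whether a link for the desired operation is present; it does not need to evaluate permissions or other cloud-internal attributes itself.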
Within a cloud, a “realm” describes a specified area that can access selected resources. Every cloud provider defines such areas individually. A realm can represent different data centers, regions, or even just resource pools within a single data center. Cloud providers can also stipulate conditions for realms – for example, that all resources to be used together need to be located within the same realm. Among other things, this can apply to storage systems, which might be attachable to VM instances only if they meet this condition.
Another area in which clouds tend to differ significantly is the size of the virtual machines made available. The options on offer vary not only in terms of number of virtual CPUs, memory size, and local storage, but also in terms of whether the user is restricted to fixed values and parameters when creating a virtual machine. For example, one provider might only offer virtual machines with a fixed 1GB of RAM, whereas another could allow the user to set up memory of between 1 and 8GB in increments of 512MB.
The Deltacloud API bundles all these different possibilities into hardware profiles, which means the clients are provided with a complete list of all possible VM values and user-definable parameters. In this way, clients do not need to know exactly how a specific cloud defines the size of a new VM; the only model they need to understand is the Deltacloud hardware profile.
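The two memory examples above (a fixed 1GB versus 1 to 8GB in 512MB increments) can be expressed in a single small model. This is a minimal sketch of the idea, not the Deltacloud data format: the class name `Property`, its fields, and the use of megabytes as the unit are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Property:
    """One sizing parameter of a hardware profile (hypothetical model)."""
    name: str
    unit: str
    default: int
    minimum: Optional[int] = None  # None means the value is fixed
    maximum: Optional[int] = None
    step: int = 1

    def accepts(self, value):
        """Check whether a requested value is valid for this property."""
        if self.minimum is None or self.maximum is None:
            return value == self.default
        in_range = self.minimum <= value <= self.maximum
        return in_range and (value - self.minimum) % self.step == 0


# Provider with a fixed 1GB of RAM.
fixed_memory = Property("memory", "MB", default=1024)
# Provider allowing 1 to 8GB in increments of 512MB.
flexible_memory = Property("memory", "MB", default=1024,
                           minimum=1024, maximum=8192, step=512)

for requested in (1024, 1536, 1700):
    print(requested,
          fixed_memory.accepts(requested),
          flexible_memory.accepts(requested))
```

A client written against such a model validates a request the same way for both providers; only the profile data differs.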