AppScale AWS clone for private clouds
Replica
In the early days, the public cloud, and AWS in particular, looked like a good thing. Companies were able to shift large parts of their own IT fleet to the cloud and seemingly save large amounts of cash because they no longer had to operate their own IT infrastructure. In the meantime, the tide has turned. As the workloads running in the cloud have grown, the monthly bills from AWS have become horrendous, and because the company's entire setup is now completely geared toward AWS and its various services, it is no longer possible to simply roll back.
At the same time, many of the hyperscalers' promises turned out to be more or less irrelevant in day-to-day operations. It's a cool fact that you can scale a web server setup from 10 to 1,000 nodes in next to no time, but this rarely happens in practice, especially because many companies have not rebuilt their applications to take advantage of this level of scalability. In other words, you are paying for a kind of flexibility that you can hardly use in a meaningful way.
However, maybe there is something you can do: The AppScale suite lets you build a private cloud with the promise of being API-compatible with AWS, at least in terms of key features. At the same time, the setup is easy to build, which the vendor believes puts it ahead of other solutions like OpenStack. Under the hood, AppScale is genuinely open source software. For example, the Ceph object store is used as the storage stack, which is reason enough to take a closer look at the solution. What can AppScale do, how easy is it to set up, and what costs are you looking at?
A Question of Money
Before I get into the technical details, it's a good idea to look in more detail at the financial side of AWS and of a move back to your own physical hardware. Solutions like AppScale have one obvious weakness: They presuppose that you operate your own infrastructure, which is undoubtedly even more expensive today than it was in the first rush to migrate to the public cloud a few years ago. Some administrators might therefore be inclined to immediately dismiss the idea of their own on-premises cloud with AWS APIs. However, more accurate costing at this point might well be worth your while.
What are the price drivers for hyperscalers and AWS in particular? At its core, AWS charges for two types of services. Customers who use classic infrastructure as a service (IaaS) in AWS with Elastic Compute Cloud (EC2), and they still account for an important share of AWS users all told, are essentially paying to provision virtual instances, which are not as cheap as some administrators might think at first glance. If you need a fair amount of RAM and fast CPUs, you'll pay EUR700 ($700) or more per month for an instance.
This example already gives rise to doubts about purportedly favorable AWS prices. For just south of EUR20,000, which is roughly what a powerful virtual machine will cost over two years, you can pick up a very respectable physical machine from, for example, Supermicro. Moreover, bare metal usually amortizes in five to seven years; in other words, it's in use for far longer. A large proportion of the costs is generated by local operations: if not by the hardware itself, then by power, ventilation, and everything that goes with it. Even with AppScale, you can do little about this part of the bill.
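The arithmetic behind this comparison can be sketched in a few lines of Python. The EUR700 monthly instance price and the EUR20,000 hardware price come from the figures above; the monthly operating cost for power, cooling, and rack space is a purely hypothetical placeholder that you would need to replace with your own number.

```python
# Rough cost comparison: rented cloud instance vs. purchased server.
INSTANCE_EUR_PER_MONTH = 700   # from the article
SERVER_PRICE_EUR = 20_000      # from the article (upper bound)
OPS_EUR_PER_MONTH = 250        # ASSUMPTION: power, cooling, rack space


def instance_cost(months: int) -> int:
    """Cumulative cost of renting the instance for the given number of months."""
    return INSTANCE_EUR_PER_MONTH * months


def server_cost(months: int) -> int:
    """Purchase price plus cumulative operating cost of the physical server."""
    return SERVER_PRICE_EUR + OPS_EUR_PER_MONTH * months


def break_even_month() -> int:
    """First month in which the server is no more expensive than the instance."""
    saving = INSTANCE_EUR_PER_MONTH - OPS_EUR_PER_MONTH
    if saving <= 0:
        raise ValueError("operating the server costs as much as renting; no break-even")
    return -(-SERVER_PRICE_EUR // saving)  # integer ceiling division


if __name__ == "__main__":
    for months in (24, 72):
        print(months, instance_cost(months), server_cost(months))
    print("break-even month:", break_even_month())
```

With the placeholder operating cost, the purchased machine catches up with the rented instance in month 45, well inside the five-to-seven-year service life. A higher or lower operating cost shifts that point accordingly, which is exactly why the costing has to be done with real figures.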
The difference emerges when you look at the second financial factor of public cloud usage. AWS and its ilk like to lure customers with as-a-service offerings (i.e., managed databases, managed DNS services, etc.). Often enough, however, billing is based on traffic volumes or access counts. The problem then is that it is virtually impossible to make reliable forecasts.
The previously mentioned scaling factor plays a significant role, too. Of course, services like Amazon's relational database can be scaled almost arbitrarily at the push of a button. However, many companies do not have such large fluctuations in the use of their digital services for constant and dynamic scaling to make monetary sense. In other words, companies are paying for a technical option that they are not even using.
Either way, when you run infrastructure in your own data center, costs do not scale linearly with traffic or the number of requests. Consumption-based billing is therefore a highly relevant factor when you compare the public cloud with a local setup. In some circumstances, a direct comparison of private and public clouds reveals significant potential savings.
AppScale primarily supports you by improving your ability to plan the costs of operating an in-house infrastructure, that is, the entire expense of IT operations, like back in the day. This approach is not automatically cheaper, and you need to do some serious costing up front, but if the costs for service calls turn out to be the main cost driver, AppScale may well be a valid alternative and offer you cost savings in IT.
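To make the difference between the two billing models concrete, here is a minimal sketch that compares a bill metered per request with a flat in-house cost. All prices in the example are hypothetical and chosen only for illustration; they are not AWS list prices.

```python
def metered_bill(requests: int, eur_per_million: float) -> float:
    """Bill that grows linearly with the number of requests."""
    return requests / 1_000_000 * eur_per_million


def flat_bill(requests: int, fixed_eur: float) -> float:
    """In-house cost: the same amount however many requests arrive."""
    return fixed_eur


def crossover_requests(fixed_eur: float, eur_per_million: float) -> int:
    """Monthly request count above which the flat cost is the cheaper option."""
    return int(fixed_eur / eur_per_million * 1_000_000)


if __name__ == "__main__":
    FIXED_EUR = 3000.0        # ASSUMPTION: monthly in-house cost
    EUR_PER_MILLION = 4.0     # ASSUMPTION: metered price per million requests
    for requests in (100_000_000, 500_000_000, 1_000_000_000):
        print(requests,
              metered_bill(requests, EUR_PER_MILLION),
              flat_bill(requests, FIXED_EUR))
    print("crossover:", crossover_requests(FIXED_EUR, EUR_PER_MILLION))
```

Below the crossover, the metered service is cheaper; above it, every additional request widens the gap in favor of the flat cost. The sketch deliberately ignores step costs such as adding hardware when the in-house setup reaches its capacity.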
Turbulent History
AppScale hasn't been around too long in its current form. The project was originally planned as a replica of the Google Cloud Platform (GCP). However, this was at a time when the big pie of public cloud computing had not yet been distributed, and AppScale discovered that it was backing the wrong horse. It was not GCP that won the big contracts in the pioneering phase of cloud computing, but Amazon. The idea of producing software that was API-compatible with GCP turned out to be a flop in this respect, but that didn't stop the team of AppScale developers over 15 years ago; they soon started working on an EC2 clone, which essentially means the IaaS function of AWS.
Some readers might even recall AppScale in its former state, because the software was also covered by ADMIN [1] under its former name, Eucalyptus. The solution disappeared from the scene for a while. At first, the developers sold their software to HP, probably in the hope that the money earned there would let them shake up the market, but then HP stumbled in its restructuring campaigns and soon lost interest in the private cloud altogether. The company not only unceremoniously scrapped its OpenStack commitment, but also let the Eucalyptus development become completely dormant. A small group centered around the original creators of the software finally started working on a fork of the final version of Eucalyptus released by HP under a free license. Since then, this group has been operating under the AppScale name.
The catch is that while Eucalyptus was languishing under HP's wing, the AppScale developers were working on a tool of the same name for deploying and checking the compliance of resources on Google Cloud. Because the Internet does not forget, the other AppScale still turns up in the search results today if you simply search for the term. If you want to find more information about the solution after reading this article, you need to include an additional term such as "AWS." The deployment solution for GCP has since disappeared into oblivion.
Well-Known Architecture
To avoid misunderstandings, the types of services that a private cloud needs to offer and how these will be implemented by components have long been demonstrated by projects such as OpenStack. AppScale's architectural underpinnings (Figure 1) are not much of a surprise; many of the software's services and principles will look familiar to admins of private clouds.
Basically, AppScale works like this: A cloud controller acts as the solution's central hub and is responsible for managing resources across all areas of the cloud. As a cluster, it comprises several services that look like a single unit to the outside world. The cluster controllers on a lower level are responsible for managing the AppScale services in a region, such as an availability zone.
The cluster controllers are each supported by a storage controller. Its task is to provision the local storage and connect it so that the virtual instances on the node controllers can use it. The number of node controllers per zone is practically unlimited. Accordingly, AppScale envisages the use of scalable storage: Ceph, as mentioned earlier. The rule is that each availability zone has its own Ceph cluster for block device storage. At the cloud control plane level, another Ceph cluster is added to provide object storage with a Simple Storage Service (S3)-compatible API.
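The hierarchy described so far can be written down as a small data model. The following Python sketch is only an illustration of the relationships named in the text (one cloud controller, one cluster controller and one storage controller per availability zone, any number of node controllers, one Ceph cluster per zone plus one for object storage); the class names, field names, and hostnames are hypothetical and do not correspond to AppScale's own configuration format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CephCluster:
    name: str
    purpose: str  # "block" for a zone, "object" for the S3-compatible store


@dataclass
class AvailabilityZone:
    name: str
    cluster_controller: str
    storage_controller: str
    block_storage: CephCluster
    node_controllers: list[str] = field(default_factory=list)


@dataclass
class Cloud:
    cloud_controller: str
    object_storage: CephCluster
    zones: list[AvailabilityZone] = field(default_factory=list)

    def ceph_clusters(self) -> list[CephCluster]:
        """One object-storage cluster plus one block-storage cluster per zone."""
        return [self.object_storage] + [z.block_storage for z in self.zones]


def example_cloud() -> Cloud:
    """Build a two-zone example; all hostnames are made up."""
    zones = [
        AvailabilityZone(
            name=f"zone-{i}",
            cluster_controller=f"cc-{i}.example.internal",
            storage_controller=f"sc-{i}.example.internal",
            block_storage=CephCluster(f"ceph-block-{i}", "block"),
            node_controllers=[f"nc-{i}-{n}.example.internal" for n in range(3)],
        )
        for i in (1, 2)
    ]
    return Cloud("clc.example.internal", CephCluster("ceph-object", "object"), zones)
```

In this model, a cloud with two availability zones ends up with three Ceph clusters in total: two for block devices and one for object storage.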
The control plane also includes the core of the software: the services that emulate various AWS APIs. Besides the AWS APIs for S3 and EC2, these services currently include Identity and Access Management (IAM), CloudFormation, CloudWatch, Elastic Load Balancing (ELB), and the Security Token Service (STS). Moreover, AppScale includes a web-based console as a graphical user interface (GUI). By nature, however, the focus when using cloud platforms tends to be on the APIs.
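In practice, API compatibility means that existing AWS tooling only needs to be pointed at a different endpoint. The following sketch assumes the boto3 library, whose `endpoint_url` parameter overrides the default AWS endpoint. The URL layout (service name as hostname prefix, port 8773), the domain, and the region name are assumptions made for illustration; check your own installation for the actual values.

```python
from __future__ import annotations

# boto3 service names for the APIs listed in the text.
SERVICES = ["ec2", "s3", "iam", "cloudformation", "cloudwatch", "elb", "sts"]


def service_endpoints(domain: str, port: int = 8773,
                      scheme: str = "https") -> dict[str, str]:
    """Map each service to an endpoint URL (HYPOTHETICAL URL layout)."""
    return {svc: f"{scheme}://{svc}.{domain}:{port}/" for svc in SERVICES}


def make_client(service: str, domain: str, access_key: str, secret_key: str):
    """Create a boto3 client that talks to the private cloud instead of AWS."""
    import boto3  # third-party; imported lazily so the helper above works without it

    return boto3.client(
        service,
        endpoint_url=service_endpoints(domain)[service],
        region_name="private-cloud",  # ASSUMPTION: placeholder region name
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


if __name__ == "__main__":
    for svc, url in service_endpoints("cloud.example.internal").items():
        print(f"{svc:16} {url}")
```

A call such as `make_client("ec2", "cloud.example.internal", key, secret).describe_instances()` would then go to the private cloud rather than to AWS, provided the endpoint implements that API call.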