Hardware suitable for cloud environments

Key to Success

DRBD with Higher Performance Requirements

If you consider DRBD to be an alternative to Ceph, you need fundamentally different hardware. DRBD is far less complex at its core than Ceph; the focus is clearly on copying data back and forth between two block storage devices at the block level of the Linux kernel.

Both in terms of CPU and RAM, DRBD systems are therefore far more frugal than the corresponding servers for Ceph. In terms of storage performance, however, they have far higher requirements for the individual devices than Ceph, because DRBD, unlike Ceph, does not distribute its writes across many spindles throughout the system.

You can improve performance by putting the DRBD activity log on fast storage. The activity log is part of DRBD's metadata and records which areas of the device are currently being written to; because it must be updated before writes to a new area reach the storage backend, its latency directly affects write performance. Using NVMe for it makes a noticeable difference. DRBD also copes well with battery-backed write caches.

Compute: The Art of Costing

Once you have made it this far, you are facing the last piece of the hardware puzzle: the hardware for the compute nodes. Compute is one of the two services that cloud environments need to provide, the other being storage. However, an Amazon Simple Storage Service (S3)-compatible interface is now almost always part of standard object storage solutions, so that Ceph, for example, simply handles the storage role itself. In any case, compute is the most frequently requested service.

The question as to which hardware is suitable for compute is not easily answered. Several factors play an important role. At least in the public cloud context, the motto is: Bigger is always better. Platforms of this kind can only be operated profitably if they grow quickly. Contrary to what many admins suspect, however, the profit does not lie in the combination of storage and compute.

In large Ceph clusters, a single gigabyte costs well under a cent to provide, and even with terabytes of data and a high margin, the customer pays very little for storage. The decisive factor is instead the selling price of a single virtual CPU (vCPU) less the production costs incurred for it. The company uses this figure to scale its turnover with the platform, and thus ultimately its own profit.

So the question as to which hardware is suitable for compute is directly linked to the price of a single vCPU, which in turn is determined by the number of available CPUs per comparison unit. Although this sounds complicated in theory, it is actually quite simple to calculate. To work things out, you need to fire up your favorite spreadsheet.

Assume a complete rack with 42 rack units (42U), of which at least two units are reserved for the necessary network hardware. Most data centers also cap the amount of power available per rack. As a result, 16 to 18 servers of two rack units (2U) each can usually be accommodated in such a rack.
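The rack sizing above boils down to taking the smaller of two limits: space and power. The following Python sketch illustrates this. The power figures (7,000 W per rack, 400 W per server) are purely hypothetical placeholders and need to be replaced with the values from your data center contract and your own measurements.

```python
def servers_per_rack(rack_units=42, network_units=2, server_units=2,
                     rack_power_watts=7000.0, server_power_watts=400.0):
    """Return how many servers fit into a rack, limited by space and power.

    The power values are hypothetical assumptions, not figures from the article.
    """
    # Space limit: units left after the network hardware, divided by server height
    by_space = (rack_units - network_units) // server_units
    # Power limit: how many servers the rack's power budget can feed
    by_power = int(rack_power_watts // server_power_watts)
    return min(by_space, by_power)


if __name__ == "__main__":
    print("Servers per rack:", servers_per_rack())
```

With these assumed values, power rather than space is the limiting factor, which matches the typical situation described above.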

First, you have the task of recording all the costs per month, ideally for a period of five years. These costs include power, air conditioning, and normal maintenance, as well as the cost of rack installation, estimated hardware repair costs, and any support costs. In the end you have a list of all costs, broken down by month. From this, in turn, you can derive the total monthly cost of two rack units (2U), that is, of one server slot in the rack (Figure 4).

Figure 4: Servers such as Dell's R740 are suitable as 2U devices for both storage and compute servers. © Dell
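Instead of a spreadsheet, the monthly cost per 2U slot can also be sketched in a few lines of Python. The cost categories follow the list above; all amounts, the split into recurring and one-time costs, and the figure of 16 servers per rack are hypothetical example values.

```python
def monthly_cost_per_slot(recurring_per_month, one_time_costs,
                          servers_per_rack, months=60):
    """Monthly cost of one 2U slot, with one-time costs spread over the period.

    recurring_per_month: dict of monthly rack-level costs
    one_time_costs:      dict of costs incurred once over the whole period
    months:              costing period (five years = 60 months)
    """
    recurring = sum(recurring_per_month.values())
    amortized = sum(one_time_costs.values()) / months
    return (recurring + amortized) / servers_per_rack


if __name__ == "__main__":
    # Hypothetical example figures per rack
    recurring = {"power": 900.0, "air_conditioning": 300.0,
                 "maintenance": 200.0, "support": 300.0}
    one_time = {"rack_installation": 3000.0, "estimated_repairs": 9000.0}
    cost = monthly_cost_per_slot(recurring, one_time, servers_per_rack=16)
    print(f"Monthly cost per 2U slot: {cost:.2f}")
```

Spreading one-time costs evenly over the period is a simplification; a real calculation might also account for interest or depreciation rules.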

Overcommit is Standard

The next step is to find out how much cash a single vCPU has to generate to cover expenses and make a profit. To do this, simply multiply the actual number of cores available on the server by an overcommit factor (e.g., 4) to obtain the number of vCPUs that can be sold for every two rack units. In the final step, you need to determine an adequate margin per vCPU, which finally gives you the selling price.
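As a rough sketch of this calculation in Python: the core count, the monthly cost per server, and the margin below are hypothetical example values, and treating the margin as a percentage markup on cost is an assumption made here for illustration.

```python
def vcpus_per_server(physical_cores, overcommit=4):
    """Number of sellable vCPUs: physical cores times the overcommit factor."""
    return physical_cores * overcommit


def price_per_vcpu(monthly_cost_per_server, vcpus, margin):
    """Monthly selling price per vCPU: cost per vCPU plus a markup.

    margin is a fraction of cost (0.5 means a 50 percent markup).
    """
    cost_per_vcpu = monthly_cost_per_server / vcpus
    return cost_per_vcpu * (1 + margin)


if __name__ == "__main__":
    vcpus = vcpus_per_server(physical_cores=32, overcommit=4)
    price = price_per_vcpu(monthly_cost_per_server=128.0, vcpus=vcpus, margin=0.5)
    print(f"{vcpus} vCPUs per server, selling price {price:.2f} per vCPU per month")
```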

It is important always to plan a safety reserve of 20 percent per rack across the entire platform, which also includes typical waste. On top of that, a rack does not earn money from day 1, so the cost per vCPU has to be calculated so that the rack has generated its planned profit by the end of its service life.
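The effect of the safety reserve and of a slow start can be estimated with a small model. The sketch below assumes, purely for illustration, that sales ramp up linearly until the sellable capacity is reached after `ramp_months`; the function name and all example figures are hypothetical.

```python
def required_price_per_vcpu(total_vcpus, monthly_cost, planned_profit,
                            months=60, reserve=0.20, ramp_months=12):
    """Monthly vCPU price needed to reach the planned profit by end of life.

    reserve:     share of vCPUs held back as safety reserve (20 percent)
    ramp_months: months until all sellable vCPUs are sold (linear ramp,
                 an assumption of this sketch)
    """
    sellable = total_vcpus * (1 - reserve)
    # Sum of vCPUs actually sold in each month of the service life
    sold_vcpu_months = sum(
        sellable * min(1.0, (month + 1) / ramp_months)
        for month in range(months)
    )
    total_cost = monthly_cost * months
    return (total_cost + planned_profit) / sold_vcpu_months


if __name__ == "__main__":
    # Hypothetical rack: 16 servers with 128 vCPUs each
    price = required_price_per_vcpu(total_vcpus=16 * 128, monthly_cost=1900.0,
                                    planned_profit=60000.0)
    print(f"Required price per vCPU per month: {price:.2f}")
```

The longer the ramp-up phase, the fewer vCPU months are sold over the service life, and the higher the price each vCPU has to fetch.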

Speaking of overcommit: Many admins lose sleep over it, although it is basically the standard approach in large virtualization environments. It is virtually never the case that a specific workload actually uses 100 percent of its own vCPUs on a permanent basis, and if it does, it does not do so for all virtual machines (VMs) on a physical server simultaneously. Without an overcommit factor, a large amount of hardware per rack would therefore be underutilized.
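A back-of-the-envelope check shows why overcommit usually works. The sketch below multiplies the number of sold vCPUs by an average utilization to estimate how many physical cores are busy. The utilization values are hypothetical, and averages hide load peaks, so this is a plausibility check rather than a capacity guarantee.

```python
def expected_busy_cores(vcpus_sold, avg_utilization):
    """Estimate of physical cores kept busy at a given average vCPU utilization."""
    return vcpus_sold * avg_utilization


if __name__ == "__main__":
    physical_cores = 32
    overcommit = 4
    vcpus = physical_cores * overcommit
    for utilization in (0.10, 0.25, 0.50):  # hypothetical averages
        busy = expected_busy_cores(vcpus, utilization)
        status = "fits" if busy <= physical_cores else "oversubscribed"
        print(f"avg utilization {utilization:.0%}: {busy:.1f} busy cores ({status})")
```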

By the way, the costing described here makes one thing very clear: The more vCPUs available for every two rack units, the greater the profit you make with the rack over its service life. In other words, the more vCPUs that can be squeezed into the same space, the better it is from a commercial perspective.
