Hardware suitable for cloud environments
Key to Success
Storage Questions: NAS, SAN, SDS
Once you have planned the network and procured the appropriate hardware, you can move on to the next important topic: storage. Here, too, it is often necessary to say goodbye to cherished conventions. The idea of buying a classic network-attached storage (NAS) or storage area network (SAN) appliance from a manufacturer and connecting it to the cloud over Ethernet is undoubtedly obvious. Such devices are also considerably cheaper today than just a few years ago, because several manufacturers are trying to gain a foothold in this market with an aggressive pricing policy. Despite all the progress made, however, central network storage devices still have enormous design problems. However redundant they may be internally, they always form a single point of failure, which becomes a problem at the latest when the power in the data center is cut because of a fire or similar disaster.
Even worse, these devices do not allow horizontal scaling in any meaningful way. When a platform scales across the board, its central infrastructure must grow with it. For OpenStack or other compute management software, NAS and SAN systems are therefore hardly a sensible option. Moreover, by buying such a device, you expose yourself to vendor lock-in. Storage is much the same as SDN in this respect: Once a setup is in production, a component as fundamental as the underlying storage can hardly be removed during operation and replaced by another product. Admittedly, most clouds let you operate several storage solutions in parallel. In your own interest, however, you will want to save yourself the effort of copying data from one storage system to another during operation.
Solutions for software-defined storage (SDS) certainly exist, not least in the open source world. When it comes to object storage, Ceph is the industry leader by a country mile, although other approaches exist as well, such as DRBD by Linbit. Although DRBD began its career as a replication solution for two systems, it now scales across the board: It dynamically creates local volumes of the desired size on a network of servers with storage devices and then automatically configures replication between these resources.
Flash or Not Flash, That is the Question
No matter what kind of storage solution you decide on, one central question almost always comes up: Do you go for slow hard disks, fast SSDs, or a mixture of the two? This debate only makes sense in the first place because flash-based storage, essentially solid-state drives (SSDs) and NVMe devices, has become significantly cheaper in recent years and is now available in acceptable sizes.
An 8TB SSD, for example, costs around $1,200 (EUR2,000) on the open market. The price can undoubtedly be reduced considerably if you buy 30 or 40 units from your trusted hardware dealer. In any case, it is worth calculating the price difference between hard disks and SSDs.
As an example, assume a Ceph setup that initially comprises six servers with a gross capacity of 450TB. With the usual threefold replication, the effectively usable storage capacity is 150TB. Each individual node must therefore contribute 75TB to the total gross capacity. A good Western Digital server hard drive costs about $260 (EUR250), and 10 of them are required per server. If you add SSDs to the hard disks as a fast cache for the write-ahead log (WAL) and Ceph's metadata database, the storage devices cost around $3,800 (EUR3,500) per server. Additionally, you need a 2U server for about $6,500 (EUR6,000). Roughly calculated, a Ceph server like this costs about $13K (EUR10K), and an entire cluster around $100K (EUR80K).
The same cluster with SSDs costs significantly more on paper. Here, the storage devices cost about $20K (EUR18K) per system, so that, together with the server itself, the price comes to around $32K (EUR24K) per machine and $240K (EUR192K) for the whole cluster, bottom line. In return, however, the SSD-based Ceph cluster is seriously fast, gets by without caching hacks, and offers all the other advantages of SSDs (Figure 3). The additional expenditure of approximately $140K (EUR110K) quickly pays dividends if you consider the total proceeds of a rack over the duration of its lifetime. Compared with the several million dollars of turnover that a rack achieves given a good level of utilization, the extra $140K no longer looks like such a bad deal.
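To put that payback argument in perspective, the arithmetic can be laid out in a few lines of Python. The cluster totals are the rough figures quoted above; the rack revenue is a purely illustrative placeholder for "several million dollars":

# Rough payback estimate for the flash premium, using the totals quoted above.
hdd_cluster_cost = 100_000      # ~ $100K for the HDD-based Ceph cluster
ssd_cluster_cost = 240_000      # ~ $240K for the all-flash variant

flash_premium = ssd_cluster_cost - hdd_cluster_cost   # ~ $140K

# Illustrative assumption: a well-utilized rack turns over a few million
# dollars during its lifetime; $3M is just a placeholder for "several million."
rack_lifetime_revenue = 3_000_000

print(f"Flash premium: ${flash_premium:,}")
print(f"Premium as share of lifetime revenue: "
      f"{flash_premium / rack_lifetime_revenue:.1%}")

With these placeholder numbers, the flash premium works out to less than five percent of the rack's lifetime turnover, which is the point the comparison is meant to make.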
Storage Hardware Can Be Tricky
If you decide on a scalable DIY storage solution, you will want to purchase components that reflect your specific needs. The examples of Ceph and DRBD illustrate this well.
I already looked at what the hardware for a Ceph cluster can look like, in principle. The number of object storage daemons (OSDs) per host should not exceed 10; otherwise, if one Ceph node fails, the resulting resynchronization traffic either disrupts regular operations or, if throttled accordingly, drags on forever. The trick of giving the OSDs SSDs as a fast cache in Ceph is well known, and it still works.
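One common way to wire this up is to pair each spinning data disk with its own RocksDB and WAL partition on shared NVMe devices. The following Python sketch only prints the corresponding ceph-volume calls rather than running them, and the device names are hypothetical examples, not a recommendation for any particular layout:

# Sketch: pair each data HDD with a DB/WAL partition on shared NVMe devices
# and print the matching ceph-volume invocations. Device names are
# hypothetical; adjust them to the actual hardware before use.

hdds = [f"/dev/sd{c}" for c in "bcdefghijk"]              # 10 OSD data disks
db_parts = [f"/dev/nvme0n1p{i}" for i in range(1, 11)]    # RocksDB partitions
wal_parts = [f"/dev/nvme1n1p{i}" for i in range(1, 11)]   # WAL partitions

for data, db, wal in zip(hdds, db_parts, wal_parts):
    print(f"ceph-volume lvm create --data {data} "
          f"--block.db {db} --block.wal {wal}")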
Ceph and RAID controllers do not make a good team. If RAID controllers cannot be replaced by simple host bus adapters (HBAs), they should therefore at least be operated in HBA mode. Battery-backed caches are more likely to prove a hindrance than a help; in the worst case, they cause unforeseen problems in Ceph and slow it down.
With regard to CPU and RAM, the old rule of thumb is still valid today: Ceph requires one CPU core per OSD and 1GB of RAM per terabyte of storage offered. For a node with 10 OSDs, a medium-sized multicore CPU in combination with 128GB of RAM is therefore sufficient. Today, these values are at the lower end of what most hardware manufacturers offer anyway.
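Expressed as a quick calculation, with the node layout simply mirroring the earlier example (10 OSDs of 7.5TB each per server):

# Rule-of-thumb sizing for a single Ceph node: one CPU core per OSD,
# 1GB of RAM per terabyte of storage offered.

osds_per_node = 10
tb_per_osd = 7.5                  # 10 x 7.5TB = 75TB per node, as above

min_cores = osds_per_node         # 1 core per OSD
min_ram_gb = osds_per_node * tb_per_osd * 1   # 1GB RAM per TB of storage

print(f"Minimum CPU cores: {min_cores}")
print(f"Minimum RAM: {min_ram_gb:.0f} GB")    # 75GB, so 128GB leaves headroom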