Why databases are moving to the cloud
Flexibility
Opinions are divided into two major camps as to what constitutes a cloud database. The first camp claims they are databases provided by hyperscalers such as Amazon, Microsoft, and Google under the database as a service (DBaaS) model. The other, more common, view is that they can be any database that someone deploys within any cloud infrastructure, whether it be a private cloud, public cloud, hybrid cloud, multicloud, or hosted DBaaS. In this view, it is just a matter of where the data is stored and how it is accessed. On-premises databases are accessed over the corporate network (and VPN, if necessary) on internal database servers, whereas cloud databases rely on Internet connections to external (in some cases internal) cloud servers.
Deployment Models
Generally speaking, cloud databases have three popular deployment models. In the first case, the company operates the cloud database itself, either in its own private cloud or in a public cloud. In the second case, a hoster provides DBaaS. The third variant involves consuming the database as a managed service.
In the first scenario, the cloud database runs on an internal or external virtual machine (VM), but operations are handled by the company's own database administrators. In contrast, DBaaS involves the hoster or provider provisioning the database and hardware infrastructure as part of a subscription agreement, depending on the service level agreement (SLA). You are still responsible for ongoing operation of the database, though. If you want the provider to handle database operations, taking the load off the internal IT administrator's shoulders, the third option is a database as a managed service, or managed hosting.
The deployment model has no influence on the database type. Both relational SQL and document-oriented NoSQL databases can be operated as cloud databases. The different levels of flexibility make an important difference, though. SQL databases rely on fixed table structures. Changes to this row-column scheme cause major overhead, if they are possible at all. The younger, cloud-native NoSQL databases, on the other hand, have a far more flexible data model based on JSON documents instead of tables. This model can be changed faster and with less overhead.
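The difference between the two data models can be illustrated with a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for a relational system and plain JSON strings as a stand-in for a document store; the table, the field names, and the sample values are all hypothetical. Note that adding a column is cheap in this toy example, whereas schema changes on large production tables can involve locking and lengthy migrations.

```python
import json
import sqlite3

# Relational side: the table structure is fixed up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")

# A new attribute requires a schema change before any row can carry it.
conn.execute("ALTER TABLE customers ADD COLUMN loyalty_tier TEXT")
conn.execute("UPDATE customers SET loyalty_tier = 'gold' WHERE id = 1")
sql_row = conn.execute(
    "SELECT id, name, loyalty_tier FROM customers"
).fetchone()

# Document side: each JSON document carries its own structure, so a
# new field (or a nested list) can appear in one document without
# touching the others. The field names here are hypothetical.
documents = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace", "loyalty_tier": "gold",
     "devices": ["phone", "tablet"]},
]
serialized = [json.dumps(doc) for doc in documents]

# Readers have to tolerate missing fields instead of relying on a schema.
tiers = [json.loads(text).get("loyalty_tier") for text in serialized]
```

The flexibility comes at a price that the sketch also shows: Because no schema guarantees that a field exists, the application code has to handle documents in which it is missing.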
When people talk about cloud databases, they often overlook the fact that using them only really makes sense if both the applications and the data are migrated to the cloud. Put simply, a cloud database only pays off if the cloud applications also use the data stored there. NoSQL databases cope better with the greater volatility of cloud apps than traditional SQL databases, whose operating principle dates back to a time when clouds were still the exclusive domain of meteorologists.
Advantages
Three major features argue for the use of cloud databases: scalability (Figure 1), fast and secure availability, and (purportedly) lower costs.
Extending databases that you run in your own IT infrastructure requires investing a large amount of cash in additional hardware and having at least the same amount of patience until everything is installed and running. Cloud databases, on the other hand, scale more or less in real time and in both directions. They are highly flexible and can adapt quickly to changing requirements, which means a sales slump can be cushioned just as quickly as a sudden boom in demand, without expensive hardware lying idle or having to be ordered hastily. Moreover, you can do without tricks like thin provisioning, because database resources become available rapidly and in line with current demand. Long-term resource planning and preemptive, and costly, hardware (over)dimensioning are a thing of the past.
The demands on the IT infrastructure and administration drop at the same time. Depending on the design, expensive hardware installations can be almost completely eliminated, and the space required for them is reduced accordingly. On top of this, managed hosting of cloud databases means you no longer need a dedicated administrator to operate, support, and maintain the database, because these tasks are handled by the database operator. Given the tight staffing situation in this area, this consideration is quite significant. Another important advantage is the simple handling of cloud databases, which can be controlled and used over a web interface.
On the other hand, you do need to invest in fast and stable broadband connections when operating cloud databases. The costs for this are usually worthwhile compared with the potential savings as just described. However, some caution is advisable when it comes to the cost of database usage. In many cases, only the standard hyperscaler functions are genuinely inexpensive. High surcharges will often apply for special cases or individual requirements.
The pay-per-use or pay-as-you-go options, in which only the database resources you actually use and consume are billed, are attractive. Of note is that investments in hardware and software in many countries are depreciated over several months or years for tax purposes, whereas subscription costs for cloud databases can generally be claimed directly as tax-reducing operating costs.
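The arithmetic behind this comparison can be sketched in a few lines of Python. All figures below (purchase price, useful life, hourly rate, hours used) are hypothetical, and straight-line depreciation is only an assumption for illustration; the actual rules depend on the tax law of the respective country.

```python
def annual_depreciation(purchase_price: float, useful_life_years: int) -> float:
    """Straight-line depreciation: the same share is written off every year."""
    return purchase_price / useful_life_years


def pay_per_use_cost(hourly_rate: float, hours_used: float) -> float:
    """Only the hours actually consumed are billed."""
    return hourly_rate * hours_used


# Hypothetical on-premises investment, written off over five years.
hardware_price = 30_000.0
useful_life_years = 5

# Hypothetical cloud database billed by the hour.
hourly_rate = 0.50
hours_used_per_year = 4_000

# Amount that counts as an expense in the first year under each model.
first_year_on_prem = annual_depreciation(hardware_price, useful_life_years)
first_year_cloud = pay_per_use_cost(hourly_rate, hours_used_per_year)

# Cash that has left the company in the first year under each model.
cash_out_on_prem = hardware_price
cash_out_cloud = first_year_cloud
```

In this example, the on-premises purchase ties up the full price immediately but only a fifth of it counts as an expense in the first year, whereas the cloud bill is paid and expensed in the same year.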
Migration and Security
Anyone who wants to transfer their data – or even just part of it – to the cloud faces the often underestimated problem of cloud migration. The greater the volume of data to be migrated to the cloud, the higher the costs. It makes sense to clarify in advance what data you still need and what is dispensable or even superfluous and, in turn, a potential security risk. Other security and compliance issues are related to confidential information, such as the personal data of your customers, contractors, and employees.
Data cleansing and analysis are a good idea before migrating to the cloud. Data cleansing is primarily about removing redundant, obsolete, and trivial (ROT) data. The aim of data analysis is to clarify which of the remaining data should be allowed to move to the cloud and which should remain in-house for security reasons or because it contains particularly sensitive intellectual property. In the course of cloud preparation, you might also want to put an end to an undesirable spread of databases in the company. Before moving to the cloud, weigh up the extent to which it makes sense to consolidate databases into a unified database management system (DBMS).
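How a first pass over ROT data might look can be sketched in Python. The record layout, the function name classify_rot, and the thresholds (three years without access counts as obsolete, empty content counts as trivial) are assumptions made for illustration; real cleansing rules depend on your own retention and compliance requirements.

```python
import hashlib
from datetime import date


def classify_rot(records, today, max_age_days=3 * 365, min_size_bytes=1):
    """Split records into names to keep and (name, reason) pairs to review."""
    seen_hashes = set()
    keep, rot = [], []
    for rec in records:
        digest = hashlib.sha256(rec["content"]).hexdigest()
        age_days = (today - rec["last_accessed"]).days
        if digest in seen_hashes:
            # Same content as a record that is already being kept.
            rot.append((rec["name"], "redundant"))
        elif age_days > max_age_days:
            rot.append((rec["name"], "obsolete"))
        elif len(rec["content"]) < min_size_bytes:
            rot.append((rec["name"], "trivial"))
        else:
            keep.append(rec["name"])
            seen_hashes.add(digest)
    return keep, rot


# Hypothetical sample data.
orders = b"id,total\n1,9.99\n"
records = [
    {"name": "orders.csv", "content": orders, "last_accessed": date(2024, 5, 1)},
    {"name": "orders_copy.csv", "content": orders, "last_accessed": date(2024, 5, 2)},
    {"name": "fax_log.txt", "content": b"old fax log", "last_accessed": date(2010, 1, 1)},
    {"name": "empty.tmp", "content": b"", "last_accessed": date(2024, 5, 3)},
]

keep, rot = classify_rot(records, today=date(2024, 6, 1))
```

The sketch only flags candidates. Whether a flagged record is actually deleted, archived, or kept in-house remains a decision for the data analysis step.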
When it comes to security, cloud providers go the extra mile and, in their own interest, ensure a general level of security that is often higher than the individually installed protection mechanisms in private clouds or in-house data centers. Even database-critical issues such as failover and backup are handled by the provider in the case of DBaaS and managed services. Fail-safe operation and high availability of cloud databases are implemented across the availability zones in which the database clusters are formed. To do this, the cloud database needs appropriate replication mechanisms, such as cross-data center replication (XDCR), that constantly replicate between clusters. Alternatively, you can fall back on cloud resources with a redundant design in the background.
One commonly reported problem with cloud databases is speed, which does not primarily depend on the database itself; instead, it is a function of the general performance of the virtualized environment. The bandwidth is high, but generally does not achieve the performance of a bare-metal installation, even in the best of clouds. Latency in cloud databases, which is attributable to the centralized cloud architecture, can be even more critical. All data first needs to be transferred to one or more data centers, which are usually far away, and then returned from there. This process takes more time than accessing or transferring data on internal database servers. The limitation affects real-time applications above all, such as those typical of Internet of Things (IoT) or Industrial Internet of Things (IIoT) scenarios.
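A rough lower bound for this distance-related latency can be estimated in Python. The sketch assumes that signals in optical fiber travel at roughly 200,000 kilometers per second (about two thirds of the speed of light in a vacuum); the distances and the number of queries are hypothetical, and real round-trip times are higher because of routing, switching, and processing delays.

```python
# Approximate signal speed in optical fiber, expressed in km per millisecond.
FIBER_SPEED_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound for one round trip: there and back at fiber speed."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS


def min_total_wait_ms(distance_km: float, sequential_queries: int) -> float:
    """Queries that wait for each other pay the round trip every time."""
    return sequential_queries * min_round_trip_ms(distance_km)


# Hypothetical distances: a server room on campus vs. a remote data center.
on_prem_km = 1.0
cloud_km = 1_500.0

on_prem_rtt = min_round_trip_ms(on_prem_km)
cloud_rtt = min_round_trip_ms(cloud_km)

# A hypothetical workload that issues 100 dependent queries in a row.
cloud_total = min_total_wait_ms(cloud_km, 100)

print(f"on-premises round trip: at least {on_prem_rtt:.2f} ms")
print(f"cloud round trip:       at least {cloud_rtt:.2f} ms")
print(f"100 sequential queries: at least {cloud_total:.0f} ms")
```

Even this best-case estimate shows why real-time workloads suffer: The delay from distance cannot be bought away with more bandwidth, and it multiplies with every query that has to wait for the previous one.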