Scale-out with PostgreSQL
Load Up
If you want to respond to higher load with more instances instead of a more powerful server, you typically need stateless applications. Horizontal scaling, whether with multiple servers, virtual machines, Kubernetes pods, or containers, creates the elasticity and resilience that business applications need, especially in the cloud, where every component needs to be replaceable in the event of failure and where elasticity keeps costs down without sacrificing performance.
Hardly any application can do without a database to store data and to ensure consistency, even with concurrent transactions. However, a database is inherently stateful, and that presents a challenge when it comes to scaling. YugabyteDB is a fully open source, cloud-native database that aims for PostgreSQL compatibility, which means running the same applications and tools with identical application behavior while handling complex online transactional processing.
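To get a feel for what compatibility means in practice, consider the minimal sketch below: an unmodified PostgreSQL client library talks to YugabyteDB. It assumes a local YugabyteDB node exposing its YSQL API on the default port 5433 and the default yugabyte user and database; adjust these placeholders for your setup.

```python
# Sketch: a plain PostgreSQL driver connecting to YugabyteDB.
# Assumes a local node with YSQL on port 5433 (YugabyteDB's default)
# and the default "yugabyte" user and database.
import psycopg2

conn = psycopg2.connect(
    host="localhost",
    port=5433,          # YSQL default; vanilla PostgreSQL uses 5432
    user="yugabyte",
    dbname="yugabyte",
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version()")
    print(cur.fetchone()[0])  # reports a PostgreSQL-compatible version
conn.close()
```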
Monolithic RDBMSs
You could ignore the modern challenges of relational database management systems (RDBMSs) if you simply ran them on a single server. This restriction allows the database processes to use the operating system's shared memory, which is protected by latches (spinlocks or mutexes).
Database users often think their database's shared memory pool is only useful as a cache for boosting performance, because it removes the need for I/O access to the data files in many cases. In fact, one of the main reasons for the existence of the shared memory pool is that it serves as a single source of truth when different sessions need read and write access to a given block of data without corrupting the block. A side benefit is that a larger shared memory pool also reduces read and write access to the hard disk for frequently used blocks. PostgreSQL in particular delegates part of this cache function to the Linux kernel, which is why the common recommendation is to keep the shared memory pool relatively small and leave more space for the filesystem cache.
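To make the single-source-of-truth idea concrete, here is a deliberately simplified Python sketch, not PostgreSQL's actual implementation: a toy buffer pool whose latch-like lock guarantees that all sessions see and modify the one authoritative in-memory copy of a block, with disk I/O happening only on a cache miss.

```python
# Toy illustration only (not PostgreSQL internals): a latch serializes
# access so every session works on the single in-memory copy of a block.
import threading

class BufferPool:
    def __init__(self, storage):
        self.storage = storage          # block_id -> bytes on "disk"
        self.cache = {}                 # block_id -> in-memory block
        self.latch = threading.Lock()   # stands in for spinlocks/mutexes

    def read(self, block_id):
        with self.latch:                # serialize access to shared state
            if block_id not in self.cache:   # cache miss: do the I/O
                self.cache[block_id] = self.storage[block_id]
            return self.cache[block_id]

    def write(self, block_id, data):
        with self.latch:                # every writer sees the same copy
            self.cache[block_id] = data # disk write deferred (checkpoint)
```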
This mechanism limits the database to a single server that can share its RAM among all processes. One exception is Oracle's Real Application Clusters (RAC), in which multiple RDBMS instances can open the same database. However, this arrangement mandates complex intracluster synchronization, wherein the current version of a data block is only allowed to exist at a single location and therefore needs to be moved back and forth between physical machines. This global cache and lock management requires a network with exceptionally low latency, such as the InfiniBand network offered by Oracle's Exadata servers. On a conventional network, this setup does not work between different data centers or in the cloud. Additionally, RAC requires block storage that can be accessed by multiple virtual machines at the same time, which is unusual in cloud environments and adds yet another challenge.
Over the years, monolithic RDBMSs have been beefed up with a variety of features to meet business needs and to make development and testing easier. Thanks to atomicity, consistency, isolation, and durability (ACID) guarantees, applications deliver correct results even if race conditions or other issues occur. The key is SQL with its referential integrity (foreign keys) and unique constraints. What impresses most is that applications do not require any additional code. Once an isolation level is established, anomalies that could result from concurrent access are prevented, again without additional code.
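The sketch below illustrates this division of labor. The table and column names are invented for the example, and it reuses the connection settings from the earlier sketch: the database, not the application, rejects duplicate emails and orphaned orders, and a serializable session prevents concurrency anomalies without any extra application logic.

```python
# Sketch: declarative integrity without application code.
# Table/column names are made up; connection settings as in the
# earlier example (YugabyteDB YSQL defaults).
import psycopg2

conn = psycopg2.connect(host="localhost", port=5433,
                        user="yugabyte", dbname="yugabyte")
conn.set_session(isolation_level="SERIALIZABLE")  # anomalies prevented here
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS customers (
            id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
            email text UNIQUE NOT NULL               -- unique constraint
        )""")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS orders (
            id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
            customer_id bigint NOT NULL REFERENCES customers(id)  -- FK
        )""")
    # Violations raise errors inside the database: duplicate emails or
    # orders pointing at nonexistent customers never reach the tables.
conn.close()
```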
Another feature that is easy to implement in a monolithic architecture is the variety of indexes and their automatic maintenance, which means the same database can serve multiple use cases, with extra flexibility as a bonus thanks to logical-physical independence. In short, the application queries the tables without worrying about the access path to the data; the database transparently and autonomously finds the optimal access path through the indexes.
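You can watch this independence at work with EXPLAIN. In the sketch below, which assumes the customers table from the previous example, the query text never mentions an index; the planner finds the index backing the unique constraint on its own.

```python
# Sketch: logical-physical independence in practice. The SQL names no
# index; the planner picks the one created by the UNIQUE constraint.
import psycopg2

conn = psycopg2.connect(host="localhost", port=5433,
                        user="yugabyte", dbname="yugabyte")
with conn, conn.cursor() as cur:
    cur.execute("EXPLAIN SELECT * FROM customers WHERE email = %s",
                ("alice@example.com",))
    for (line,) in cur.fetchall():   # plan shows an index scan rather
        print(line)                  # than a full table scan
conn.close()
```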
Web Scale and NoSQL
Over the years, the pressure to scale across the board has intensified, driven by the Internet, web-based business models, and the cloud. You cannot serve millions of users with a single database running at a single location somewhere in the world. Such a setup is impossible for three reasons.
First, the latency users experience when information travels between different regions around the world runs to an unacceptable hundreds of milliseconds. No technical approach can alleviate this problem because information cannot be transmitted faster than the speed of light. The only option left is to keep the data in different locations close to the users. Storing data at different locations can be a legal requirement, as well.
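A quick back-of-the-envelope calculation shows why. The sketch below uses rough, illustrative distances and the approximate speed of light in optical fiber (about 200,000 km/s) to compute the theoretical best-case round-trip time, before any routing or queuing delays are added.

```python
# Back-of-the-envelope latency check. Distances are rough illustrative
# figures; light in optical fiber travels at about 200,000 km/s.
FIBER_KM_PER_S = 200_000

def rtt_ms(distance_km):
    """Best-case round-trip time in ms, ignoring routing and queuing."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000

print(rtt_ms(8_000))    # ~80 ms:  roughly Frankfurt <-> New York
print(rtt_ms(16_000))   # ~160 ms: roughly Frankfurt <-> Sydney
```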
Second, the network between geographic regions is the Internet, where packets are routed through an infrastructure in which performance bottlenecks and errors can occur at any time. If the database only ran in one place, the application would become unavailable whenever the network went down. All popular RDBMSs implement active-passive replication and provide read replicas that can be used to output reports, even if the primary database is down. However, this is not equivalent to scaling across different nodes and implementing consistent cross-node read and write transactions.
Third, some workloads need high throughput with predictable response times even under peak load (e.g., Black Friday trading). The use of more CPUs and more RAM in a server (scale-up) requires downtime and always has limits. To ensure the availability of the application in all situations, the database needs to run on several servers, the number of which can be increased as needed and decreased later to save money. At the same time, this arrangement allows for updates during operation.
This need to scale out has spawned a new type of database that partitions data into shards, which it then stores on multiple servers. One solution has been to run transactions essentially in the way they were run before the advent of relational databases: with a hierarchical and distributed structure that does not allow for complex transactions or guarantees of consistency and makes do with very simple storage primitives.
NoSQL databases do precisely this. They are similar to the datastores that existed before SQL was invented, but they are distributed across a resilient and elastic network of nodes. NoSQL is great for certain tasks in which the access patterns are known in advance, and some companies have migrated their data to these kinds of databases. In turn, however, the move forces additional code onto the application to replace the lost SQL features. Until recently, though, this was the only way to scale out.
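The core sharding mechanism is easy to sketch. The Python fragment below, with invented names and none of the replication or rebalancing a real system needs, routes each key to one of several independent stores by hashing; single-key reads and writes are trivial, but nothing in the model spans shards.

```python
# Sketch of the core sharding idea: a deterministic hash maps each key
# to one of several independent stores. Names are made up; real systems
# add replication, rebalancing, and failure handling.
import hashlib

SHARDS = [dict() for _ in range(4)]   # four stand-in "servers"

def shard_for(key: str) -> dict:
    digest = hashlib.sha256(key.encode()).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

def put(key, value):
    shard_for(key)[key] = value       # single-key writes are easy ...

def get(key):
    return shard_for(key).get(key)    # ... as are single-key reads

put("user:42", {"name": "Alice"})
print(get("user:42"))
# A transaction or join spanning keys on different shards, however,
# has no natural home in this model.
```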
NewSQL
The need to scale out arose for all kinds of workloads, including complex online transactional processing (OLTP); developers still wanted all the SQL features they had enjoyed in their previous RDBMSs, but now with all the benefits of the NoSQL scale-out architecture. As a result, SQL made a comeback in a form known as NewSQL.
Data warehouses and analytical applications simply used NoSQL's sharding technique to split their data, with each shard its own RDBMS. However, this approach allows no transactions beyond the boundaries of a shard, no global indexes, no global unique constraints, no foreign keys between shards, and no joins between partitions. This arrangement is acceptable for analytical applications, but OLTP applications need global transactions.
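A toy experiment makes the limitation tangible. In the sketch below, SQLite stands in for the per-shard RDBMS and the schema is invented: each shard dutifully enforces its own UNIQUE constraint, yet the duplicate across shards goes completely unnoticed.

```python
# Sketch of the limitation: each shard is its own RDBMS with its own
# constraints, so a UNIQUE column is unique only per shard. SQLite
# stands in for the per-shard RDBMS; the schema is illustrative.
import sqlite3

shards = [sqlite3.connect(":memory:") for _ in range(2)]
for db in shards:
    db.execute("CREATE TABLE users (email TEXT UNIQUE)")

shards[0].execute("INSERT INTO users VALUES ('alice@example.com')")
shards[1].execute("INSERT INTO users VALUES ('alice@example.com')")
# Both inserts succeed: each shard's UNIQUE index is enforced locally,
# but no component ever sees the duplicate across shards.
```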
Distributed SQL databases are, for the time being, the latest development in this field. They provide all the features of SQL, including cross-node transactions, foreign keys, and unique constraints, in a scale-out architecture. Google's Cloud Spanner, CockroachDB, and TiDB belong to this group. Cloud Spanner provides SQL much like PostgreSQL, whereas TiDB resembles MySQL.
Vendor Yugabyte takes things a step further with YugabyteDB by reusing PostgreSQL to provide the full set of SQL features and building the whole thing on a Spanner-style distributed engine. In this way, users get the best of both worlds: You can connect to any node and always have read and write access to a single logical database (Figure 1).
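On the wire this looks like ordinary PostgreSQL. The sketch below assumes a hypothetical three-node cluster with placeholder hostnames, the YSQL defaults used earlier, and the customers table from the previous examples; every node is an equal SQL entry point to the same logical data.

```python
# Sketch: any node of a distributed SQL cluster answers the same query
# against the same logical database. Hostnames are placeholders for a
# hypothetical three-node YugabyteDB cluster.
import psycopg2

for host in ("node1", "node2", "node3"):
    conn = psycopg2.connect(host=host, port=5433,
                            user="yugabyte", dbname="yugabyte")
    with conn, conn.cursor() as cur:
        cur.execute("SELECT count(*) FROM customers")  # same data everywhere
        print(host, cur.fetchone()[0])
    conn.close()
```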