Lead Image © zelfit, 123RF.com

MinIO: Amazon S3 competition

Premium Storage

Article from ADMIN 61/2021
MinIO promises no less than a local object store with a world-class S3 interface and features that even the original lacks.

The Amazon Simple Storage Service (S3) protocol has seen astonishing development in recent years. Originally, Amazon regarded S3 merely as a means of storing arbitrary files online with a standardized protocol; today, it plays an important role as a central service in the Amazon tool world. Little wonder that many look-alikes have cropped up. The free Ceph object store, for example, has offered an alternative for years by way of the Ceph Object Gateway, which can handle both the OpenStack Swift protocol and Amazon S3.

MinIO is now following suit: It promises a local S3 instance that is largely compatible with the Amazon S3 implementation, and it even claims to offer functions not found in the original. The provider, MinIO Inc. [1], is not sparing with eloquent statements such as "world-leading" or even "industry standard." Moreover, the product is completely open source, which is reason enough to investigate it. How does it work under the hood? What functions does it offer? How does it position itself compared with similar solutions? And what does Amazon have to say?

How MinIO Works

MinIO is available under the free Apache license. You can download the product directly from the vendor's GitHub repository [2]. MinIO is written entirely in Go, which keeps the number of dependencies to be resolved to a minimum.

MinIO Inc. itself also offers several other options to help admins install MinIO on their systems [3]. The ready-to-use Docker container, for example, is particularly practical for getting started in very little time. However, if you want to do without containers, you will also find an installation script on the provider's website that downloads and launches the required programs.
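If you want to convince yourself that a freshly launched instance really does speak S3, a few lines of Go with the official minio-go client are enough. The following is a minimal sketch that assumes a local test installation listening on localhost:9000 with MinIO's default minioadmin access credentials; adjust the endpoint and keys to match your setup:

    package main

    import (
    	"context"
    	"log"

    	"github.com/minio/minio-go/v7"
    	"github.com/minio/minio-go/v7/pkg/credentials"
    )

    func main() {
    	// Connect to the local MinIO instance over plain HTTP.
    	// Endpoint and credentials are the defaults of a fresh
    	// test installation.
    	client, err := minio.New("localhost:9000", &minio.Options{
    		Creds:  credentials.NewStaticV4("minioadmin", "minioadmin", ""),
    		Secure: false,
    	})
    	if err != nil {
    		log.Fatal(err)
    	}

    	ctx := context.Background()

    	// Create a bucket and upload a local file as an object.
    	if err := client.MakeBucket(ctx, "testbucket", minio.MakeBucketOptions{}); err != nil {
    		log.Fatal(err)
    	}
    	info, err := client.FPutObject(ctx, "testbucket", "hello.txt",
    		"/tmp/hello.txt", minio.PutObjectOptions{ContentType: "text/plain"})
    	if err != nil {
    		log.Fatal(err)
    	}
    	log.Printf("uploaded %s (%d bytes)", info.Key, info.Size)
    }

Because the client speaks plain S3, pointing the same code at an Amazon endpoint requires nothing more than changing the endpoint and the credentials.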

What looks simple and clear-cut from the outside comprises several layers under the hood. You can't see them because MinIO comes as a single big Go binary, but it makes sense to dig down into the individual layers.

Three Layers

Internally, MinIO is organized as an arbitrary number of nodes running MinIO services, divided into three layers. The lowest is the storage layer: Even an object store needs access to physical disk space, so MinIO requires block storage devices for its data, and it is the admin's job to provide them. Like other solutions (e.g., Ceph), MinIO takes care of the redundancy setup itself (Figure 1).

Figure 1: The MinIO architecture is distributed and implicitly redundant. It also supports encryption at rest. ©MinIO [4]

The object store sits on top of the storage layer. MinIO treats every file uploaded to the store as a binary object. Incoming binary objects, whether from the client side or from other MinIO instances in the same cluster, first end up in a cache, where a decision is made about what happens to them. MinIO then passes them on to the storage layer in compressed or encrypted form.
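Whether encryption is applied can be requested per upload. As a minimal sketch, assuming the client from the previous example and a MinIO server with a key management service configured, server-side encryption with S3-managed keys (SSE-S3) can be requested through minio-go's encrypt package:

    package miniodemo

    import (
    	"context"
    	"os"

    	"github.com/minio/minio-go/v7"
    	"github.com/minio/minio-go/v7/pkg/encrypt"
    )

    // uploadEncrypted stores a local file in MinIO and asks the
    // server to encrypt it at rest with an S3-managed key (SSE-S3).
    func uploadEncrypted(client *minio.Client, bucket, object, path string) error {
    	f, err := os.Open(path)
    	if err != nil {
    		return err
    	}
    	defer f.Close()

    	stat, err := f.Stat()
    	if err != nil {
    		return err
    	}

    	// encrypt.NewSSE() requests SSE-S3; the server handles
    	// the key material transparently.
    	_, err = client.PutObject(context.Background(), bucket, object,
    		f, stat.Size(),
    		minio.PutObjectOptions{ServerSideEncryption: encrypt.NewSSE()})
    	return err
    }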

Implicit Redundancy

In a distributed solution like MinIO, redundancy at the object store level is indispensable these days: Hard drives are by far the most fragile components in IT, and SSDs are hardly better. Contrary to popular opinion, they fail more often than hard drives, but at least they do so more predictably.

To cope with dying block storage devices, MinIO implements internal replication between the instances of a MinIO installation; the object layers speak their own protocol over a RESTful API. Notably, MinIO uses erasure coding instead of the classic one-to-one replication that Ceph uses. This approach has advantages and disadvantages: In one-to-one replication, the disk space required for the replicas of an object is dictated by the object size. If you operate a cluster with a total of three replicas, each object exists three times, and the net capacity of the system is reduced by two thirds. This principle has basically remained the same since RAID 1, where only a single additional copy of the data is kept.

Erasure coding works differently: It distributes parity data for the individual block devices across all the other block devices. The devices do not contain the complete binary objects, but the objects can be reconstructed from the parity data at any time, which reduces the amount of storage space required in the system. For every 100TB of user data, almost 40TB of parity data is required, whereas 300TB of disk space would be needed for classic full replication with three replicas.

On the other side of the coin, when resynchronization becomes necessary (i.e., if single block storage devices or whole nodes fail), considerable computational effort is needed to restore the objects from the parity data. Whereas classic full replication only copies objects back and forth during a resync, the individual MinIO nodes need a while to catch up with erasure coding. If you use MinIO, make sure the environment has correspondingly powerful CPUs.
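To get a feel for how this works, you can experiment with the Reed-Solomon library on which MinIO's erasure code builds (github.com/klauspost/reedsolomon). The following sketch, with arbitrarily chosen shard counts and payload, splits a buffer into 10 data and 4 parity shards (the roughly 40 percent overhead mentioned above), deletes two shards, and reconstructs them from the rest:

    package main

    import (
    	"bytes"
    	"log"

    	"github.com/klauspost/reedsolomon"
    )

    func main() {
    	// 10 data shards plus 4 parity shards: 40% overhead instead
    	// of the 200% that three-way replication would add.
    	enc, err := reedsolomon.New(10, 4)
    	if err != nil {
    		log.Fatal(err)
    	}

    	data := bytes.Repeat([]byte("minio"), 2000) // arbitrary payload

    	// Split the payload into 10 data shards, then compute the
    	// 4 parity shards from them.
    	shards, err := enc.Split(data)
    	if err != nil {
    		log.Fatal(err)
    	}
    	if err := enc.Encode(shards); err != nil {
    		log.Fatal(err)
    	}

    	// Simulate two failed disks: any 10 of the 14 shards suffice.
    	shards[0], shards[13] = nil, nil

    	// Reconstruction recomputes the missing shards from parity.
    	if err := enc.Reconstruct(shards); err != nil {
    		log.Fatal(err)
    	}

    	ok, err := enc.Verify(shards)
    	if err != nil {
    		log.Fatal(err)
    	}
    	log.Printf("all shards consistent after rebuild: %v", ok)
    }

The Reconstruct() step is exactly where the CPU cycles go during a resync, which is why the hardware recommendation above matters.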
