New storage classes for Amazon S3
Class Society
S3 Intelligent-Tiering
The new Intelligent-Tiering storage class is designed primarily to optimize costs. AWS continuously analyzes objects for access patterns and automatically moves them to the most cost-effective access tier. The two tiers involved in Intelligent-Tiering are Standard and Standard-IA. As you may know, storage is cheaper in Standard-IA, but retrieval is more expensive. Although objects can be retrieved at any time with the same access time and latency, AWS pricing for this storage class assumes that objects are rarely read after the initial write.
For this automation, however, AWS charges an additional monthly monitoring and automation fee per object. Specifically, S3 Intelligent-Tiering monitors the access patterns of objects and moves any object that has not been accessed for 30 consecutive days to Standard-IA. If an object in Standard-IA is accessed, AWS automatically moves it back to S3 Standard. There are no retrieval fees with S3 Intelligent-Tiering, and no additional tiering fees are charged when objects move between access tiers. This makes the class particularly suitable for long-lived data with initially unknown or unpredictable access patterns.
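A back-of-the-envelope calculation shows when the monitoring fee pays off. The prices below are illustrative assumptions in the ballpark of the US East region at the time of writing, not a quote; always check the current AWS price list:

```python
# Illustrative prices (assumed, roughly us-east-1) -- check the AWS price list.
STANDARD = 0.023             # $/GB-month in S3 Standard
STANDARD_IA = 0.0125         # $/GB-month in the Standard-IA tier
MONITORING = 0.0025 / 1000   # $/object-month Intelligent-Tiering monitoring fee

def monthly_cost_intelligent_tiering(gb_in_standard, gb_in_ia, num_objects):
    """Cost of Intelligent-Tiering: storage at each tier's rate plus monitoring."""
    return (gb_in_standard * STANDARD
            + gb_in_ia * STANDARD_IA
            + num_objects * MONITORING)

# 1TB of rarely read data in 10,000 objects: after 30 days everything sits
# in the Standard-IA tier, so Intelligent-Tiering ends up far cheaper than
# leaving the data in S3 Standard.
cold = monthly_cost_intelligent_tiering(0, 1024, 10_000)
hot = 1024 * STANDARD
print(f"Intelligent-Tiering: ${cold:.2f}/month, all-Standard: ${hot:.2f}/month")
```

With many small objects, on the other hand, the per-object monitoring fee can eat up the savings, which is why AWS recommends the class for larger, long-lived objects.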
Assigning Storage Classes
S3 storage classes are generally configured and applied at the object level, so the same bucket can contain objects stored in S3 Standard, S3 Standard-IA, S3 Intelligent-Tiering, or S3 One Zone-IA. Glacier Deep Archive, on the other hand, is a service in its own right. Users can upload objects to the storage class of their choice at any time or use S3 lifecycle policies to transition objects from S3 Standard and S3 Standard-IA to S3 Intelligent-Tiering. For example, when uploading a new object to a bucket through the S3 GUI, users can simply select the desired storage class with the mouse (Figure 1).
When uploading from the CLI, the storage class is passed in the --storage-class parameter. The permitted values are STANDARD, REDUCED_REDUNDANCY, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, GLACIER, and DEEP_ARCHIVE; for example:
aws s3 cp s3://mybucket/Test.txt s3://mybucket2/ --storage-class STANDARD_IA
The same idea applies when using the REST API. Remember that Amazon S3 is a REST service. Users can send requests to Amazon S3 either directly through the REST API or, to simplify programming, by way of wrapper libraries for the respective AWS SDKs that encapsulate the underlying Amazon S3 REST API.
Therefore, users can send REST requests either in the context of the desired SDK or directly. If the storage class is not specified explicitly, Amazon S3 stores newly created objects in the default storage class, S3 Standard.
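Under the hood, both the CLI and the SDKs simply set the x-amz-storage-class header on the PUT Object request. A minimal sketch of such a request follows; the bucket, key, and region are made-up examples, and a real request additionally needs AWS Signature v4 authentication headers:

```python
# Sketch of a raw S3 PUT Object request that selects a storage class.
# Bucket, key, and region are hypothetical; real requests also require
# AWS Signature v4 headers (Authorization, x-amz-date, and so on).
def build_put_request(bucket, key, region, storage_class):
    host = f"{bucket}.s3.{region}.amazonaws.com"
    request_line = f"PUT /{key} HTTP/1.1"
    headers = {
        "Host": host,
        # Omitting this header makes S3 fall back to the default (STANDARD).
        "x-amz-storage-class": storage_class,
    }
    return request_line, headers

line, hdrs = build_put_request("mybucket", "Test.txt", "us-east-1", "STANDARD_IA")
print(line)
print(hdrs["x-amz-storage-class"])
```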
Listing 1 shows an example of updating the storage class of an existing object in Java. An example of uploading an object in Python 3 (with the Boto3 SDK) to the infrequently accessed storage class (Standard-IA) looks like this:
Listing 1
Updating the Storage Class
AmazonS3Client s3Client = (AmazonS3Client) AmazonS3ClientBuilder.standard()
    .withRegion(clientRegion)
    .withCredentials(new ProfileCredentialsProvider())
    .build();
CopyObjectRequest copyRequest = new CopyObjectRequest(sourceBucketName, sourceKey, destinationBucketName, destinationKey)
    .withStorageClass(StorageClass.ReducedRedundancy);
s3Client.copyObject(copyRequest);
import boto3
s3 = boto3.resource('s3')
s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'), StorageClass='STANDARD_IA')
Cheap Storage with Glacier Deep Archive
Of the six storage classes mentioned, only four can be queried directly at any time, because the Glacier and Glacier Deep Archive classes do not map onto the S3 service with its concept of buckets and objects; instead, they are implemented by Amazon's Glacier archive service, which offers the same durability. However, retrieval times are configurable between a few minutes and several hours; immediate retrieval is not possible. Instead, users post a retrieval job by way of the API, which eventually returns an archive from a vault. Like S3, the Glacier API is natively supported by numerous third-party applications.
However, one special feature of S3 and Glacier is that the archive service can also be controlled by a lifecycle policy from the S3 API (Figure 2). Glacier can therefore be addressed either with the Glacier API or the S3 API. In the context of S3 lifecycle policies, for example, it has long been possible to take documents that are no longer read after a certain time but must be retained because of corporate compliance guidelines and transfer them to Glacier once the desired retention period in S3 Standard-IA has expired.
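Such a lifecycle policy is just a small configuration document. The sketch below shows the documented rule structure; the prefix and day counts are made-up examples, and the resulting dictionary could be handed to Boto3's put_bucket_lifecycle_configuration() or sent over the REST API:

```python
import json

# Example lifecycle rule (prefix and day counts are hypothetical): park
# objects in Standard-IA after 30 days and move them to Glacier after a
# year. This structure follows the S3 lifecycle configuration format.
lifecycle = {
    "Rules": [
        {
            "ID": "archive-compliance-docs",
            "Filter": {"Prefix": "compliance/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
        }
    ]
}
print(json.dumps(lifecycle, indent=2))
```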
Glacier Deep Archive has only been available as a storage class for S3 since early 2019. Since then, users have been able to archive objects from S3 Standard, S3 Standard-IA, or S3 Intelligent-Tiering not only in S3 Glacier, but also in Glacier Deep Archive. Although storage in Glacier, at less than half a cent per gigabyte, is already three times cheaper than in S3 Standard, the price in Glacier Deep Archive drops again to $0.00099/GB per month.
This pricing should make Glacier Deep Archive the preferred storage class for all companies needing to maintain persistent archive copies of data that virtually never needs to be accessed. It could also make local tape libraries superfluous for many users. The low archive price comes at the price of extended access times: Accessing data stored in Glacier Deep Archive requires a delivery time of about 12 hours, compared with a few minutes (Expedited) or up to five hours (Standard) in Glacier.
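Before such an object can be read through the S3 API again, a restore job must be posted, and the chosen retrieval tier trades speed against cost. The sketch below builds the RestoreRequest structure that, for example, Boto3's restore_object() call expects; the day count and tier are example values:

```python
# Sketch of the RestoreRequest structure used by the S3 restore-object
# operation (e.g., Boto3's restore_object()); values here are examples.
def build_restore_request(days, tier):
    """Keep the restored copy available for `days` days; `tier` selects the
    retrieval speed: 'Expedited' (Glacier only), 'Standard', or 'Bulk'."""
    allowed = {"Expedited", "Standard", "Bulk"}
    if tier not in allowed:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}

# Deep Archive supports the Standard (roughly 12-hour) and Bulk tiers.
req = build_restore_request(7, "Standard")
print(req)
```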