Amazon Glacier is designed for storing data that is infrequently accessed. Once you have stored your data, you can retrieve up to 5% of it (prorated daily) each month at no charge.
Today we are making it easier for you to remain within the 5% retrieval band by introducing Range Retrievals. You can use this new feature to fetch only data you need from a larger file or to spread the retrieval of a large archive over a longer period of time.
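To get a feel for the numbers, here is a rough sketch of the allowance arithmetic. It assumes a 30-day month and that the 5% allowance is divided evenly across the days. The function names are made up for illustration, and the pricing page remains the authority on how proration actually works.

```python
import math

FREE_FRACTION = 0.05  # the 5% monthly allowance mentioned above


def daily_free_allowance_gb(stored_gb, days_in_month=30):
    """Free retrieval per day, assuming an even daily proration (assumption)."""
    return stored_gb * FREE_FRACTION / days_in_month


def days_to_retrieve_free(archive_gb, stored_gb, days_in_month=30):
    """Days needed to pull one archive without exceeding the daily allowance."""
    per_day = daily_free_allowance_gb(stored_gb, days_in_month)
    if per_day <= 0:
        raise ValueError("nothing stored, so there is no free allowance")
    return math.ceil(archive_gb / per_day)


if __name__ == "__main__":
    stored = 12 * 1024  # 12 TB expressed in GB
    print(daily_free_allowance_gb(stored))
    print(days_to_retrieve_free(100, stored))
```

Under these assumptions, spreading a large archive over several days of range retrievals is what keeps each day's volume inside the free band.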
Range Retrieval
Glacier's existing archive retrieval function now accepts an optional RetrievalByteRange parameter. If you don't provide this parameter, Glacier will retrieve the entire archive.
If you choose to provide this parameter, it must be in the form StartByte-EndByte. The value provided for StartByte must be megabyte aligned (a multiple of 1,048,576). The value provided for EndByte + 1 must be megabyte aligned if you are retrieving data from somewhere within the archive. If you want to retrieve data from StartByte up to the end of the archive, simply specify an EndByte value that is one less than the archive size.
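The alignment rules can be checked before a job is submitted. The following is a minimal sketch, not part of any SDK: `retrieval_byte_range` and `split_into_ranges` are hypothetical helper names, and the archive size is assumed to be known to the caller.

```python
MIB = 1024 * 1024  # 1,048,576 bytes


def retrieval_byte_range(start, end, archive_size):
    """Return 'StartByte-EndByte' after checking the alignment rules above."""
    if not 0 <= start <= end < archive_size:
        raise ValueError("range must lie inside the archive")
    if start % MIB != 0:
        raise ValueError("StartByte must be a multiple of 1,048,576")
    # EndByte + 1 must be aligned unless the range runs to the end of the archive.
    if end != archive_size - 1 and (end + 1) % MIB != 0:
        raise ValueError("EndByte + 1 must be a multiple of 1,048,576")
    return f"{start}-{end}"


def split_into_ranges(archive_size, max_bytes):
    """Split a whole archive into consecutive megabyte-aligned ranges."""
    step = (max_bytes // MIB) * MIB  # round the chunk size down to whole megabytes
    if step == 0:
        raise ValueError("max_bytes must be at least one megabyte")
    return [
        retrieval_byte_range(s, min(s + step, archive_size) - 1, archive_size)
        for s in range(0, archive_size, step)
    ]
```

For an archive of 2 MB plus 100 bytes, `split_into_ranges(2 * MIB + 100, MIB)` yields three ranges, the last of which ends at the final byte of the archive. These helpers enforce megabyte alignment only.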
When you upload data to Glacier, you must also compute and supply a tree hash. Glacier checks the hash against the data to ensure that it has not been altered en route. A tree hash is generated by computing a hash for each megabyte-sized segment of the data, and then combining the hashes in tree fashion to represent ever-growing adjacent segments of the data.
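As an illustration of the tree structure, here is a small sketch that hashes an in-memory byte string. It assumes SHA-256 as the hash function, which the text above does not name, and it carries an unpaired node up to the next level unchanged. Treat it as a teaching aid rather than a replacement for the SDK's own checksum helpers.

```python
import hashlib

MIB = 1024 * 1024


def tree_hash(data: bytes) -> str:
    """Hex tree hash of `data`, using SHA-256 over 1 MiB segments (assumed)."""
    # Leaf level: one digest per megabyte-sized segment (empty input gets one leaf).
    level = [hashlib.sha256(data[i:i + MIB]).digest()
             for i in range(0, len(data), MIB)] or [hashlib.sha256(b"").digest()]
    # Combine adjacent pairs until a single root remains.
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                nxt.append(level[i])  # odd node is carried up unchanged
        level = nxt
    return level[0].hex()
```

For data of one megabyte or less, the tree has a single leaf, so the result is simply the plain hash of the data.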
If you would like to use tree hashes to confirm the integrity of the data that you download from Glacier (and you definitely should), then the range that you specify must also be tree-hash aligned. In other words, a tree hash must exist (at some level of the tree of hashes) for the exact range of bytes retrieved. If you specify such a range, Glacier will provide you with the corresponding tree hash when the retrieval job completes.
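One way to reason about tree-hash alignment is to ask whether the requested range matches exactly one node of the tree. The sketch below derives that check from the tree structure described above (pairs of adjacent segments, doubling at each level, with the last node at each level allowed to be short). The function name is hypothetical, and the derivation is an assumption to verify against the Glacier documentation before relying on it.

```python
MIB = 1024 * 1024


def is_tree_hash_aligned(start, end, archive_size):
    """True if bytes start..end (inclusive) match one node of the hash tree."""
    if not 0 <= start <= end < archive_size or start % MIB != 0:
        return False
    if end != archive_size - 1 and (end + 1) % MIB != 0:
        return False
    total = -(-archive_size // MIB)   # number of leaf segments, rounded up
    first = start // MIB              # index of the first segment in the range
    last = -(-(end + 1) // MIB)       # one past the last segment in the range
    size = 1                          # segments covered by a full node at this level
    while True:
        # A node at this level starts on a multiple of `size` and covers
        # `size` segments, or fewer if it is the last node of the level.
        if first % size == 0 and last == min(first + size, total):
            return True
        if size >= total:
            return False
        size *= 2
```

For a 5 MB archive, the first 2 MB and the first 4 MB are aligned under this model, while the first 3 MB and the span from 1 MB to 3 MB are not, even though both are megabyte aligned.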
This new feature is available now and you can start using it today. The AWS SDK for Java and the AWS SDK for .Net have been updated and now include support for Range Retrievals.
AWS provides you with a number of data storage options. Today I would like to focus on Amazon S3 and Amazon Glacier and a new and powerful way for you to use both of them together.
Both services offer dependable and highly durable storage for the Internet. Amazon S3 was designed for rapid retrieval. Glacier, in contrast, trades off retrieval time for cost, providing storage for as little as $0.01 per gigabyte per month while retrieving data within three to five hours.
How would you like to have the best of both worlds? How about rapid retrieval of fresh data stored in S3, with automatic, policy-driven archiving to lower cost Glacier storage as your data ages, along with easy, API-driven or console-powered retrieval?
Sound good? Awesome, because that's what we have! You can now use Amazon Glacier as a storage option for Amazon S3.
There are four aspects to this feature -- storage, archiving, listing, and retrieval. Let's look at each one in turn.
Storage
First, you need to tell S3 which objects are to be archived to the new Glacier storage option, and under what conditions. You do this by setting up a lifecycle rule using the following elements:
A prefix to specify which objects in the bucket are subject to the policy.
A relative or absolute time specifier and a time period for transitioning objects to Glacier. The time periods are interpreted with respect to the object's creation date. They can be relative (migrate items that are older than a certain number of days) or absolute (migrate items on a specific date).
An object age at which the object will be deleted from S3. This is measured from the original PUT of the object into the service, and the clock is not reset by a transition to Glacier.
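The three elements above map naturally onto a small configuration structure. The sketch below builds one as a plain Python dictionary. The key names follow the shape accepted by boto3's `put_bucket_lifecycle_configuration`, an SDK call that is not mentioned in this post and is assumed here. The helper name, rule ID, and prefix are hypothetical.

```python
def glacier_lifecycle_rule(rule_id, prefix, transition_days, expiration_days=None):
    """Build one lifecycle rule: prefix, days until archiving, optional deletion age."""
    if expiration_days is not None and expiration_days <= transition_days:
        raise ValueError("objects would be deleted before they are archived")
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},          # which objects the rule covers
        "Status": "Enabled",
        # Relative form: archive once the object is this many days old.
        "Transitions": [{"Days": transition_days, "StorageClass": "GLACIER"}],
    }
    if expiration_days is not None:
        # Deletion age, counted from the original PUT.
        rule["Expiration"] = {"Days": expiration_days}
    return {"Rules": [rule]}


if __name__ == "__main__":
    config = glacier_lifecycle_rule("archive-logs", "logs/", 30, 365)
    print(config)
    # With boto3 installed and credentials configured (assumption), this could be applied with:
    # boto3.client("s3").put_bucket_lifecycle_configuration(
    #     Bucket="my-bucket", LifecycleConfiguration=config)
```

An absolute rule would use a date in place of the day count in the transition entry.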
You can create a lifecycle rule in the AWS Management Console.
Archiving
Every day, S3 will evaluate the lifecycle policies for each of your buckets and will archive objects in Glacier as appropriate. After the object has been successfully archived using the Glacier storage option, the object's data will be removed from S3 but its index entry will remain as-is. The S3 storage class of an object that has been archived in Glacier will be set to GLACIER.
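The daily evaluation can be pictured as a simple age check per object. This is a toy model for reasoning about a rule with relative time periods, not a description of how S3 implements it. The function name is hypothetical, and the exact boundary (whether an object moves on day N or the day after) is an assumption.

```python
from datetime import date


def lifecycle_action(created, today, transition_days, expiration_days=None):
    """Return 'delete', 'archive', or 'keep' for one object on a given day."""
    age = (today - created).days  # age is always measured from the original PUT
    if expiration_days is not None and age >= expiration_days:
        return "delete"
    if age >= transition_days:
        return "archive"
    return "keep"


if __name__ == "__main__":
    put_date = date(2012, 1, 1)
    for day in (date(2012, 1, 15), date(2012, 2, 15), date(2013, 1, 5)):
        print(day, lifecycle_action(put_date, day, 30, 365))
```

Because the age is computed from the creation date alone, archiving an object does not push back its deletion date.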
Listing
As with Amazon S3's other storage options, all S3 objects that are stored using the Glacier option have an associated user-defined name. You can get a real-time list of all of your S3 object names, including those stored using the Glacier option, by using S3's LIST API. If you list a bucket that contains objects that have been archived in Glacier, what will you see?
As I mentioned above, each S3 object has an associated storage class. There are three possible values:
STANDARD - 99.999999999% durability. S3's default storage option.
REDUCED_REDUNDANCY - 99.99% durability. S3's Reduced Redundancy Storage (RRS) option.
GLACIER - 99.999999999% durability. Object archived using the Glacier storage option.
If you archive objects using the Glacier storage option, you must inspect the storage class of an object before you attempt to retrieve it. The customary GET request will work as expected if the object is stored in S3 Standard or Reduced Redundancy (RRS) storage. It will fail (with a 403 error) if the object is archived in Glacier. In this case, you must use the RESTORE operation (described below) to make your data available in S3.
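The check described above amounts to a small dispatch on the storage class. In this sketch, a listing entry is assumed to be a dictionary with a `StorageClass` key, as SDK listings commonly return. The function name is hypothetical, and the sketch does not track whether a restore has already completed.

```python
DIRECT_GET_CLASSES = {"STANDARD", "REDUCED_REDUNDANCY"}


def access_plan(entry):
    """Decide how to read an object from its listing entry (a dict)."""
    # Entries without an explicit class are treated as STANDARD (assumption).
    storage_class = entry.get("StorageClass", "STANDARD")
    if storage_class in DIRECT_GET_CLASSES:
        return "GET"
    if storage_class == "GLACIER":
        return "RESTORE"  # a plain GET would fail until the object is restored
    raise ValueError(f"unexpected storage class: {storage_class}")
```

Calling `access_plan` on each listing entry before issuing a GET avoids the 403 error for archived objects.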
Retrieval
You use S3's new RESTORE operation to access an object archived in Glacier. As part of the request, you need to specify a retention period in days. Restoring an object will generally take 3 to 5 hours. Your restored object will remain in both Glacier and S3's Reduced Redundancy Storage (RRS) for the duration of the retention period. At the end of the retention period the object's data will be removed from S3; the object will remain in Glacier.
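A restore request needs little more than the object's location and the retention period. The sketch below assembles the keyword arguments in the shape that boto3's `restore_object` call accepts. Using boto3 is an assumption (the post does not mention it), and the helper name, bucket, and key are hypothetical.

```python
def restore_request(bucket, key, retention_days):
    """Arguments for a RESTORE call that keeps the copy for `retention_days`."""
    if not isinstance(retention_days, int) or retention_days < 1:
        raise ValueError("retention period must be a whole number of days >= 1")
    return {"Bucket": bucket, "Key": key, "RestoreRequest": {"Days": retention_days}}


if __name__ == "__main__":
    request = restore_request("my-bucket", "scans/0001.dcm", 7)
    print(request)
    # With boto3 installed and credentials configured (assumption):
    # boto3.client("s3").restore_object(**request)
```

The call only starts the restore, so the object still has to be fetched with a normal GET once the restore has finished.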
Although the objects are archived in Glacier, you can't get to them via the Glacier APIs. Objects stored directly in Amazon Glacier using the Amazon Glacier API cannot be listed in real-time, and have a system-generated identifier rather than a user-defined name. Because Amazon S3 maintains the mapping between your user-defined object name and the Amazon Glacier system-defined identifier, Amazon S3 objects that are stored using the Amazon Glacier option are only accessible through the Amazon S3 API or the Amazon S3 Management Console.
Archiving in Action
We expect to see Amazon Glacier storage put to use in a variety of different ways. Toshiba's Cloud & Solutions Division will be using it to store medical imaging. Tetsuro Muranaga, Chief Technology Executive of the division, is very excited about it. Here's what he told us:
We currently provide a service enabling medical institutions to securely store patients’ medical images in Japan. We are excited about using Amazon Glacier through Amazon S3 to affordably and cost-effectively archive these images in large volumes for each of our customers. We will combine Toshiba’s cloud computing technology with Amazon Glacier’s low costs and Amazon S3’s lifecycle policies to provide a unique offering tailored to the needs of medical institutions. In addition, we expect we can build similarly tailored integrated solutions for our wide range of customers so that they can archive massive amounts of data in various business areas.
Pricing
You will pay standard Glacier pricing for data stored using S3's new Glacier storage option.
For today's episode of The AWS Report, I spoke to Colin Lazier, a Senior Development Manager on the AWS Storage Team. Colin and I talked about Amazon Glacier and how it can be used to archive data for long periods of time. I learned that Glacier uses anti-entropy techniques to guard against data loss.
We also talked about Glacier's retrieval model, and our expectation that third parties will build archiving and indexing tools around Glacier's storage and retrieval functions.