Many years ago, Professor Andy Tanenbaum wrote the following:
"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."
Hard drives are getting bigger more rapidly than internet connections are getting faster. It is now relatively easy to create a collection of data so large that it cannot be uploaded to offsite storage (e.g. Amazon S3) in a reasonable amount of time. Media files, corporate backups, data collected from scientific experiments, and potential AWS Public Data Sets are now at this point. Our customers in the scientific space routinely create terabyte data sets from individual experiments.
This isn't an issue that can be solved by getting a faster connection; even at the highest reasonable speed, some of these data sets would take weeks or months to upload. For example, it would take over 80 days to upload just 1TB of data over a T1 connection.
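The 80-day figure is easy to check with a little arithmetic. Here is a minimal sketch, assuming a nominal T1 line rate of 1.544 Mbps, 1TB counted as 2^40 bytes, and a link that sustains about 80% of its rated speed. The utilization figure and the function name `upload_days` are assumptions made for illustration; they do not come from the service itself.

```python
SECONDS_PER_DAY = 86_400
T1_BITS_PER_SECOND = 1.544e6  # nominal T1 line rate
ONE_TB = 2 ** 40              # 1TB counted as 2**40 bytes (an assumption)


def upload_days(size_bytes, link_bits_per_second, utilization=1.0):
    """Return the days needed to push size_bytes through a link.

    utilization is the fraction of the rated speed actually achieved;
    the 0.8 used below is an illustrative assumption.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_second * utilization)
    return seconds / SECONDS_PER_DAY


print(f"T1 at 80% utilization: {upload_days(ONE_TB, T1_BITS_PER_SECOND, 0.8):.1f} days")
print(f"T1 at full line rate:  {upload_days(ONE_TB, T1_BITS_PER_SECOND):.1f} days")
```

Under these assumptions the 80% case comes out at a little over 82 days, and even a perfectly saturated T1 needs roughly two months.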
Customers with AWS storage requirements at the terabyte and petabyte level often ask us if they can sidestep the internet and simply send us a disk drive, or even a 747 full of such drives.
I can now say "Yes, you can!" Our new AWS Import/Export service allows you to ship your data to us. This service is now in a limited beta and you can sign up here. We'll take your storage device, load the data into a designated S3 bucket, and send your hardware back to you. The data load takes place in a secure facility with a high-bandwidth, low-latency connection to Amazon S3. Once the data has been loaded into S3, you can process it on EC2, and then store the results anywhere you would like -- back into S3, in SimpleDB, or on EBS volumes.
During the limited beta we are set up to accept devices with a USB 2.0 or eSATA connector, formatted with a FAT32, ext2, ext3, or NTFS file system. We are set up to handle devices that weigh less than 50 pounds and fit within an 8U rack. We are also happy to make special arrangements to accommodate larger and heavier devices. Last week, after a conference talk, one of the attendees asked me "Can we ship you our SAN?" In theory, yes, but we'd need to discuss the specifics beforehand.
It is easy to initiate a data load. Here's what you do:
- Load the data onto your compatible storage device.
- Create a manifest file per our specification. The file must include the name of the target S3 bucket, your AWS Access Key, and a return shipping address. You can also specify content types and S3 Access Control Lists in this file. The newest versions of third-party tools such as Bucket Explorer and S3 Fox can create manifest files for your S3 buckets.
- Email the manifest file to a designated address with the subject CREATE JOB.
- Await a return email with the subject RE: CREATE JOB.
- Extract the JOBID value from the email.
- Use the JOBID, manifest file, and your AWS Secret Access Key to generate a signature file, and place it in the root directory of your storage device.
- Ship your storage device, along with all necessary power and data cables, to an address that we'll provide to you. You can use one of the usual shipping companies or a courier service.
- Await further status emails.
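The manifest and signature steps above can be sketched in a few lines of Python. Treat this as an illustration only: the manifest layout and field names (`bucket`, `accessKeyId`, `returnAddress`), the HMAC-SHA1 signing scheme, and the `SIGNATURE` file name are all assumptions, and the function names are hypothetical. The actual manifest specification and signing procedure are the ones described in the Developer Guide.

```python
import base64
import hashlib
import hmac
from pathlib import Path


def build_manifest(bucket, access_key_id, return_address):
    """Render a manifest as 'key: value' lines (hypothetical layout)."""
    lines = [
        f"bucket: {bucket}",
        f"accessKeyId: {access_key_id}",
        "returnAddress:",
    ]
    lines += [f"    {field}: {value}" for field, value in return_address.items()]
    return "\n".join(lines) + "\n"


def sign_job(job_id, manifest_text, secret_access_key):
    """Return a base64 HMAC-SHA1 over the job ID and manifest.

    HMAC-SHA1 over 'JOBID + newline + manifest' is an assumed scheme,
    chosen only to show how the three inputs combine.
    """
    message = (job_id + "\n" + manifest_text).encode("utf-8")
    key = secret_access_key.encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def write_signature_file(device_root, signature, filename="SIGNATURE"):
    """Write the signature into the root directory of the device.

    The file name is an assumption made for this sketch.
    """
    path = Path(device_root) / filename
    path.write_text(signature + "\n", encoding="ascii")
    return path
```

Note that the Secret Access Key is used only as the signing key. It never appears in the manifest or on the device, which is the point of signing rather than sending the secret itself.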
Once we get the device we'll transport it to our data center and initiate the load process by the end of the next business day. A log file is created as part of the process; it will include the date and time of the load, and the S3 key, MD5 checksum, and size (in bytes) of each object. We'll reject unreadable files and those larger than 5GB and note them in the log. At the end of the process we'll send the device back to you at our expense.
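If you'd like to know ahead of time which files would be rejected, you can run a similar check yourself before you ship. The following is a rough sketch of such a pre-flight scan, not the loader we run: it walks the device, records a key, MD5 checksum, and size for each readable file, and sets aside anything unreadable or over the size limit. The names `scan_device` and `md5_of_file` are hypothetical, and reading "5GB" as 5 x 2^30 bytes is an assumption.

```python
import hashlib
import os

MAX_OBJECT_BYTES = 5 * 2 ** 30  # the 5GB limit, read here as 5 x 2**30 bytes (assumption)


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_device(root, max_bytes=MAX_OBJECT_BYTES):
    """Return (loadable, rejected) lists for the files under root.

    loadable holds (key, md5, size_in_bytes) tuples; rejected holds
    (key, reason) tuples for oversized or unreadable files.
    """
    loadable, rejected = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Use forward slashes so the key looks the same on every platform.
            key = os.path.relpath(path, root).replace(os.sep, "/")
            try:
                size = os.path.getsize(path)
                if size > max_bytes:
                    rejected.append((key, "too large"))
                    continue
                loadable.append((key, md5_of_file(path), size))
            except OSError as error:
                rejected.append((key, f"unreadable: {error}"))
    return loadable, rejected
```

Comparing the checksums from a scan like this against the ones in the log file we send back is a simple way to confirm that every object arrived intact.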
In keeping with our model of charging for resources only as they are consumed, you will pay a fixed fee per device and a variable fee for each hour of data loading. There's no charge for data transfer between AWS Import/Export and an S3 bucket in the United States. Normal S3 Request and Storage charges apply.
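The pricing model above is a fixed part plus a time-based part, so a rough estimate takes only a few lines. This sketch is illustrative: the fee values are parameters that you supply (no rates are given here), the device throughput is something you would have to measure yourself, and rounding partial hours up to whole hours is an assumption. S3 Request and Storage charges are not included.

```python
import math


def load_hours(size_bytes, device_bytes_per_second):
    """Return the hours needed to read size_bytes off the device."""
    if device_bytes_per_second <= 0:
        raise ValueError("device_bytes_per_second must be positive")
    return size_bytes / device_bytes_per_second / 3600


def job_cost(size_bytes, device_bytes_per_second, device_fee, hourly_fee):
    """Return the fixed per-device fee plus the hourly data-loading fee.

    Partial hours are rounded up to whole hours, which is an assumption
    of this sketch rather than a stated billing rule.
    """
    billable_hours = math.ceil(load_hours(size_bytes, device_bytes_per_second))
    return device_fee + hourly_fee * billable_hours
```

For example, `job_cost(size, rate, device_fee, hourly_fee)` with your own measured read rate and the published fees gives a first-order figure to compare against the cost of uploading the same data over your internet connection.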
Right now packages must be shipped from and returned to addresses in the United States. We do expect to be able to accept packages at a location in Europe in the near future.
Let's talk about security for a minute. You can choose to encrypt your files before you send them to us, although we don't support encrypted file systems. We track custody of your device from the time it arrives in our mailroom until it is shipped back to you. All personnel involved in the process have undergone extensive background checks.
Also, as you can probably guess from the name of the service, we have plans to let you transfer large amounts of data out of AWS as well. We will provide further information as soon as possible.
We have also created an Import/Export Calculator.
If you have a large amount of data stored locally and you want to get it into Amazon S3 on a cost-effective and timely basis, you should definitely sign up for the beta now. You can read more about AWS Import/Export in the Developer Guide.