Many years ago, professor
Andy Tanenbaum
wrote the following:
Never underestimate the bandwidth of a station wagon full of tapes
hurtling down the highway.
Since station wagons and tapes are both on the verge of obsolescence,
others have updated this nugget of wisdom to reference DVDs and
Boeing 747s.
Hard drives are getting bigger more rapidly than
internet connections are getting faster. It is now relatively easy
to create a collection of data so large that it cannot be uploaded to offsite
storage (e.g. Amazon S3) in
a reasonable amount of time. Media files, corporate backups,
data collected from scientific experiments, and potential AWS
Public Data Sets
are now at this point. Our customers in the scientific space
routinely create terabyte data sets from individual experiments.
This isn't an issue that can be solved
by getting a faster connection; even at the highest reasonable
speed, some of these data sets would take weeks or months to
upload. For example, it would take over 80 days to upload just
1TB of data over a T1 connection.
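To see where a figure like this comes from, here is a back-of-the-envelope sketch. It assumes a decimal terabyte (10^12 bytes) and the nominal T1 line rate of 1.544 Mbps. The 70% efficiency value is purely an illustrative assumption standing in for protocol overhead and shared use of the link; it is not a measured number.

```python
def upload_days(size_bytes: float, link_bits_per_sec: float, efficiency: float = 1.0) -> float:
    """Return the days needed to send size_bytes over a link.

    efficiency is the fraction of the nominal line rate actually achieved.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = (size_bytes * 8) / (link_bits_per_sec * efficiency)
    return seconds / 86_400  # seconds per day


TB = 10**12        # decimal terabyte (assumption)
T1_BPS = 1.544e6   # nominal T1 line rate in bits per second

ideal = upload_days(TB, T1_BPS)
realistic = upload_days(TB, T1_BPS, efficiency=0.7)  # 0.7 is an assumed value

print(f"{ideal:.1f} days at full line rate")
print(f"{realistic:.1f} days at 70% of line rate")
```

At the full line rate the transfer takes roughly 60 days, and it passes the 80-day mark once effective throughput drops below about 75% of the line rate.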
Customers with AWS storage requirements at the terabyte and petabyte level
often ask us if they can sidestep the internet and simply send us
a disk drive, or even a 747 full of such drives.
I can now say "Yes, you can!" Our new
AWS Import/Export
service allows you to ship your data to us.
This service is now
in a limited beta and you can sign up
here.
We'll
take your storage device, load the data into a designated S3 bucket,
and send your hardware back to you. The data load takes
place in a secure facility with a high-bandwidth,
low-latency connection to
Amazon S3. Once the data
has been loaded into S3, you can process it on EC2,
and then store the results anywhere you would like --
back into S3, in
SimpleDB,
or on
EBS
volumes.
During the limited beta we are set up to accept devices with
a USB 2.0 or eSATA connector, formatted with a
FAT32,
ext2,
ext3, or
NTFS file system.
We are set up to handle devices that
weigh less than 50 pounds and fit within an
8U rack.
We are also happy to make special arrangements to accommodate
larger and heavier devices. Last week after a conference talk one
of the attendees asked me "Can we ship you our SAN?" In theory, yes,
but we'd need to discuss the specifics beforehand.
It is easy to initiate a data load. Here's what you do:
- Load the data onto your compatible storage device.
- Create a manifest file per our specification. The file must
include the name of the target S3 bucket, your AWS Access Key,
and a return shipping address. You can also specify content
types and S3 Access Control Lists in this file. You can
also use the newest versions of third-party tools such as
Bucket Explorer
and
S3 Fox
to easily create manifest files from your S3 buckets.
- Email the manifest file to a designated address with the subject
CREATE JOB.
- Await a return email with the subject RE: CREATE JOB.
- Extract the JOBID value from the email.
- Use the JOBID, manifest file, and your AWS Secret Access Key to generate a
signature file, and place it in the root directory of your storage device.
- Ship your storage device, along with all necessary power and
data cables, to an address that we'll provide to you. You can
use one of the usual shipping companies or a courier service.
- Await further status emails.
Once we get the device we'll transport it to our data center and
initiate the load process by the end of the next business day.
A log file is created as part of the process;
it will include the date and time of the load, and the S3 key,
MD5 checksum, and size (in bytes) of each object. We'll reject
unreadable files and those larger than 5GB and note them in the log.
At the end of the process we'll send the device back to you at
our expense.
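Since oversized and unreadable files are rejected, it may be worth checking a device before shipping it. The following is a minimal sketch of such a check, not part of the service: the function names are hypothetical, and it assumes that 5GB means 5 x 10^9 bytes. It also records each file's size and MD5 checksum so they can be compared with the log file later.

```python
import hashlib
import os

# 5GB limit; treating GB as 10**9 bytes is an assumption.
MAX_OBJECT_BYTES = 5 * 10**9


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preflight(root):
    """Walk root and split its files into loadable and rejected lists.

    Returns (ok, rejected), where ok holds (relative_path, size, md5)
    tuples and rejected holds (relative_path, reason) tuples.
    """
    ok, rejected = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            try:
                size = os.path.getsize(path)
                if size > MAX_OBJECT_BYTES:
                    rejected.append((rel, "larger than 5GB"))
                    continue
                ok.append((rel, size, md5_of_file(path)))
            except OSError as error:
                # Covers permission problems and read errors alike.
                rejected.append((rel, f"unreadable: {error}"))
    return ok, rejected
```

Calling `preflight` on the mount point of the device returns the two lists. Anything in the second list is a file that would likely be noted as rejected in the log.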
In keeping with our model of charging for resources only as
they are consumed, you will pay a fixed fee per device
and a variable fee for each hour of data loading. There's no
charge for data transfer between
AWS Import/Export and an
S3 bucket in the United States. Normal S3 Request and Storage charges apply.
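The fee structure can be sketched as a simple formula. This is only an illustration: the function name is hypothetical, the rates and load speed in the example are placeholders rather than actual prices, and rounding partial hours up to a full hour is an assumption. Normal S3 request and storage charges are not included.

```python
import math


def import_cost(device_fee, hourly_fee, data_bytes, load_bytes_per_sec, devices=1):
    """Estimate the fee: a fixed fee per device plus a fee per loading hour.

    Assumes partial hours are billed as full hours (an assumption).
    """
    hours = data_bytes / load_bytes_per_sec / 3600
    billable_hours = math.ceil(hours)
    return devices * device_fee + billable_hours * hourly_fee


# Placeholder values for illustration only, not actual AWS prices or speeds.
estimate = import_cost(
    device_fee=100.0,        # hypothetical fixed fee per device
    hourly_fee=2.0,          # hypothetical fee per data-loading hour
    data_bytes=10**12,       # 1TB of data
    load_bytes_per_sec=40e6, # hypothetical sustained load speed
)
print(f"Estimated fee: {estimate:.2f}")
```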
Right now packages must be shipped from and returned to addresses
in the United States. We do expect to be able to accept packages at
a location in Europe in the near future.
Let's talk about security for a minute. You can
choose to encrypt your files before you send them to us, although
we don't support encrypted file systems. We track custody
of your device from the time it arrives in our mailroom until
it is shipped back to you. All personnel
involved in the process have undergone extensive background
checks.
Also, as you can probably guess from the name of the service, we have plans to
let you transfer large amounts of data out of AWS as well. We will
provide further information as soon as possible.
Here are some preliminary screen shots of the AWS Import/Export
support in
Bucket Explorer
and
S3 Fox:
We have also created an
Import/Export Calculator:
If you have a large amount of data stored locally and you want to get it into
Amazon S3 on a cost-effective and timely basis, you should
definitely
sign up for the beta
now. You can read more about
AWS Import/Export in the
Developer Guide.
-- Jeff;