A team of researchers from the University of South Florida and the University of British Columbia have written a very interesting paper, Amazon S3 for Science Grids: A Viable Solution?
In this paper the authors review the features of Amazon S3 in depth, focusing on the core concepts, the security model, and data access protocols. After characterizing science storage grids in terms of data usage characteristics and storage requirements, they proceed to benchmark S3 with respect to data durability, data availability, access performance, and file download via BitTorrent. With this information as a baseline, they evaluate S3's cost, performance, and security functionality.
They conclude by observing that many science grid applications don't actually need all three of S3's most desirable characteristics -- high durability, high availability, and fast access. They also have some interesting recommendations for additional security functionality and some relaxing of limitations.
I do have one small update to the information presented in the article! Since it article was written, we have announced that S3 is now storing 5 billion objects, not the 800 million mentioned in section II.
-- Jeff;


I read this paper with great interest!
I wanted to take this opportunity to mention we are recently working with EC2 as a backend to the Workspace Service for scientific and grid computing. I'm including a link below for more information about the EC2 integration.
- Tim Freeman
Posted by: Grid + EC2 | June 12, 2007 at 03:21 PM