E-Commerce Service
Amazon E-Commerce Service (ECS) exposes Amazon's product data and e-commerce functionality.

Elastic Compute Cloud
Amazon Elastic Compute Cloud is a web service that provides resizable compute capacity in the cloud.

Historical Pricing
The Amazon Historical Pricing web service gives developers programmatic access to over three years of actual sales data for books, music, videos, and DVDs.

Mechanical Turk
One of the best ways to understand Amazon Mechanical Turk is to complete a HIT and see what the experience is like.

Simple Storage Service
Amazon S3 is storage for the Internet. It is designed to make web-scale computing easier for developers.

Simple Queue Service
Amazon Simple Queue Service offers a reliable, highly scalable hosted queue for storing messages as they travel between computers.

Alexa Thumbnails
All thumbnail images are accessible via web services, using SOAP or REST.

Alexa Top Sites
The Alexa Top Sites web service provides ranked lists of the top sites on the Internet.

Alexa Web Information Service
The Alexa Web Information Service makes Alexa's vast repository of information about the traffic and structure of the web available to developers.

Alexa Web Search
The Alexa Web Search web service offers programmatic access to Alexa's web search engine.

Amazon SimpleDB Case Studies - ShareThis and Alexa

Even though Amazon SimpleDB is still a beta product, progressive developers are already learning about it and building highly scalable applications. In fact, we just released a pair of case studies.

ShareThis has been deployed to over 30,000 web sites. Faced with rapid growth, the team considered three storage options and chose SimpleDB for its responsiveness, reliability, zero software cost, minimal staff costs, and low barrier to development. They used EC2, SimpleDB, S3, and SQS to build a complete, loosely coupled, fault-tolerant system in the cloud, with an estimated savings of $200,000. Read all about it!

The Alexa Site Thumbnail Service uses SimpleDB to store intermediate status and log data, which allows the team to store and deliver millions of thumbnails. They store over 12 million objects in SimpleDB and perform over 5 million queries every day. Read all about it!

-- Jeff;

Redundant Disk Storage Across Multiple EC2 Instances

XML Hacker M. David Peterson has put together a really interesting article.

As part of his work at 3rd and Urban, he has implemented redundant, fault-tolerant, read-write disk storage on Amazon EC2 using a number of open source tools and applications including LVM, DRBD, NFS, Heartbeat, and VTUN.

Mark notes that "the primary focus of this paper is to present both a detailed overview as well as a working code base that will enable you to begin designing, building, testing, and deploying your EC2-based applications using a generalized persistent storage foundation, doing so today in both lieu of and in preparation for release of Amazon Web Services offering in this same space."

The article provides complete implementation details and links to source code for the scripts that Mark developed.

You can read the article, and you can also follow progress via the discussion group.

-- Jeff;

Use Amazon SQS to Build Self-Healing Applications

Quite a few people ask us about best practices that they should consider when architecting solutions in the cloud. This post covers just one best practice: how to use Amazon Simple Queue Service to build self-healing applications. The basic idea is that you can create resilient and self-healing applications by implementing a Services Oriented Architecture that follows these three principles:

  1. Each component operates on its own, without relying on the component before or after it
    • Read from and write to a message queue at the boundary of each workflow stage in your application
  2. If the component fails, restart automatically
  3. Design for n + 1
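
The three principles above can be sketched in a few lines of Python. This is only an illustration, not the code from the video or the sample AMI: the in-memory `queue.Queue` objects stand in for Amazon SQS queues, and `stage_worker`, `supervise`, and `run_pool` are hypothetical names made up for this sketch. Putting a failed message back on the inbox imitates an SQS message becoming visible again when a consumer dies before deleting it.

```python
import queue
import threading


def stage_worker(inbox, outbox, handler):
    """Run one workflow stage: read from inbox, write results to outbox."""
    while True:
        try:
            message = inbox.get_nowait()
        except queue.Empty:
            return  # nothing left to do
        try:
            result = handler(message)
        except Exception:
            # Hand the message back so that it is not lost, then crash.
            inbox.put(message)
            raise
        outbox.put(result)


def supervise(inbox, outbox, handler, max_restarts=5):
    """Restart the stage whenever it fails; return the number of restarts."""
    restarts = 0
    while True:
        try:
            stage_worker(inbox, outbox, handler)
            return restarts
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise


def run_pool(inbox, outbox, handler, workers):
    """Design for n + 1: run several identical, independent workers."""
    threads = [
        threading.Thread(target=supervise, args=(inbox, outbox, handler))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


if __name__ == "__main__":
    inbox, outbox = queue.Queue(), queue.Queue()
    for word in ["alpha", "beta", "gamma"]:
        inbox.put(word)
    run_pool(inbox, outbox, str.upper, workers=3)
    print(sorted(outbox.queue))
```

Because each stage touches only its two queues, a stage can be stopped, restarted, or run in extra copies without the stages on either side noticing.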

Rather than repeat the details here, I just posted a short five-minute video on this subject on the Amazon Web Services Resource Center. Click here to view it in either Windows Media or Flash format.

Also, the following links are useful references for learning about how to use Amazon SQS and Amazon EC2 together:

 

Get started with Amazon SQS and Amazon EC2
Sample application to get started with Amazon SQS and Amazon EC2
SQS-EC2 Job Processor Sample AMI

-- Mike

High Performance Multithreaded Access to Amazon SimpleDB

We have just released a new code sample.

Written in Java, this new sample shows how Amazon SimpleDB can be used as a repository for metadata which describes objects stored in Amazon S3. The code was written to illustrate best practices for indexing S3 data and for getting the best indexing and query performance from SimpleDB.

Indexing is implemented at two levels. At the first level, multiple threads (implemented using the Java Executor) are used to ensure that a number of S3 reads and a number of SimpleDB writes are taking place simultaneously. At the second level, Amazon SQS is used to coordinate index tasks running on multiple systems, leading to an even higher degree of concurrency.

Bulk queries are implemented using a pair of thread pools. The first pool runs SimpleDB queries and the second retrieves SimpleDB attributes. With the proper balance between the two pools, a Small Amazon EC2 instance was able to make over 300 requests per second.
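
The released sample is written in Java and is not reproduced here. As a rough illustration of the two-pool idea only, the Python sketch below feeds the results of one thread pool into a second. The names `bulk_query`, `run_query`, and `get_attributes` are hypothetical: the caller supplies `run_query` (returns the item names matching a query) and `get_attributes` (returns the attributes of one item), and the pool sizes are arbitrary placeholders rather than tuned values.

```python
from concurrent.futures import ThreadPoolExecutor


def bulk_query(queries, run_query, get_attributes,
               query_threads=4, attr_threads=16):
    """Return {item_name: attributes} for every item matched by any query."""
    attr_futures = {}
    with ThreadPoolExecutor(query_threads) as query_pool, \
            ThreadPoolExecutor(attr_threads) as attr_pool:
        # First pool: run the queries concurrently.
        for names in query_pool.map(run_query, queries):
            for name in names:
                # Second pool: fetch attributes, once per distinct item.
                if name not in attr_futures:
                    attr_futures[name] = attr_pool.submit(get_attributes, name)
        # Wait for every attribute fetch to finish.
        return {name: f.result() for name, f in attr_futures.items()}
```

The balance mentioned above corresponds to the ratio of `query_threads` to `attr_threads`: too few attribute threads and the query results pile up, too many and they sit idle.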

Check it out!

-- Jeff;

Amazon S3 Copy API Ready for Testing

A few weeks ago we asked our developer community for feedback on a proposed Copy feature for Amazon S3. The feedback was both voluminous and helpful to us as we finalized our plans and designed our implementation.

This feature is now available for beta use; you can find full documentation here (be sure to follow the links to the detailed information on the use of this feature via SOAP and REST). Copy requests are billed at the same rate as PUT requests: $0.01 per 1,000 requests in the US and $0.012 per 1,000 requests in Europe.

In addition to the obvious use for this feature -- creating a new S3 object from an existing one -- you can also use it to rename an object within a bucket or to move an object to a new bucket. You can also update the metadata for an object by copying it to itself while supplying new metadata.
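
The uses above can be sketched as small helpers. This is an illustration under stated assumptions, not the SOAP or REST interface from the beta documentation: `client` is assumed to be any object with `copy_object` and `delete_object` methods that take the keyword arguments used by a present-day SDK client such as boto3's S3 client, which postdates this post. The helper names are hypothetical, and treating a rename or move as a copy followed by a delete of the source is an assumption of the sketch.

```python
def _copy_then_delete(client, src_bucket, src_key, dst_bucket, dst_key):
    """Copy an object to a new location, then delete the original."""
    if (src_bucket, src_key) == (dst_bucket, dst_key):
        raise ValueError("source and destination are the same object")
    client.copy_object(
        Bucket=dst_bucket,
        Key=dst_key,
        CopySource={"Bucket": src_bucket, "Key": src_key},
    )
    client.delete_object(Bucket=src_bucket, Key=src_key)


def rename_object(client, bucket, old_key, new_key):
    """Rename an object within a bucket."""
    _copy_then_delete(client, bucket, old_key, bucket, new_key)


def move_object(client, src_bucket, src_key, dst_bucket, dst_key):
    """Move an object to a different bucket."""
    _copy_then_delete(client, src_bucket, src_key, dst_bucket, dst_key)


def replace_metadata(client, bucket, key, metadata):
    """Update metadata by copying the object onto itself."""
    client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        Metadata=metadata,
        MetadataDirective="REPLACE",
    )
```

With boto3 installed and credentials configured, `client` could be `boto3.client("s3")`. A rename or move in this sketch costs one copy request plus one delete request.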

Still on the drawing board is support for copying between US and Europe, and a possible conditional copy feature. Both of these items surfaced as a result of developer feedback.

Tool and library support for this new feature is already starting to appear; read more about that in this discussion board thread.

-- Jeff;

On Condor and Grids

There is lots of buzz about Hadoop and Amazon EC2, and of course there should be, given all the great projects such as the one at the New York Times, where they converted old articles into PDF files in short order at a very reasonable cost.

There’s a second environment you should know about, although the buzz level is a bit lower. (That might change.) Condor is a scheduling application that is commonly used in HPC and grid applications. It can also be used to manage Hadoop grids, and it manages “jobs” in much the same manner as a mainframe: you submit a job to Condor, along with metadata that describes the job’s characteristics, and Condor then finds suitable resources to allocate to the job. Note that Condor and Hadoop approach these problems independently, so they overlap in some areas and do unrelated things in others.

This week I attended Condor Week at the University of Wisconsin in Madison. Condor Week is an annual event that gives Condor collaborators and users the chance to exchange ideas and experiences, to learn about the latest research, to experience live demos, and to influence the project's short- and long-term research and development directions.

If you are interested in large-scale grid computing, this approach is worth a serious look. There are two active projects that implement Condor on Amazon EC2, and of course that’s why this blog entry is being posted.

Cycle Computing offers Amazon EC2 plus Condor as an integrated platform, in addition to supporting other underlying computing resources. Their software automates Condor grid management, including monitoring, configuration, version control, usage tracking, and more. At the conference Jason Stowe from Cycle Computing made a very strong case for using Amazon EC2 instead of a traditional grid environment. Jason’s presentation is available for download at http://www.cs.wisc.edu/condor/CondorWeek2008/condor_presentations/stowe_cycle.pdf.

Red Hat’s approach integrates EC2 directly into the Condor code base. The result is that an Amazon EC2 instance is the “Condor Job”, and in that manner they are able to manage the entire life cycle of an EC2 instance. In some cases the entire Condor pool is running on EC2, and in other cases EC2 augments an existing pool. All of this work was done in a collaboration between the University of Wisconsin (Jaeyoung Yoon, Fang Cao, and Jaime Frey) and Matt Farrellee from Red Hat. They plan to integrate Amazon S3 as a storage medium in the near future.

One thing seems certain: on-demand virtualization brightens the lights in Grid Computing City, because organizations that could not afford a grid suddenly find themselves with both affordable infrastructure and powerful tools to manage it.

-- Mike

More Bits for Your Money - AWS Bandwidth Pricing Reduced

We've been working to drive down our costs and to pass the savings along to our customers. We've focused on bandwidth costs and are happy to announce that the cost of outbound bandwidth (for data transferred from within AWS to the outside world) has been reduced effective May 1, 2008. The old and new costs are as follows:

Monthly Transfer    Old Price / GB    New Price / GB
First 10 TB         $0.180            $0.170
Next 40 TB          $0.160            $0.130
Next 100 TB         $0.130            $0.110
Over 150 TB         $0.130            $0.100

Note that there's an entirely new pricing tier, for customers with outbound monthly transfer in excess of 150 Terabytes.

As noted in the forum post, a customer with 50 TB of monthly transfer will save 16% and a customer with 500 TB of monthly transfer will save 26%. Earlier this year we let the world know that the total bandwidth consumed by Amazon EC2 and S3 is greater than that consumed by all of our global web sites put together.
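
The tiered arithmetic behind the 50 TB figure can be sketched as follows. This is a worked example under stated assumptions: the tiers are read cumulatively from the table above (the first 10 TB at one rate, the next 40 TB at the next rate, and so on), 1 TB is taken as 1,024 GB, and the function names are made up for the sketch. The choice of GB per TB does not change the percentage, only the dollar totals.

```python
GB_PER_TB = 1024  # assumption; cancels out of the percentage

# (tier size in TB, price per GB); None marks the open-ended last tier.
OLD_TIERS = [(10, 0.180), (40, 0.160), (100, 0.130), (None, 0.130)]
NEW_TIERS = [(10, 0.170), (40, 0.130), (100, 0.110), (None, 0.100)]


def monthly_cost(transfer_tb, tiers):
    """Cost in dollars of one month of outbound transfer."""
    remaining = transfer_tb
    cost = 0.0
    for size_tb, price_per_gb in tiers:
        used = remaining if size_tb is None else min(remaining, size_tb)
        cost += used * GB_PER_TB * price_per_gb
        remaining -= used
        if remaining <= 0:
            break
    return cost


def savings_percent(transfer_tb):
    """Percentage saved under the new prices for a given monthly volume."""
    old = monthly_cost(transfer_tb, OLD_TIERS)
    new = monthly_cost(transfer_tb, NEW_TIERS)
    return 100 * (old - new) / old


if __name__ == "__main__":
    print(round(savings_percent(50)))  # 50 TB per month
```

For 50 TB the old price works out to 10 × $0.18 + 40 × $0.16 and the new price to 10 × $0.17 + 40 × $0.13 (each times the GB-per-TB factor), a reduction of just under 16%.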

We've also updated the AWS Simple Calculator Utility to reflect the new prices.

-- Jeff;

Two Good Podcasts

I hardly ever listen to broadcast radio in my car anymore. Instead, I subscribe to a whole bunch of podcasts, some technical, some fun, and others educational. Here are two episodes which should be of interest to anyone who reads this blog:

The Mashable Podcast interviews Michael Crandell, CEO of RightScale. Michael talks about their product and how it helps organizations to use Amazon EC2 in a cost-effective fashion.

The IT Conversations Podcast captures Amazon CTO Werner Vogels as he talks about AWS at last year's ETech conference.

You can listen to either or both of these on the respective sites or you can simply subscribe to their RSS feeds.

-- Jeff;

PS - Congratulations are due to RightScale for the successful completion of their fundraising endeavor.

Friday Lunch Meetup in New York

I'll be in New York this coming Friday, the second leg of a trip to Washington, DC and New York.

Via Twitter, Tristan Louis suggested a lunch meetup and I was happy to oblige. We'll be meeting at the Union Square Coffee Shop at 12:30 on Friday the 2nd of May and you are welcome to come along.

I will have a couple of hours open in the afternoon and would be happy to have a private meeting or two as well. Just leave a note in the Wiki and send a confirming email to evangelists at amazon.com.

-- Jeff;

New Release of ElasticFox

Many people have told me that they have used the ElasticFox extension for Firefox to get started with Amazon EC2. ElasticFox makes it easy to see the list of available AMIs (Amazon Machine Images), to launch any number of instances of those AMIs, and to monitor and manage the running instances.

We just released version 1.4 of this powerful tool. In addition to wiping out some bugs related to security groups and key management, ElasticFox now supports all of the features of the newest version of the EC2 API: Availability Zones, Elastic IPs, and user-selectable kernels. There are new tabs for kernels and ramdisks, Elastic IPs, and Availability Zones.

An IP address can be allocated and then attached to a running EC2 instance with a couple of clicks.

New instances can be launched in any availability zone, with full control of the kernel (AKI) and ramdisk (ARI).

Finally, you can now filter the AMI list using the box at the top right.

I added this feature myself because I had been spending too much time scrolling through the ever-expanding list of available AMIs during my conference and user group demos.

And that brings me to my last point: ElasticFox is an open source project hosted on SourceForge. It was easy to download the code to my desktop machine (I used TortoiseSVN), install FireBug, figure out how the code worked, and make and test my changes.

We've got ideas for even more features, but there's no reason to wait for us. If you have some ideas of your own, grab the code, do your thing, and send us your code for review and check-in.

-- Jeff;

PS - We are planning to release a version of this extension which is compatible with version 3 of Firefox. This version is well under way, but we didn't want to hold up release of these great new features in anticipation of the production release of Firefox 3.

Update: If you are brave and somewhat fault-tolerant, you can download and try out the Firefox 3 version here. This version is reportedly faster, and also more responsive -- the UI doesn't freeze when the extension makes background calls to EC2. Please file bugs as you find them (you will need a SourceForge account in order to do so).
