From time to time, potential users of AWS ask me about the best way to set up a highly scalable architecture using Amazon EC2, S3, SimpleDB, and SQS. I'd like to challenge readers of this blog to document their AWS-powered architectures in a blog post, preferably with a diagram, and to leave comments with a link back to their posts. I'll collect them all up in a future post.
Here are a few that I already have:
Doug Kaye described the architecture behind GigaVox Audio Lite in his post, Amazon for Infrastructure-on-Demand. Doug used EC2, S3, and SQS to build the highly scalable podcast processing system behind The Conversations Network.
Doug's implementation regulates the number of EC2 instances in use by tracking the amount of time it takes to process each work item in the queues which drive the Transcoding and Assembly processes.
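To make the idea concrete, here is a minimal sketch of a scaling rule driven by per-item processing time. It is not Doug's code. The function name, the 60-second target, the thresholds, and the instance limits are all assumptions chosen for illustration, and a real controller would then call the EC2 API to start or stop instances.

```python
from statistics import mean


def desired_instance_count(current, recent_seconds, target_seconds=60.0,
                           min_instances=1, max_instances=20):
    """Suggest a new instance count from recent per-item processing times.

    All thresholds here are hypothetical; tune them for a real workload.
    """
    if not recent_seconds:
        # No recent measurements, so leave the pool as it is.
        return current

    average = mean(recent_seconds)
    if average > target_seconds * 1.25:
        # Items are taking too long: add one instance.
        current += 1
    elif average < target_seconds * 0.5:
        # Items finish well under target: release one instance.
        current -= 1

    # Never go outside the configured pool limits.
    return max(min_instances, min(max_instances, current))
```

Changing the pool by one instance per evaluation keeps the sketch simple and avoids large swings. A production controller would likely also wait between adjustments so that newly launched instances have time to take effect.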
Don MacAskill described SmugMug's master controller (SkyNet) in SkyNet Lives! (aka EC2 @ SmugMug). Don's post doesn't include a block diagram, but it does include a cool usage graph (included at right).
Don's master controller watches 30 to 50 factors in order to make high-quality scaling decisions. It was called RubberBand until it became sentient and attempted to take over the world (that is, to launch several hundred Extra-Large EC2 instances simultaneously). It was then renamed SkyNet.
Per the blog post, they use SQS to maintain a queue of uploaded photos. The photos are processed on EC2 and then uploaded to S3. The graph in the blog post indicates that they are adding approximately 4 TB of new photos every month.
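The queue, process, and store flow described above can be sketched as a simple worker loop. This is only an illustration of the pattern, not SmugMug's implementation. Python's in-memory `queue.Queue` stands in for SQS, a plain dictionary stands in for an S3 bucket, and `process_photo` is a hypothetical placeholder for the real image work.

```python
import queue


def process_photo(data: bytes) -> bytes:
    """Hypothetical placeholder for real work such as resizing a photo."""
    return data[::-1]


def drain(upload_queue, store):
    """Process every queued (key, data) item and save the result in store.

    upload_queue stands in for an SQS queue; store stands in for an S3 bucket.
    Returns the number of items processed.
    """
    processed = 0
    while True:
        try:
            key, data = upload_queue.get_nowait()
        except queue.Empty:
            # Queue is empty, so this worker has nothing left to do.
            return processed
        store[key] = process_photo(data)
        processed += 1
```

Because each worker only pulls from the queue and writes to the store, more workers can be added without any coordination between them, which is what makes the pattern easy to scale.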
The AWS Developer Connection has some worthwhile how-to articles as well. In Monster Muck Mashup - Mass Video Conversion Using AWS, Mitch Garnaat shows how to use SQS, EC2, and S3 to do video conversion in a scalable way.
The article Auto-scaling Amazon EC2 with Amazon SQS also has a whole lot of really good information.
Once again, I invite you to write an architecture post of your own and to leave a link to it in the comments. I would also like to see posts which make reference to load management tools such as Scalr, RightScale, and Elastra.
Updates (before I write the next big post):
- David Kavanagh wrote in to tell me that there's a good picture in his recent article, Automated Server Pool Management in Java.
- Scaling on EC2 tells the story of WebMynd, which looks like a pretty cool Firefox extension.
- Mike Brittain has a lot of great details in his blog post about How We Built a Web Hosting Infrastructure on EC2.