
Elastic Load Balancing, Auto Scaling, and CloudWatch Resources

Here are some good resources for current and potential users of our Elastic Load Balancing, Auto Scaling, and Amazon CloudWatch features:

Version 1.8a of the popular Boto library for AWS now supports all three of the new features. Written in Python, Boto provides access to Amazon EC2, Amazon S3, Amazon SQS, Amazon Mechanical Turk, Amazon SimpleDB, and Amazon CloudFront. The Elastician Blog has some more info.

 

The Elastician Blog also has a good article with a complete example of how to use CloudWatch from Boto. After creating the connection object, one call initiates the monitoring operation and two other calls provide access to the collected statistics.
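The three calls described above might look roughly like the following sketch. It assumes the method names `monitor_instance`, `list_metrics`, and `get_metric_statistics` as they appear in Boto 2.x releases; the helper name `collect_cpu_stats`, the 60-second period, and the one-hour default window are illustrative choices and are not taken from the Elastician article. Check the names and argument order against the Boto version you have installed.

```python
from datetime import datetime, timedelta, timezone


def collect_cpu_stats(ec2_conn, cw_conn, instance_id, minutes=60):
    """Enable monitoring for one instance and fetch its recent CPU averages.

    ec2_conn and cw_conn are assumed to be Boto connection objects, for
    example the results of boto.connect_ec2() and boto.connect_cloudwatch().
    """
    # Call 1: ask EC2 to start publishing CloudWatch data for the instance.
    ec2_conn.monitor_instance(instance_id)

    # Call 2: list the metrics that CloudWatch currently knows about.
    metrics = cw_conn.list_metrics()

    # Call 3: fetch average CPU utilization over the requested window
    # (naive UTC timestamps).
    end = datetime.now(timezone.utc).replace(tzinfo=None)
    start = end - timedelta(minutes=minutes)
    datapoints = cw_conn.get_metric_statistics(
        60,                # period in seconds
        start,
        end,
        "CPUUtilization",  # metric name
        "AWS/EC2",         # namespace
        ["Average"],       # statistics to return
        dimensions={"InstanceId": instance_id},
    )
    return metrics, datapoints
```

Passing the connection objects in as arguments keeps the sketch independent of how credentials are configured, and makes it easy to try out with stand-in objects before pointing it at a real account.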

 

The Paglo monitoring system can now make use of the statistics collected by CloudWatch. You will need to install the open source Paglo Crawler on your EC2 instances. More info on Paglo can be found here.

 

The IT Architects at The Server Labs have put together some great blog posts. The first one, Setting up a load-balanced Oracle Weblogic cluster in Amazon EC2, contains all of the information needed to set up a two node cluster. The second one, Full Weblogic Load-Balancing in EC2 with Amazon ELB, shows how to use the Elastic Load Balancer to front a pair of Apache servers which, in turn, direct traffic to a three node Weblogic cluster to increase scalability and availability.

 

Speaking of scalability and availability, you should definitely check out the DZone reference card on the topic. The card provides a detailed yet concise introduction to the two topics in just 6 pages. Topics covered include horizontal scalability, vertical scalability, high availability, measurement, analysis, load balancing, application caching, web caching, clustering, redundancy, fault detection, and fault tolerance.

 

Author and blogger Ramesh Rajamani wrote a detailed paper on the topic of Dynamically Scaling Web Applications in Amazon EC2. Although the paper predates the release of the Elastic Load Balancer and Auto Scaling, the approach to scaling is still valid. Ramesh shows how to use Nginx and Nagios to build a scalable cluster.

 

The Serk Tools Blog has a post on Amazon Elastic Load Balancer Setup. The post includes an architectural review of the Elastic Load Balancer service, detailed directions to create an Elastic Load Balancer instance, information about how to set up a CNAME record in your DNS server, and directions on how to set up health checks.

 

Arfon Smith wrote a blog post detailing his experience moving the Galaxy Zoo from HAProxy to Elastic Load Balancing. He notes that it took him just 15 minutes to make the switch and that he's now saving $150 per month.

 

I hope you find these resources to be helpful!

-- Jeff;

AWS Start-Up Challenge For 2009

We're kicking off the third annual AWS Start-Up Challenge now.

We're looking for the hottest and coolest start-ups and start-up ideas. Developers and entrepreneurs in the United States, United Kingdom, Germany, and Israel are encouraged to enter for a chance to win $50,000 in cash, $50,000 in AWS credits, mentoring sessions from AWS technical experts, and AWS Premium Support Gold for one year.

To enter, fill out and submit the online application by August 26, 2009. The judging panel will review all of the applications and choose the seven best, based on originality and creativity, likelihood of long-term success, monetization strategy, quality of proposal, and effective use of AWS.

The finalists will be announced in October. At that time we will post a video of each finalist and invite the public to vote for their favorite. Then we'll fly all of the finalists to Silicon Valley, where they'll present their ideas to the judges' panel during the day and pitch them to a live audience of entrepreneurs and venture capitalists that night. The winner will be chosen, announced, and feted at that evening event.

All runner-up finalists will receive $5,000 in AWS service credits; all entrants with qualified submissions will receive $25 credits.

The Challenge finalist with the most creative monetization model using the Amazon Flexible Payments Service (FPS) or Simple Pay from Amazon Payments will win $10,000 in combined cash and Amazon Payments credits. All finalists using these services will receive $2,500 in Amazon Payments credits. Read more here.

Questions? Check out the contest rules, review the prizes, and scan the FAQ. You may also want to watch the videos we made for the 2007 and 2008 finalists.

-- Jeff;

Amazon Elastic MapReduce Now Available in Europe

Earlier this year I wrote about Amazon Elastic MapReduce and the ways in which it can be used to process large data sets on a cluster of processors. Since the announcement, our customers have wholeheartedly embraced the service and have been doing some very impressive work with it (more on this in a moment).

Today I am pleased to announce that Amazon Elastic MapReduce job flows can now be run in our European region. You can launch jobs in Europe by simply choosing the new region from the menu. The jobs will run on EC2 instances in Europe and usage will be billed at those rates.

Because the input and output locations for Elastic MapReduce jobs are specified in terms of URLs to S3 buckets, you can process data from US-hosted buckets in Europe, storing the results in Europe or in the US. Since this is an internet data transfer, the usual EC2 and S3 bandwidth charges will apply.

Our customers are doing some interesting things with Elastic MapReduce.

At the recent Hadoop Summit, online shopping site ExtraBux described their multi-stage processing pipeline. The pipeline is fed with data supplied by their merchant partners. This data is preprocessed on some EC2 instances and then stored on a collection of Elastic Block Store volumes. The first MapReduce step processes this data into a common format and stores it in HDFS form for further processing. Additional processing steps transform the data and product images into final form for presentation to online shoppers. You can learn more about this work in Jinesh Varia's Hadoop Summit Presentation.

Online dating site eHarmony is also making good use of Elastic MapReduce, processing tens of gigabytes of data representing hundreds of millions of users, each with several hundred attributes to be matched. According to an article on SearchCloudComputing.com, they are doing this work for $1,200 per month, a considerable savings from the $5,000 per month that they estimated it would cost them to do it internally.

We've added some articles to our Resource Center to help you to use Elastic MapReduce in your own applications. Here's what we have so far:

You should also check out AWS Evangelist Jinesh Varia in this video from the Hadoop Summit:

-- Jeff;

PS - If you have a lot of data that you would like to process on Elastic MapReduce, don't forget to check out the new AWS Import/Export service. You can send your physical media to us and we'll take care of loading it into Amazon S3 for you.

Amazon Second Life Job Fair - July 14, 2009

The very first Amazon Job Fair in Second Life will take place on Tuesday July 14th and will run from 6 AM to Midnight, PST.

This free event is a unique opportunity for candidates to have direct access to hiring managers and recruiters from around the world. Amazon is looking for all levels of technical and non-technical candidates – from hands-on engineers to program managers and game-changing principal architects. Visit our career site to see the open positions and then make plans to join us in-world.

We'll be doing first-round virtual interviews (the equivalent of a phone screen) for real-world jobs.

The Job Fair will take place on the Amazon Developers 2 island. You will need to create an avatar and then download the client in order to attend. You may also want to spend some time with the Second Life Quickstart Guide a day or so before the fair.

We'll be giving away some cool virtual goods including Amazon.com T-Shirts, Door Desks, and a high-performance aircraft. The island is open now so feel free to stop by and take a look around. We have a dog-friendly environment so don't be surprised if you see a pixelated puppy or two wandering around.

As you may have noted above, the Second Life client and the Quickstart guide are both served up from Amazon S3. The Snowglobe version of the Second Life client downloads its map tiles from S3 and the Second Life Map uses a combination of Amazon CloudFront and S3.

Decisions at Amazon are always driven by data, so I built a metrics and statistics package in anticipation of the event. A number of sensors use code written in LSL to relay events (via HTTP) to an Amazon EC2 server. There, some PHP code picks up the raw data and stores it in a set of Amazon SimpleDB domains for analysis. I plan to write an article about this in the near future, so stay tuned.

I hope to see you there. My avatar's name is Jeffronius Batra.

-- Jeff;

PS - Yes, that is an Altair 8800 on my door desk! I actually bought, assembled, and programmed one of these in 1976 (yes, I am that old).

The Cloud as a Platform for Platforms

Of the many things I love about AWS, I will mention three of my favorites in this blog post:

  • AWS does not force developers to use any particular programming model, language, or operating system.
  • AWS does not force developers to use the entire suite of services - they can use any of our infrastructure services individually or in any combination.
  • AWS does not limit developers to a pre-set amount of storage, bandwidth, or computing resources they can consume - they can use as much or as little as they wish, and only pay for what they use.

Our customers love this flexibility. Today, a developer can run more experiments and achieve results much faster than before. If something does not work in a particular environment, the developer can drop that idea, click a few buttons, dispose of all of their infrastructure, and move on to the next experiment, starting with a fresh, new environment. Developers can try out several new ideas simultaneously by running multiple projects concurrently. Once the ideas are implemented, they can be further battle-tested using more resources in the AWS cloud until they become finished products. Developers love this because they are able to convert their concept or idea into a successful finished product quickly. As a result, we are seeing tremendous innovation happening at breakneck speed. The Cloud is becoming a platform for Innovation.

The inherent flexibility of the AWS cloud enables customers to use it as a Platform in a variety of ways, including:

  • The AWS cloud as a Platform for Collaboration
  • The AWS cloud as a Platform for Computation
  • The AWS cloud as a Platform for Software and Data Delivery
  • The AWS cloud as a Platform for Hot and Cold Storage
  • The AWS cloud as a Platform for Research Development and Experimentation

Every day, I find customers in each of the categories mentioned above. Some of them share their stories and architectures with us.

It does not stop there!

It's inspiring when I see the AWS cloud being used as a Platform for Platforms.

AWS is not only a rich platform to build solutions but also a platform for building specialized platforms. Customers can choose to either use the AWS cloud directly or take advantage of these value-added platforms. Customers can also mix and match platforms from this rich ecosystem.  

In this post, we look at some of the best examples of specialized platforms built on AWS:

Hero Ruby Platforms
Heroku, as most of you may already know, is one of the early platforms built on top of Amazon Web Services. This "Instant Ruby Platform" enables any Ruby developer to take their existing Ruby code and move it to the cloud. Customers of Heroku do not have to worry about scaling or managing their server farm in case there is a success disaster. Heroku deploys a Ruby app in a single step without changing the app or the process. They recently launched commercially and offer similar pay-as-you-go pricing for enterprises and hobbyists. They also offer a free tier to try and test your prototypes. With a deployment Platform-as-a-Service offered on top of Amazon's reliable platform, everybody wins. The end user gets all the things they need to build a modern web-scale application quickly, while Heroku manages the "magic" (via “Slugs” and “Dynos”) of Ruby deployment without worrying about the complexity of maintaining and managing the underlying infrastructure.

Engine Yard offers a rich and open Ruby deployment platform on Amazon Web Services. Developers can take advantage of Engine Yard's pre-configured, standardized stacks, which make it easy to deploy a Ruby on Rails application. Using Engine Yard's wizard-style web interface, developers can create the entire environment, including choosing the Unix packages and Ruby Gems to install and setting the frequency of database backups. They have a nice video on How to deploy your Ruby app on EC2 in 10 minutes.


Coderun Language-agnostic Development Platform
I stumbled upon CodeRun – Online Development Platform on Amazon EC2 – a few months ago and have been tracking it closely. The interface looks just like Visual Studio, but in the browser. End users can code in PHP, AJAX, and ASP.NET in an in-browser IDE which is fully hosted on multiple Amazon EC2 instances, and then deploy the code (by clicking on “Debug” or “Run”) using several backend services that in turn run on various other Amazon EC2 instances (all managed). Free accounts may share instances while premium accounts (which will be available in August) will run on stand-alone instances. Developers may deploy their code to Amazon EC2 more than once for testing, debugging, and production purposes. Code snippets are stored on Amazon S3 and can be shared among developers (check out AWS Code Samples). Amazon EBS is used to store users' files and data, while Amazon CloudFront is used to distribute static files. Logs are stored on Amazon SimpleDB. They almost have a full house!

The platform includes a custom elasticity mechanism that monitors resource usage and performs automatic scaling based on a dynamic set of predefined business rules. Developers can code in existing technologies (.NET/PHP/JS) and “outsource” scalability to CodeRun. With a single click, you can deploy your app. CodeRun leverages the AWS APIs to completely automate the deployment process. This includes allocating resources (instances, addresses, and storage), copying files, synchronizing database structure, and configuring the web servers. The entire platform is not fully baked yet but, I think, it has tremendous potential.

Voice Platform
Twilio is a voice platform built on top of AWS. As Twilio’s website suggests, developers can build innovative voice apps such as sales automation systems, order inquiry lines, CRM solutions, call routing apps, appointment reminders, and custom voicemail apps. Platform developers at Twilio are focusing on building a powerful telephony platform on top of Amazon Web Services. Twilio is drop-dead simple and easy to get started with (“friction-free development”). With this simplicity, I think, it won’t be long until a developer is able to write a phone tree app that calls all the friends in your social network about an upcoming party and collects their RSVPs over the phone, which can then be viewed on a website. Take a look at Twilio’s presentation (from the AWS Start-Up Tour in Seattle), and you will be convinced that they are AWS experts and they know what they are doing.

At VoiceCon, Siemens Enterprise Communications pre-announced that their Voice and Unified Communications product suite will be available as-a-Service on the AWS cloud. It will be interesting to see how this platform evolves.

Bottom line
Heroku, Engine Yard, Twilio, and CodeRun are all different in nature and behavior. All of them are built using different technologies and methodologies. All are targeting different market segments. But they share one thing in common: they are all built on AWS. All of them are built to scale and take advantage of flexibility. Innovation thrives in an environment that permits flexibility. AWS gives them the flexibility they need along with the scalability and elasticity their customers require.

This, to me, is very inspiring. What do you think?

- Jinesh Varia (evangelists at amazon dot com)

Scaling to the Stars

Recently I blogged about The Server Labs, a consultancy that specializes in high-performance computing – including on Amazon Web Services.

Here’s another story that I found fascinating: nominally it is about how The Server Labs uses Amazon Web Services as a scale-out solution that also implements Oracle databases; however it’s really about space exploration (or should I say “nebula computing”). It began with an email asking whether there would be a problem running up to 1,000 Amazon EC2 High-CPU Extra-Large instances.

The Server Labs is a software development/consulting group based in Spain and the UK that works closely with the European Space Agency, and they needed to prove the scalability of an application that they helped build for ESA's Gaia project. In addition to the instances, they also requested 2 large and 3 X-Large instances to host Oracle databases that coordinate the work being performed by the high-CPU instances.

Gaia’s goal is to make the largest, most precise three-dimensional map of our Galaxy by surveying an unprecedented number of stars - more than one billion. This, by the way, is less than 1% of all the stars in our Galaxy! The plan is to launch a mission in 2011, collect data until 2017, and then publish a completed catalog no later than 2019.

I had the opportunity to see a PowerPoint deck created and presented by The Server Labs’ founder, Paul Parsons, and their software architect, Alfonso Olias, who is currently assigned to this project.

The deck explained that the expected number of samples in Gaia is 1 billion stars x 80 observations x 10 readouts, which is approximately equal to 1 x 10^12 samples, or as much as 42 GB per day transferred back to Earth. There’s a slide in the deck that says “Put another way, if it took 1 millisecond to process one image, the processing time for just one pass through the data (on a single processor) would take 30 years.”
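The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted from the deck; the function names and the 365.25-day year are my own choices for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def total_samples(stars, observations, readouts):
    """Number of samples: stars x observations per star x readouts per observation."""
    return stars * observations * readouts


def years_to_process(samples, ms_per_sample=1.0):
    """Years needed for one pass on a single processor at ms_per_sample each."""
    return samples * ms_per_sample / 1000.0 / SECONDS_PER_YEAR


exact = total_samples(10**9, 80, 10)
print(exact)                     # 800000000000, i.e. 8 x 10^11
print(years_to_process(exact))   # roughly 25 years
print(years_to_process(10**12))  # roughly 32 years for the rounded 10^12 figure
```

Either way, the result is in line with the slide's "30 years": the exact product gives a little less, and the rounded 10^12 figure a little more.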

As the spacecraft travels, it will continuously scan the sky in 0.7 degree arcs, sending the data back to Earth. Some involved algorithms will come into play in order to process the data; and the result is a fairly complex computing architecture that is linked to an Oracle database. Scheduling the cluster of computational servers is not quite so complicated, and is based on a scheduler that is focused on keeping each machine as busy as possible.

However, the amount of data to process is not steady; it will increase over time, which means that infrastructure needs will also vary over time. And of course idle computing capacity is deadly to a budget.

Large computational problems are usually solved with grid computing. This time is no different, except that, as mentioned above, the required size of the grid is not constant. Because Amazon Web Services is on-demand, it’s possible to apply just enough computational resources to the problem at any given time.

In their test, The Server Labs set up an Oracle database using an AWS Large Instance running a pre-defined public AMI. Then they created 5 EBS volumes of 100 GB each and attached them to the instance.

Then they created Amazon Machine Images (AMIs) to run the actual analysis software. These images were based on large instances and included Java, Tomcat, the AGIS software and an rc.local script to self-configure an instance when it’s launched.

The requirements break down as follows:

To process 5 years of data for 2 million stars, they will need to run 24 iterations of 100 minutes each, which works out to 40 hours running a grid of 20 Amazon EC2 instances. A secondary update has to be run once and requires 30 minutes per run, or 5 hours running a grid of 20 EC2 instances.
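The 40-hour figure for the primary processing follows directly from the iteration count. Here is a quick check; the helper names are illustrative, and the instance-hour total is simply the cluster time multiplied by the grid size rather than a number from the deck.

```python
def cluster_hours(iterations, minutes_per_iteration):
    """Wall-clock hours the grid runs for the given number of iterations."""
    return iterations * minutes_per_iteration / 60.0


def instance_hours(hours, instances):
    """Total instance-hours consumed when every node runs for the whole time."""
    return hours * instances


primary = cluster_hours(24, 100)
print(primary, instance_hours(primary, 20))  # 40.0 800.0
```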

For the full 1 billion star project, the numbers extrapolate more or less as follows: they calculated that they will analyze 100 million primary stars, plus 6 years of data, which will require a total of 16,200 hours of a 20-node EC2 cluster. That’s an estimated total computing cost of 344,000 EUR. By comparison, an in-house solution would cost roughly 720,000 EUR (at today’s prices), which doesn’t include electricity, storage, or sys-admin costs. (Storage alone would be an additional 100,000 EUR.)
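To put the two estimates side by side, here is a small calculation using only the figures quoted above. The implied per-instance-hour rate is just the quoted total divided by the instance-hours; it is not an AWS price, and the variable names are illustrative.

```python
total_cluster_hours = 16_200
nodes = 20
cloud_cost_eur = 344_000
in_house_cost_eur = 720_000  # excludes electricity, storage, and sys-admin costs

# Total instance-hours if all 20 nodes run for the full 16,200 hours.
total_instance_hours = total_cluster_hours * nodes

# Implied average cost per instance-hour in the cloud estimate.
implied_rate_eur = cloud_cost_eur / total_instance_hours

# Difference between the two estimates, in EUR and as a percentage.
savings_eur = in_house_cost_eur - cloud_cost_eur
savings_pct = 100.0 * savings_eur / in_house_cost_eur

print(total_instance_hours)        # 324000
print(round(implied_rate_eur, 2))  # 1.06
print(savings_eur)                 # 376000
print(round(savings_pct, 1))       # 52.2
```

On these numbers the cloud estimate is a bit less than half of the in-house figure, before the in-house extras such as storage are added.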

It’s really exciting to see the Cloud used in this manner, especially when you realize that an entire set of problems was beyond economic possibility before the Cloud became a reality.

Mike

Webinar: How to Create Secure Test and Dev Environments on the Cloud

Amazon Web Services, CohesiveFT, and RightScale will participate in a webinar titled "How to Create Secure Test and Dev Environments on the Cloud."

Along with Michael Crandell and Edward Goldberg of RightScale, Simone Brunozzi of Amazon Web Services and Patrick Kerpan of CohesiveFT will show you how you can save time and money by running your entire application testing infrastructure in the cloud. They will discuss an agile approach to rapid prototyping and the creation of a test and development environment which replicates the final deployment environment, and will also show how to build a secure VPN environment.

The webinar is free but registration is required.

-- Jeff;

AWS Management Console Support for CloudFront

The AWS Management Console has a brand new tab:

This new tab contains full support for Amazon CloudFront. You can start distributing your content in minutes. You don't need to make a long term commitment and you don't need to download a client application. It is now even easier to access CloudFront in pay-as-you-go fashion.

After selecting the tab you will see a list of all of your CloudFront distributions:

You can select any distribution to see detailed information about it. You can also create a new distribution by clicking the Create Distribution button and filling in the resulting dialog.

You can also edit the properties of any of your existing CloudFront distributions. You can enable or disable logging, add and remove CNAMEs, and even enable or disable the distribution itself.

We've also got a brand new getting started video.

This is just one of many exciting new features that we have in store for users of the console. Stay tuned to this blog for further developments.

-- Jeff;

Gowalla - Location-based iPhone 3G Application

Gowalla is a location-based iPhone game. Currently focused on Austin, Boston, New York, and San Francisco, the application runs on any iPhone 3G with the location-based services turned on. You use Gowalla to collect "virtual souvenirs" by visiting specified locations in each city, neatly combining the virtual and the real worlds.

A product of Alamofire, Gowalla runs on EC2 and makes extensive use of S3, SQS, and SimpleDB. To learn more, you can watch Alamofire founder Scott Raymond's presentation at the Scotland on Rails conference. In the video, Scott talks about the pros and cons of using AWS, how they scaled their MySQL database tier, Facebook interaction, and much more.

-- Jeff;

S3Stat - Log Analysis for Amazon CloudFront and Amazon S3

S3Stat now has the ability to analyze the logs generated by Amazon CloudFront using the venerable Webalizer package. With your permission, S3Stat will enable CloudFront logging for the distributions of your choice. It will then run Webalizer each day and deposit the report in S3 for easy browsing. Of course, as the name implies, you can also use it to analyze the logs produced by S3.

The service is free for the first month and then $5 per month thereafter.

-- Jeff;


July 2009
