As you probably know, Amazon CloudWatch provides monitoring services for your cloud resources and your applications. You can track cloud, system, and application metrics, see them visually, and arrange to be notified (via a CloudWatch alarm) if they go beyond a value that you specify. For example, you can track the CPU load of your EC2 instances and receive a notification (via email and/or Amazon SNS) if it exceeds 90% for a period of 5 minutes.
Today we are giving you the ability to stop or terminate your EC2 instances when a CloudWatch alarm is triggered. You can use this as a failsafe (detect an abnormal condition and then act) or as part of your application's processing logic (await an expected condition and then act).
Before we dig in, I should remind you of one thing. If you are using EBS-backed EC2 instances, you can stop them at any point and restart them later, and they will keep the same instance ID and root volume. This is, of course, distinct from terminating an instance, which shuts it down for good.
If you (or your developers) are forgetful, you can detect unused EC2 instances and shut them down. You could do this by detecting a very low load average for an extended period of time. This type of failsafe could be used to reduce your AWS bill by making sure that you are not paying for resources you're not actually using.
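Here is a minimal sketch of what such a failsafe could look like if you set it up from code instead of the console, using the AWS SDK for Python (boto3) and its put_metric_alarm call. It is an illustration under stated assumptions, not the only way to do it: it watches the standard CPUUtilization metric rather than a load average (which EC2 does not report on its own), and the 5% threshold, the two one-hour periods, the alarm name, and the instance ID are all made-up values that you would replace with your own.

```python
def idle_stop_alarm(instance_id, region, cpu_threshold=5.0, hours=2):
    """Build the keyword arguments for CloudWatch's put_metric_alarm call.

    The threshold and duration defaults are illustrative assumptions.
    """
    return {
        "AlarmName": f"stop-idle-{instance_id}",  # hypothetical naming scheme
        "AlarmDescription": "Stop the instance when CPU stays low",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",  # stand-in for "very low load"
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 3600,              # one-hour periods
        "EvaluationPeriods": hours,  # consecutive periods below the threshold
        "Threshold": cpu_threshold,
        "ComparisonOperator": "LessThanThreshold",
        # Alarm action that asks EC2 to stop the instance
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:stop"],
    }


def create_alarm(params, region):
    """Send the alarm definition to CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported here so the builder above runs without the SDK

    boto3.client("cloudwatch", region_name=region).put_metric_alarm(**params)
```

Calling create_alarm(idle_stop_alarm("i-0123456789abcdef0", "us-east-1"), "us-east-1") would register the alarm. Keeping the parameter builder separate from the API call makes it easy to review what will be sent before anything is created.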
You could also implement a failsafe that would detect runaway instances (for example, CPU pegged at 100% for an extended period of time). Perhaps your application gets stuck in a loop from time to time (only when you are not looking, of course). You could also use our CloudWatch monitoring scripts to detect and act on other situations, such as excessive memory utilization.
Many AWS applications will pull work from an Amazon SQS queue, do the work, and then pass the work along to the next stage of a processing pipeline. You can detect and terminate worker instances that have been idle for a certain period of time.
You can use a similar strategy to get rid of instances that are tasked with handling compute-intensive batch processes. Once the CPU goes idle and the work is done, terminate the instance and save some money!
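When you create this kind of alarm through the API rather than the console, the stop or terminate action is named by an ARN in the alarm's list of actions. The helper below is a small sketch of how you might build that ARN; the function name is my own invention, and it deliberately accepts only the two actions discussed in this post.

```python
VALID_ACTIONS = ("stop", "terminate")


def ec2_action_arn(region, action):
    """Return the alarm-action ARN that stops or terminates an instance."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}, got {action!r}")
    return f"arn:aws:automate:{region}:ec2:{action}"
```

For a batch worker that should go away once the work is done, you would pass ec2_action_arn("us-east-1", "terminate") as the alarm action; for an instance you might want back later, use "stop" instead.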
You can also create CloudWatch alarms based on Custom Metrics that you observe on an instance-by-instance basis. You could, for example, measure calls to your own web service APIs, page requests, or message postings per minute, and respond as desired.
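To make this concrete, here is a rough sketch of publishing a per-instance custom metric with boto3's put_metric_data call. The metric name PageRequestsPerMinute, the MyApp namespace, and the instance ID are hypothetical; the important part is the InstanceId dimension, which ties each data point to a single instance so that an alarm on the metric has an instance to act on.

```python
from datetime import datetime, timezone


def page_request_datum(instance_id, requests_per_minute, when=None):
    """Build one data point for a per-instance custom metric."""
    return {
        "MetricName": "PageRequestsPerMinute",  # hypothetical metric name
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Timestamp": when or datetime.now(timezone.utc),
        "Value": float(requests_per_minute),
        "Unit": "Count",
    }


def publish(datum, namespace="MyApp", region="us-east-1"):
    """Send the data point to CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported here so the builder above runs without the SDK

    client = boto3.client("cloudwatch", region_name=region)
    client.put_metric_data(Namespace=namespace, MetricData=[datum])
```

Your application would call publish(page_request_datum(instance_id, count)) once a minute. An alarm on this metric would then need to name the same namespace, metric name, and InstanceId dimension.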
Setting Up Alarm Actions
You can set up alarm actions from the EC2 or CloudWatch tabs of the AWS Management Console. Let's say you want to start from the EC2 tab. Right-click on the instance of interest and choose Add/Edit Alarms:
Choose your metrics, set up your notification (SNS topic and optional email), check Take the action, and then choose either Stop this instance or Terminate this instance:
The console will confirm the creation of the alarm, and you're all set (if you asked for an email notification, you need to confirm the subscription within three days):
I can speak for the entire CloudWatch team when I say that we are interested in hearing more about how you will put this feature to use. Feel free to leave a comment and I'll pass it along to them ASAP.