We've created a new video tutorial, which describes how to set up a cluster of high performance compute nodes in under 10 minutes. Follow along with the tutorial to get a feel for how to provision high performance systems with Amazon EC2 - we'll even cover the cost of the resources you use, through a $20 free service credit.
Why HPC?
Data is at the heart of many modern businesses. The tools and products that we create in turn generate complex datasets which are increasing in size, scope and importance. Whether we are looking for meaning within the bases of our genomes, performing risk assessments on the markets or reporting on click-through traffic from our websites, these data hold valuable information which can drive the state of the art forward.
Constraints are everywhere when dealing with data and its associated analysis, but few are as restrictive as the time and effort it takes to procure, provision and maintain the high performance compute servers which drive that analysis.
The cluster compute instance sizes available on Amazon EC2 can greatly reduce this constraint, and give you the freedom to run high specification analyses on demand, as and when you need them. Amazon EC2 takes care of provisioning and monitoring your compute cluster and storage, leaving you more time to dive into your data.
A guided tour
To demonstrate the agility this approach provides, I made a short video tutorial which guides you through how to provision, configure and run a tightly coupled molecular dynamics simulation using cluster compute instances. The whole cluster is up and running in under 10 minutes.
To help you get a feel for this environment, we're also providing $20 of service credits (enough to cover the cost of the demo), so you can follow along with this tutorial for free. To register for your free credits, just follow the link on the tutorial page.
The cluster is quick to get up and running, and each cluster compute instance is no slouch either. Each one uses hardware virtualisation to allow your code to get closer to the dual quad core Nehalem processors, and full bisection 10Gbps networking for high speed communication between instances. Multi-core GPUs are also available - a perfect fit for large scale computational simulation or rendering.
Just as in other fields, cloud infrastructure can help reduce the 'muck' and greatly lower the barrier to entry associated with working with high performance computing. We hope this short video will give you a flavour of what's possible.
Get in touch
Feel free to drop me a line if you have any questions, or you can follow along on Twitter. I also made a longer-form video, which includes a wider discussion on high performance computing with Amazon EC2.
~ Matt


Matt, in the video you talk about creating a template of the first cc1.4xlarge instance that has been manually configured. You selected the create image option under instance actions to do this. What happens to the data initially placed on the 100GiB volume? Will this also be copied into the AMI, so that all the instances launched with this new AMI have a separate copy of the data? In a cluster it would be ideal if all the instances had access to the same set of files instead of a separate local copy each.
Posted by: Krishna Muriki | March 22, 2011 at 01:02 PM
Thanks, Krishna.
Creating a new AMI only takes an image of the server, along with the OS and any configuration held on the ephemeral disk. For the EBS volumes, you can take a snapshot of the volume, and use that snapshot to create new, active volumes which are then attached to instances, either at run time, or at start up.
Whilst this approach is great for a wide range of uses, you're absolutely right that in some cases access to shared data across instances is required. There are a variety of methods to set up a shared filesystem on EC2 with EBS (NFS, Gluster), and I hope to cover this in a future screencast.
Posted by: Matt Wood | March 23, 2011 at 06:37 AM
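The snapshot-and-attach workflow described in the reply above can be sketched in Python. This is a minimal illustration rather than the method used in the video: it assumes the boto3 library (which is not mentioned in the post) and its EC2 client methods `create_snapshot`, `create_volume` and `attach_volume`. The function name `clone_volume_to_instances`, the device name `/dev/sdf` and all IDs are hypothetical placeholders. Each instance ends up with its own independent copy of the data, which is exactly the limitation Krishna points out.

```python
def clone_volume_to_instances(ec2, source_volume_id, instance_ids,
                              availability_zone, device="/dev/sdf"):
    """Snapshot one EBS volume and attach a fresh copy to each instance.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The device name is an assumption; pick one that is free on your AMI.
    """
    # Take a point-in-time snapshot of the volume holding the data.
    snapshot = ec2.create_snapshot(
        VolumeId=source_volume_id,
        Description="Cluster data snapshot",
    )
    snapshot_id = snapshot["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    attached = {}
    for instance_id in instance_ids:
        # A volume must live in the same Availability Zone as its instance.
        volume = ec2.create_volume(
            SnapshotId=snapshot_id,
            AvailabilityZone=availability_zone,
        )
        volume_id = volume["VolumeId"]
        ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

        # Attach the new volume; it still needs mounting inside the OS.
        ec2.attach_volume(
            VolumeId=volume_id,
            InstanceId=instance_id,
            Device=device,
        )
        attached[instance_id] = volume_id

    return snapshot_id, attached
```

With real credentials this might be called as `clone_volume_to_instances(boto3.client("ec2"), "vol-...", ["i-...", "i-..."], "us-east-1a")`, where the IDs and zone are your own. For data that every node must see at once, a shared filesystem such as the NFS or Gluster options mentioned in the reply is the better fit.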
Hi Matt,
Very nice video tutorial. I'm interested in the second part of it, which you mentioned at the end. Do you have a link for it?
Thanks,
Patrick
Posted by: Patrick Hu | October 29, 2011 at 01:27 PM