Last week at the AWS re:Invent conference, we announced the results of the first-ever Amazon EC2 Spotathon. Stephen Elliott, Senior Product Manager on the Amazon EC2 team, provided us with a very informative guest post with a full recap of the proceedings.
We set up the Spotathon to publicly recognize some of the most exciting applications of EC2 Spot Instances, assessing each submission based on the cost savings, performance benefits, and compute scale achieved on Spot, as well as the overall elegance of the solution.
As you may know, Spot Instances are excess EC2 instances whose price is based on real-time supply and demand. When you request an EC2 Instance via a Spot request, you specify a bid price; as long as your bid exceeds the Spot price, you are provisioned a Spot Instance. When the Spot price exceeds your bid, your Spot Instance is interrupted. See the Spot Instances page for more details.
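The bid rule above can be illustrated with a small simulation. This is only a sketch: the `SpotRequest` class, the function names, and the prices are made up for illustration, and treating a bid exactly equal to the Spot price as "running" is an assumption, since the text only covers the cases where one exceeds the other.

```python
from dataclasses import dataclass


@dataclass
class SpotRequest:
    """Hypothetical stand-in for a Spot request; not an AWS API object."""
    bid: float  # maximum hourly price the requester is willing to pay (USD)


def instance_state(request: SpotRequest, spot_price: float) -> str:
    """Return 'running' while the bid covers the Spot price, else 'interrupted'.

    Assumption: a bid equal to the Spot price keeps the instance running.
    """
    return "running" if request.bid >= spot_price else "interrupted"


def simulate(request: SpotRequest, price_history: list[float]) -> list[str]:
    """Evaluate the bid against each price in a (made-up) price history."""
    return [instance_state(request, price) for price in price_history]


# Illustrative prices only; real Spot prices vary by instance type and region.
prices = [0.030, 0.045, 0.062, 0.040]
states = simulate(SpotRequest(bid=0.05), prices)
print(states)
```

With a bid of $0.05, the instance is interrupted only during the period in which the illustrative price rises to $0.062.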
We are routinely impressed by the sophistication of our customers’ uses of Spot Instances, and the Spotathon was no exception: the high quality of submissions made judging extremely difficult. We noted that many of the entrants used Spot Instances to construct computational platforms that reduce costs and time-to-results, enabling their customers to experiment faster, arrive at insights sooner, and act on those insights with their cost savings.
Grand Prize: PiCloud’s Platform-as-a-Service for High Performance Computing
We are pleased to award the Grand Prize of $2,500 in AWS credit to PiCloud’s Platform-as-a-Service for high performance computing (HPC), batch processing and scientific computing. PiCloud provides high-level APIs that scientists and engineers can use to submit units of computational work—like finding nucleotide sequences in a genome, conducting oil and gas geophysics simulations, or doing financial risk analytics—rather than provisioning, administering, and tearing down instances themselves. By running 85% on Spot Instances, PiCloud provisions 50% more servers at the same cost, improves its customers’ experience by delivering results 33% faster, and saves 65% over the On-Demand price. PiCloud has served thousands of researchers who have collectively processed over 100 million jobs, and is exemplary in how it uses AWS and Spot Instances to reduce researchers’ “time to science”.
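The "50% more servers" and "33% faster" figures are consistent with each other if the workload scales perfectly across servers. That scaling assumption is ours, not something PiCloud's submission states, and the function name below is hypothetical:

```python
def time_saved_with_extra_servers(extra_fraction: float) -> float:
    """Fraction of wall-clock time saved when a fleet grows by `extra_fraction`.

    Assumes perfectly parallel work: N servers finish in 1/N of the time.
    """
    return 1.0 - 1.0 / (1.0 + extra_fraction)


# 50% more servers -> the same work takes 1/1.5 of the time.
time_saved = time_saved_with_extra_servers(0.50)
print(f"{time_saved:.0%} less time to results")
```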
Here’s an example of how simply you can use PiCloud’s platform (taken from their homepage). Like all elegant abstractions, the simplicity of the interface belies the sophistication of the implementation:
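The sketch below is a stand-in for that style of interface, not PiCloud's actual API. It uses Python's standard-library `concurrent.futures` to show the same pattern: hand over a plain function as a unit of work, then collect the result later. On a platform like PiCloud the work would run on remote instances rather than in a local thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """An ordinary function; the caller never provisions a server for it."""
    return x * x


# Local thread pool standing in for a remote compute platform (assumption).
with ThreadPoolExecutor(max_workers=4) as pool:
    job = pool.submit(square, 3)   # submit a unit of computational work
    result = job.result()          # block until the result is available

print(result)
```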
For more details about their architecture, read their Spotathon submission.
Runner-Up: Princeton Consultants’ OptiSpotter for High Frequency Trading Research
We are also excited to award the Runner-Up Prize of $1,000 in AWS credit to Princeton Consultants’ High Frequency Trading financial research application, OptiSpotter. In the investing world, as in many others, speed to result is a crucial competitive advantage. Princeton Consultants realized that the computational scale and cost-competitiveness that can be achieved on Spot Instances would allow start-up hedge funds to master the sheer quantity of financial data (hundreds of terabytes) and compete against the dominant firms by enabling them to rapidly and inexpensively test and tune new investment theses. With OptiSpotter, researchers consume tens of thousands of instance hours on Spot and save up to 90% on their compute bill. More importantly, they can get feedback on their investment theses in hours or less, meaning they can iterate and tune an idea several times a day, rather than having to wait until the next morning (or for days) to back-test a new algorithm.
OptiSpotter maps massive, multi-hour jobs into thousands of small sub-jobs, queues them based on memory and I/O requirements, and then monitors the Spot price history and the queues of outstanding jobs to determine the most efficient way to deploy Spot Instances. Visit www.OptiSpotter.com to learn more about OptiSpotter and Princeton Consultants, an IT and management consulting firm.
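A rough sketch of that split, queue, and place pattern follows. It is not OptiSpotter's implementation: every class, function, instance-type name, and price here is hypothetical, and choosing the candidate with the lowest mean recent price is just one simple way to read "most efficient".

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SubJob:
    job_id: str
    memory_gb: float
    io_heavy: bool


@dataclass(frozen=True)
class InstanceType:
    name: str
    memory_gb: float
    fast_io: bool


def split_job(job_id, n_parts, memory_gb, io_heavy):
    """Map one large job into n_parts small sub-jobs with the same requirements."""
    return [SubJob(f"{job_id}-{i}", memory_gb, io_heavy) for i in range(n_parts)]


def queue_by_requirements(sub_jobs):
    """Group sub-jobs into queues keyed by (memory, I/O) requirements."""
    queues = defaultdict(list)
    for sub_job in sub_jobs:
        queues[(sub_job.memory_gb, sub_job.io_heavy)].append(sub_job)
    return queues


def cheapest_fit(requirement, instance_types, price_history):
    """Pick the instance type that fits and has the lowest mean recent Spot price."""
    memory_gb, io_heavy = requirement
    candidates = [
        t for t in instance_types
        if t.memory_gb >= memory_gb and (t.fast_io or not io_heavy)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: mean(price_history[t.name]))


# Made-up instance types and Spot price histories (USD per hour).
types = [
    InstanceType("small", 2, False),
    InstanceType("large", 8, True),
    InstanceType("xlarge", 16, True),
]
history = {"small": [0.010, 0.012], "large": [0.05, 0.07], "xlarge": [0.04, 0.05]}

sub_jobs = split_job("backtest", 3, memory_gb=4, io_heavy=True)
sub_jobs += split_job("tune", 2, memory_gb=1, io_heavy=False)

queues = queue_by_requirements(sub_jobs)
plan = {req: cheapest_fit(req, types, history).name for req in queues}
print(plan)
```

In this made-up data the larger instance type happens to be cheaper on average than the mid-sized one, so the I/O-heavy queue is placed on it. A real scheduler would also weigh queue depth and interruption risk.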
We’d now like to recognize two Honorable Mentions, both of which were contenders for the top prizes:
The first is Numerate’s drug discovery application built on its Numatix platform. Numatix accelerates drug discovery while reducing EC2 compute costs by over 80%. Numerate’s proprietary machine learning algorithms predict the properties of small (drug-like) molecules, and Numerate runs Numatix on EC2 Spot Instances, scaling to 10,000 cores, to search large sets of molecules (>100 million) and identify those likely to lead to new drugs. All this for $100 per hour. Numerate’s use of Spot Instances enables them to search enormous chemistry spaces in hours, and to flexibly decide how fast they require results and how deep to conduct their analyses. Numerate plans to open Numatix up for broader use beyond drug discovery and is another exemplary case of a powerful cloud solution that reduces computational costs and time-to-results so that scientists can rapidly iterate on their discoveries. See their website for more information.
The second is Lawrence Berkeley National Laboratory’s (LBL) Turbine Science Gateway, developed by Joshua Boverhof. The Turbine Science Gateway (TSG) supports the Department of Energy’s Carbon Capture Simulation Initiative (CCSI) by providing a web application and execution environment for running and managing scientific applications (like AspenTech’s AspenPlus) and for storing and archiving results. Using TSG, simulation runs that would take months on a single machine can be done overnight on EC2, running tens of thousands of simulations on hundreds of Spot Instances and saving over 70% on EC2 compute costs.
Other Notable Submissions
The following compelling Spot use cases made the judging even more difficult and demonstrate the broad applicability of opportunistic computing on Spot Instances:
- The World Resources Institute’s Forest Monitoring for Action (FORMA) project analyzes MODIS satellite imagery to detect forest clearing events, using Spot Instances and Elastic MapReduce (EMR) to conduct 25,000+ CPU-hour runs. FORMA saves 73% by using Spot versus On-Demand and, more importantly, it achieves finer temporal and spatial resolution than ever before: new maps used to lag forest clearing events by several years, but FORMA can generate them every few weeks as imagery becomes available. See FORMA’s presentation on deforestation in Sumatra.
- Althea’s Shufflr is a multi-screen personalized social video discovery service. Spot Instances run their social graph data fetchers, video indexers and filters, background (asynchronous) tasks, and test (staging/pre-production) environments, save them over 75% versus On-Demand, and have helped them design a fault-tolerant and cost-effective EC2 architecture. See their blog post for more details.
Scientific and Engineering Simulations
- Maxwell, from the Jelena Vuckovic Lab at Stanford University and designed by Jesse Lu, harnesses hundreds of Spot Instances to rapidly solve large numbers of electromagnetic simulations in parallel without requiring users to leave their local MATLAB environment, returning results faster and achieving 80% savings off On-Demand prices.
- Upverter incorporates Spot Instances into its enterprise-level cloud engineering platform to save money while boosting the speed of Electronic Design Automation jobs.
Big Data and Machine Learning
- Myrrix’s Scalable Recommender System is a real-time recommender system (built on Apache Mahout) that uses Spot with EMR to increase the scale and speed of its Hadoop computation layer by a factor of three. A separate post details how Myrrix processed 337 million Wikipedia links, consuming 1,320 instance-hours (normalized to m1.small) over 10 hours, for under $44 (a savings of 67% versus On-Demand).
- Qubole’s big data Platform-as-a-Service provides fast and reliable access to large unstructured data-sets hosted on Amazon’s S3 and EC2 services. To optimize its costs while guaranteeing a stable Hadoop cluster, Qubole auto-scales its clusters using both On-Demand and Spot Instances, resulting in a 30% lower EC2 bill. See Qubole’s blog post about their system’s design.
- BuildFax’s Multi-Cloud Mirror uses a stateless Spot application to conduct file-mirroring operations on sets of millions of addresses of building, remodeling and repair data. Using RightScale’s Cloud Management platform, it schedules its mirroring tasks to meet its Recovery Point Objective of 6 hours. See BuildFax’s blog post and source code for more details.
- Finally, ed2c’s application was a simple yet perfect use of Spot. For cents on the dollar, ed2c hosts Berkeley Open Infrastructure for Network Computing (BOINC) applications (e.g., SETI@home or Einstein@Home) on Spot to help advance science and discovery; Spot interruptions are a non-issue because BOINC is inherently fault-tolerant.
As the diversity of submissions shows, a broad range of use cases can take advantage of the scale and cost-effectiveness of Spot Instances once the application is made fault-tolerant. To learn more about Spot and how to use it, please visit the Spot Instances web page.
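One common way to make batch work tolerant of Spot interruptions is to keep tasks in a queue and put a task back whenever the instance running it is reclaimed. The sketch below simulates that pattern locally. The `Interrupted` exception, `run_all`, and the flaky worker are all hypothetical, and the approach assumes tasks are idempotent so that re-running one is safe.

```python
from collections import deque


class Interrupted(Exception):
    """Raised by the simulated worker when its Spot Instance is reclaimed."""


def run_all(tasks, run_task):
    """Run every task to completion, re-queuing any that are interrupted."""
    pending = deque(tasks)
    results = {}
    while pending:
        task = pending.popleft()
        try:
            results[task] = run_task(task)
        except Interrupted:
            pending.append(task)  # no result was recorded, so retry later
    return results


attempts = {}


def flaky_square(n):
    """Simulated worker: the first attempt at each odd task is interrupted."""
    attempts[n] = attempts.get(n, 0) + 1
    if n % 2 == 1 and attempts[n] == 1:
        raise Interrupted()
    return n * n


results = run_all(range(5), flaky_square)
print(results)
```

Every task finishes despite the simulated interruptions; the interrupted ones simply run a second time.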
Finally, thanks to everyone who entered the Amazon EC2 Spotathon. We gained invaluable insight into how you’re leveraging Spot, and we look forward to seeing what more can be done!