Intridea Rolls Out Scalr, Updates MediaPlug, Builds Community

The folks at Intridea have been coding up a storm!

In time for this week's RailsConf, they are rolling out a hosted, on-demand version of their popular Scalr tool, a new release of their MediaPlug media server appliance, and the Acts As Community social network.

Scalr gives enterprise IT professionals the ability to quickly and easily set up and run EC2-powered server farms. Once such a farm has been set up, Scalr monitors and maintains it, with automatic scaling, failover, and redundancy. Scaling is based on load averages, with automatic instantiation of new instances of the proper type once the aggregate load average reaches a configurable threshold. If an instance crashes, Scalr replaces it with a new instance of the proper type. The hosted version is available for $50 per month; the open source version can be downloaded here.
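To make the scaling and failover rules a bit more concrete, here is a minimal Python sketch of the kind of decision a monitoring pass might make. It is not Scalr's code: the `Instance` class, the `plan_actions` function, and the choice of the mean as the "aggregate" load average are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Instance:
    instance_id: str
    load_average: float
    healthy: bool = True


def plan_actions(farm, threshold):
    """Decide what one monitoring pass should do for a farm of instances."""
    # Failover: every crashed instance gets a replacement.
    actions = [("replace", inst.instance_id) for inst in farm if not inst.healthy]

    # Scaling: compare the aggregate load of the healthy instances
    # (taken here as their mean, which is an assumption) with the threshold.
    healthy = [inst for inst in farm if inst.healthy]
    if healthy and mean(inst.load_average for inst in healthy) >= threshold:
        actions.append(("launch", None))
    return actions
```

A real monitor would run a pass like this on a timer and translate each action into EC2 calls; the sketch only shows the decision itself.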

MediaPlug is an EC2-powered media server appliance, packaged and sold as an AMI (Amazon Machine Image). It supports transcoding, uploading, and storage of images, audio files, and video files in a number of popular formats. There's a MediaPlug web service for direct integration into backend code, along with a JavaScript library for the front-end. MediaPlug is priced at double the cost of a Small EC2 instance, or approximately $150 per month.

Finally, Acts As Community is a social network for Ruby developers. Of course, the site runs on Amazon EC2 and is managed by Scalr. The site includes forums and blogs for open source projects hosted at GitHub, user group creation and management tools, personal profiles, and a number of community features such as forums, questions and answers, media sharing, and code sharing.

-- Jeff;

More EC2 Power

Amazon EC2 users now have access to a pair of new "High-CPU" instance types. The new instance types have proportionally more CPU power than memory, and are suitable for CPU-intensive applications. Here's what's now available:

The High-CPU Medium Instance is billed at $0.20 (20 cents) per hour. It features 1.7 GB of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each), and 350 GB of instance storage, all on a 32-bit platform.

The High-CPU Extra Large Instance is billed at $0.80 (80 cents) per hour. It features 7 GB of memory, 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each), and 1,690 GB of instance storage, all on a 64-bit platform.

The AWS Simple Monthly Calculator now supports these new instance types.
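For a quick back-of-the-envelope comparison, the figures above can be turned into monthly and per-compute-unit costs. This is just a sketch of the arithmetic, not the calculator's logic; the function names and the 30-day month are assumptions.

```python
from decimal import Decimal

# Figures quoted in this post: (hourly price in USD, EC2 Compute Units).
INSTANCE_TYPES = {
    "High-CPU Medium": (Decimal("0.20"), 5),
    "High-CPU Extra Large": (Decimal("0.80"), 20),
}

HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month of continuous use


def monthly_cost(name, instances=1):
    """Cost of running `instances` copies of one type for a full month."""
    hourly, _ = INSTANCE_TYPES[name]
    return hourly * HOURS_PER_MONTH * instances


def price_per_compute_unit_hour(name):
    """Hourly price divided by the number of EC2 Compute Units."""
    hourly, units = INSTANCE_TYPES[name]
    return hourly / units
```

Under these assumptions a High-CPU Medium instance comes to $144 per month, and both types work out to $0.04 per EC2 Compute Unit per hour.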

We've been working with a number of tool vendors to line up early support for this important new feature. I plan to update the blog post several times in the coming days as this support becomes available.

-- Jeff;

Napera Networks - Network Health Solution

Seattle-based Napera Networks has built a really interesting piece of hardware. The Napera N24 Appliance automatically enforces health and identity rules for up to 200 computers, preventing rogue or unpatched machines from accessing a corporate network in an uncontrolled fashion.

The device basically protects access points in conference rooms, sales offices and wireless networks from indiscriminate usage. Napera uses the Microsoft NAP (Network Access Protection) protocol to enforce a fully configurable set of health and safety rules. Logging and monitoring features provide auditing and other historical information.

At this point you are probably thinking "Uh, Jeff, you work for Amazon.com and you usually talk about software on this blog. What's going on here?"

So here's where it gets really cool. The Napera N24 appliance securely stores all of its long-term logging and auditing data in Amazon S3. This way, there's no limit to how much can be logged or how long it is retained, unlike a more traditional design which would retain some logs within the appliance or require local servers. Further, the administrative interface (MyNapera.com) leverages Amazon EC2. Ultimately MyNapera.com will be accessible remotely, making the Napera appliance a great solution for a company with a number of geographically distributed field offices.

Enterprise solutions to this sort of problem are usually very complex and expensive, requiring multiple servers. Using AWS has enabled Napera to build a product for the budget of smaller companies that is incredibly easy to use and quick to install.

Oh yeah, one last thing. The company is based in Mercer Island (near Seattle). Per their jobs page, they are looking for a web developer and some summer interns with CS or IT experience.

-- Jeff;

New York Times TimesMachine

Derek Gottfrid and his colleagues at the New York Times have obviously been having a lot of fun with Amazon EC2.

Their latest offering is the TimesMachine. Print subscribers can access any issue of the New York Times, dating back to Volume 1, Number 1 in 1851. Non-subscribers can take a peek at 6 different (and historically significant) issues, including the inaugural edition, the end of World War I, and the sinking of the Titanic.

As they explained in their blog post, they used EC2, Hadoop, and some of their own code to convert 405,000 large TIFF images, 3.3 million SGML files, and 405,000 XML files to 810,000 PNG images and 405,000 JavaScript files. This didn't take all that long:

"By leveraging the power of AWS and Hadoop, we were able to utilize hundreds of machines concurrently and process all the data in less than 36 hours."

The content itself is really interesting, but I also enjoyed the fact that it was possible to see the articles in the context of the other issues of the day. The advertising is also interesting.

Robert Scoble has more coverage, including a video interview with Derek.

-- Jeff;

Vertica Webinar

I will be participating in a webinar on Thursday, May 22nd at 11 AM PST. Hosted by AWS user Vertica, the webinar will cover Vertica's cloud-based approach to analytic data management. The webinar is free, but you will have to register in advance if you would like to attend.

The Vertica Analytic Database runs on Amazon EC2 and S3 and is hosted completely within the Amazon cloud. Using this approach, they are able to smoothly scale to meet large and complex workloads, while also supporting automatic replication, failover, and recovery.

Unlike a traditional database install where you would have to pay for a data center, hardware, software, and administrators before you can store a single row, the Vertica solution is priced on a per-month, per-node basis. New nodes are available 30 minutes after receipt of order! In true cloud-based fashion, payment is handled through the Amazon Flexible Payments Service.

A traditional relational database takes a row-oriented approach to data storage. A fixed or variable block of contiguous space is allocated to each row. Vertica, by contrast, takes a column-oriented approach (hence the company name). The data is grouped by column instead of by row. This opens the door to many types of optimizations. Processing a single column of a database which has a large number of rows becomes very efficient, as does compression. Benchmarks indicate that this approach can be 30 to 200 times faster.
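The difference between the two layouts is easy to see in a few lines of Python. This is a toy illustration only, with made-up sample data, and it says nothing about how Vertica actually stores or compresses its columns; run-length encoding is simply one compression scheme that benefits from grouping values by column.

```python
from itertools import groupby

# Row-oriented: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 20},
    {"order_id": 2, "region": "EU", "amount": 35},
    {"order_id": 3, "region": "US", "amount": 10},
    {"order_id": 4, "region": "US", "amount": 5},
    {"order_id": 5, "region": "US", "amount": 30},
]


def to_columns(rows):
    """Regroup the same data by column: one list of values per field."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
total = sum(columns["amount"])                  # reads one list, not every row
regions = run_length_encode(columns["region"])  # [("EU", 2), ("US", 3)]
```

Summing `amount` touches a single contiguous list, and the repetitive `region` column shrinks from five values to two pairs.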

I will be speaking about EC2 and S3 and about some general cloud computing concepts. I hope that you can join in.

-- Jeff;

Lots of Bits

In January of 2008 we announced that Amazon Web Services now consumes more bandwidth than does the entire global network of Amazon.com retail sites.

Amazon.com CEO Jeff Bezos has been showing a chart of the relative bandwidth usage and I just received permission to post it here:

[Chart: bandwidth usage of Amazon Web Services relative to the Amazon.com retail sites]

Pretty cool, huh?

-- Jeff;

Cloud Studio

Alexsey and Tatyana from Cloud Services dropped me an email to tell me about the beta release of their new Cloud Studio product.

Cloud Studio is a Java application for the management of Amazon EC2 instances. It features a multi-pane interface with a list of available AMIs, a list of running instances, and access to keypairs, security groups, and IP addresses. Menu options are provided for image registration and deletion, keypair manipulation, security group editing, and IP address assignment.

The application can be run standalone or it can be run from within Eclipse.

You can see a Flash demo on the home page, or you can simply download it.

-- Jeff;

New FPS Marketplace Widget and More FPS Features

The new Amazon FPS (Flexible Payments Service) Marketplace Widget gives developers the ability to create a widget which can move money between two other parties, with complete control of fees paid to the developer. Money moves from buyer to seller, and the seller pays a fee to the developer. The fee can be a fixed amount and/or a percentage of the transaction.
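Here is a small sketch of how a fixed-plus-percentage fee splits one payment between the seller and the developer. It is plain arithmetic, not the FPS API: the `split_payment` function, its parameters, and the round-half-up-to-the-cent rule are assumptions for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def split_payment(amount, fixed_fee=Decimal("0"), percent_fee=Decimal("0")):
    """Return (seller_proceeds, developer_fee) for one buyer payment.

    The fee is a fixed amount and/or a percentage of the transaction,
    rounded to the nearest cent (the rounding rule is an assumption).
    """
    fee = fixed_fee + amount * percent_fee / Decimal("100")
    fee = fee.quantize(CENT, rounding=ROUND_HALF_UP)
    if fee > amount:
        raise ValueError("fee exceeds the payment amount")
    return amount - fee, fee
```

For example, a $50.00 payment with a $0.30 fixed fee plus 2.5% leaves $48.45 for the seller and $1.55 for the developer.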

This new widget is ideal for creating shopping carts, e-commerce platforms, and marketplaces. Prior to the introduction of this widget, developers would need to make multiple calls to FPS and to juggle several tokens to implement the same functionality.

Once you are signed up for FPS, you can use our interactive builder tool to create the HTML for a Marketplace Widget in minutes. You'll need to write a bit of server-side code to insert the proper request signature into the HTML, but that's about it.

We've also added some new features to FPS itself and to the existing Pay Now Widget:

  • API-level support for the fee mechanism used by the Marketplace Widget.
  • The Pay Now Widget now supports "reserve and settle," allowing funds to be reserved on a credit card without making an actual charge.
  • Both widgets now support an Instant Payment Notification (IPN) feature. An HTTP POST call is made to a third-party URL each time that a payment or refund is processed.
  • A new Refund call automates the process of generating refunds.
  • Additional transaction details, including the buyer's email address, are now available to sellers.

-- Jeff;

Help Wanted: More AWS Job Openings

Earlier today I met with the AWS Platform (AWSP) team.

This team is responsible for the heavy duty infrastructure pieces which are common to our line of web services. These vital pieces of our infrastructure take care of authenticating and authorizing requests for AWS services, capturing usage information, billing customers for usage, analyzing large volumes of data for internal and external reporting, and so forth. Needless to say, these internal components need to be perfectly reliable, highly scalable, and very efficient. The team is also responsible for SQS, the Amazon Simple Queue Service.

The team is in expansion mode and wanted to make sure that the world was aware of their open positions. Here's what they are looking for:

Principal Software Development Engineer - "As a Principal Engineer on the AWS Platform team, you will drive, architect and implement core functionalities in the Billing, Accounts, Products and Payments domains that has stringent Service Level Agreements (SLAs) for Availability, Reliability and Performance. You will play a critical role in shaping the overall structure of Amazon’s web-service offerings."

Software Development Engineer - "As a software developer on the AWS Platform team, you will work independently with software engineers and program/product managers to create distributed applications and services, front-end API’s for developers and website owners to consume and be responsible for the design and development of various aspects of the AWS Platform."

Software Development Engineer, Amazon SQS - "As a software engineer for SQS, you will find some challenging projects from our rapid growth that need creative solutions.  You should be somebody who enjoys working on complex system software, is customer-centric, and feels strongly not only about building good software but about making that software achieve its goals in operational reality."

Technical Program Manager, Authentication and Authorization - "The AWSP Authentication and Authorization team is seeking a dynamic, entrepreneurial Technical Program Manager (TPM) to drive the technical strategy and execution of key security services. As a key member of a small team, you will have the unique opportunity to influence and shape the entire AWSP strategy.  You will have complete end-to-end technical responsibility for the component you own from defining the product roadmap to initiating, defining and executing the projects necessary to make the roadmap a reality."

Technical Program Manager, Business Intelligence - "The AWS Platform (AWSP) team delivers business intelligence to over 100 internal customers, and the technology gets leveraged to deliver reports to a diverse community of external customers. Amazon Web Services has a culture of data-driven decision-making, and demands business intelligence that is timely, accurate, and actionable. If you join the AWSP team your work will have an immediate influence on day-to-day decision making at Amazon Web Services."

You might want to learn more about Amazon, about our values or our benefits.

As someone who has been at Amazon for almost 6 years, I have to say that this is truly a great place to work! The work itself is varied, interesting, and ever-challenging. There's a very deep and very solid technical and managerial backbone in place. People aren't afraid to dive deep in order to solve thorny problems, and are rewarded for doing so. Our environment is friendly to dogs, and we like to have fun while we change the world.

-- Jeff;

Amazon SimpleDB Case Studies - ShareThis and Alexa

Even though Amazon SimpleDB is still a beta product, progressive developers are already learning about it and building highly scalable applications. In fact, we just released a pair of case studies.

ShareThis has been deployed to over 30,000 web sites. Faced with rapid growth, the team considered three storage options and chose SimpleDB for its responsiveness, reliability, zero software cost, minimal staff costs, and low barrier to development. They used EC2, SimpleDB, S3, and SQS to build a complete, loosely coupled, fault-tolerant system in the cloud, with an estimated savings of $200,000. Read all about it!

The Alexa Site Thumbnail Service uses SimpleDB to store intermediate status and log data, allowing the team to store and deliver millions of thumbnails. They store over 12 million objects in SimpleDB and perform over 5 million queries every day. Read all about it!

-- Jeff;

Redundant Disk Storage Across Multiple EC2

XML Hacker M. David Peterson has put together a really interesting article.

As part of his work at 3rd and Urban, he has implemented redundant, fault-tolerant, read-write disk storage on Amazon EC2 using a number of open source tools and applications including LVM, DRBD, NFS, Heartbeat, and VTUN.

Mark notes that "the primary focus of this paper is to present both a detailed overview as well as a working code base that will enable you to begin designing, building, testing, and deploying your EC2-based applications using a generalized persistent storage foundation, doing so today in both lieu of and in preparation for release of Amazon Web Services offering in this same space."

The article provides complete implementation details and links to source code for the scripts that Mark developed.

You can read the article, and you can also follow progress via the discussion group.

-- Jeff;

Use Amazon SQS to Build Self-Healing Applications

Quite a few people ask us about best practices that they should consider when architecting solutions in the cloud. This post covers just one best practice: how to use Amazon Simple Queue Service to build self-healing applications. The basic idea is that you can create resilient and self-healing applications by implementing a Service-Oriented Architecture that follows these three principles:

  1. Each component operates on its own, without relying on the component before or after it
    • Read from and write to a message queue at the boundary of each workflow stage in your application
  2. If the component fails, restart automatically
  3. Design for n + 1
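The principles above can be sketched in a few lines of Python. To keep the example self-contained it uses the standard library's `queue.Queue` as a local stand-in for an SQS queue; `run_stage`, `fail_once`, `drain`, and the `(attempts, body)` message shape are all hypothetical names invented for this illustration, not part of any AWS library.

```python
import queue


def run_stage(inbox, outbox, handler, max_attempts=3):
    """Process every message in inbox, writing results to outbox.

    Messages are (attempts, body) pairs. A body whose handler raises is put
    back on the inbox so that a later pass (or another worker) can retry it.
    """
    while True:
        try:
            attempts, body = inbox.get_nowait()
        except queue.Empty:
            return
        try:
            result = handler(body)
        except Exception:
            # The stage failed: requeue the work instead of losing it.
            if attempts + 1 < max_attempts:
                inbox.put((attempts + 1, body))
            continue
        outbox.put((0, result))


def fail_once(handler):
    """Wrap handler so the first call for each body raises (simulated crash)."""
    seen = set()

    def wrapped(body):
        if body not in seen:
            seen.add(body)
            raise RuntimeError("simulated component failure")
        return handler(body)

    return wrapped


def drain(q):
    """Return the bodies currently waiting in a queue, in order."""
    bodies = []
    while not q.empty():
        bodies.append(q.get_nowait()[1])
    return bodies
```

Because each stage talks only to the queues on either side of it, a crashed stage can simply be restarted and pick up where it left off. With a real SQS queue the requeue step is usually implicit: a message that is received but never deleted becomes visible again once its visibility timeout expires.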

Rather than repeat the details here, I just posted a short five-minute video on this subject on the Amazon Web Services Resource Center. Click here to view it in either Windows Media or Flash format.

Also, the following links are useful references for learning about how to use Amazon SQS and Amazon EC2 together:

  • Get started with Amazon SQS and Amazon EC2
  • Sample application to get started with Amazon SQS and Amazon EC2
  • SQS-EC2 Job Processor Sample AMI

-- Mike

High Performance Multithreaded Access to Amazon SimpleDB

We have just released a new code sample.

Written in Java, this new sample shows how Amazon SimpleDB can be used as a repository for metadata which describes objects stored in Amazon S3. The code was written to illustrate best practices for indexing S3 data and for getting the best indexing and query performance from SimpleDB.

Indexing is implemented at two levels. At the first level, multiple threads (implemented using the Java Executor) are used to ensure that a number of S3 reads and a number of SimpleDB writes are taking place simultaneously. At the second level, Amazon SQS is used to coordinate index tasks running on multiple systems, leading to an even higher degree of concurrency.

Bulk queries are implemented using a pair of thread pools. The first pool runs SimpleDB queries and the second retrieves SimpleDB attributes. With the proper balance between the two pools, a Small Amazon EC2 instance was able to make over 300 requests per second.
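The real sample is written in Java, but the two-pool idea is easy to sketch in Python with `concurrent.futures`. Everything below is a stand-in: `FAKE_DOMAIN`, `run_query`, `get_attributes`, and `bulk_query` are hypothetical names operating on an in-memory dictionary rather than on SimpleDB, and the pool sizes are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a SimpleDB domain: item name -> attributes.
FAKE_DOMAIN = {
    "photo-1": {"type": "jpeg", "size": "2048"},
    "photo-2": {"type": "png", "size": "512"},
    "photo-3": {"type": "jpeg", "size": "4096"},
}


def run_query(attribute, value):
    """Stand-in for a query call: return the names of matching items."""
    return [name for name, attrs in FAKE_DOMAIN.items()
            if attrs.get(attribute) == value]


def get_attributes(item_name):
    """Stand-in for an attribute fetch: return one item's attributes."""
    return dict(FAKE_DOMAIN[item_name])


def bulk_query(queries, query_workers=2, attribute_workers=4):
    """Run queries in one pool and fetch the matching items in another."""
    with ThreadPoolExecutor(query_workers) as query_pool, \
            ThreadPoolExecutor(attribute_workers) as attribute_pool:
        # Pool 1: run all of the queries concurrently.
        query_futures = [query_pool.submit(run_query, attribute, value)
                         for attribute, value in queries]
        # Pool 2: as each query's result is collected, fan out one
        # attribute fetch per distinct item name.
        attribute_futures = {}
        for future in query_futures:
            for name in future.result():
                if name not in attribute_futures:
                    attribute_futures[name] = attribute_pool.submit(
                        get_attributes, name)
        return {name: f.result() for name, f in attribute_futures.items()}
```

Tuning `query_workers` against `attribute_workers` is the "proper balance" knob: too few attribute workers and query results pile up, too many and the query pool becomes the bottleneck.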

Check it out!

-- Jeff;

Amazon S3 Copy API Ready for Testing

A few weeks ago we asked our developer community for feedback on a proposed Copy feature for Amazon S3. The feedback was both voluminous and helpful to us as we finalized our plans and designed our implementation.

This feature is now available for beta use; you can find full documentation here (be sure to follow the links to the detailed information on the use of this feature via SOAP and REST). Copy requests are billed at the same rate as PUT requests: $0.01 per 1,000 requests in the US, and $0.012 per 1,000 requests in Europe.

In addition to the obvious use for this feature -- creating a new S3 object from an existing one -- you can also use it to rename an object within a bucket or to move an object to a new bucket. You can also update the metadata for an object by copying it to itself while supplying new metadata.
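To show how rename, move, and metadata update all reduce to a copy (plus, where needed, a delete), here is a toy in-memory model in Python. It is not the S3 API: `ToyStore` and its methods are hypothetical, and a real client would issue the corresponding SOAP or REST requests described in the documentation.

```python
class ToyStore:
    """In-memory model of an object store: (bucket, key) -> (data, metadata)."""

    def __init__(self):
        self.objects = {}

    def put(self, bucket, key, data, metadata=None):
        self.objects[(bucket, key)] = (data, dict(metadata or {}))

    def delete(self, bucket, key):
        del self.objects[(bucket, key)]

    def copy(self, src_bucket, src_key, dst_bucket, dst_key, metadata=None):
        """Create the destination from the source, optionally with new metadata."""
        data, old_metadata = self.objects[(src_bucket, src_key)]
        new_metadata = old_metadata if metadata is None else metadata
        self.objects[(dst_bucket, dst_key)] = (data, dict(new_metadata))

    def rename(self, bucket, old_key, new_key):
        """Rename within a bucket: copy to the new key, then delete the old one."""
        if old_key == new_key:
            return
        self.copy(bucket, old_key, bucket, new_key)
        self.delete(bucket, old_key)

    def move(self, src_bucket, key, dst_bucket):
        """Move to another bucket: copy across, then delete the original."""
        if src_bucket == dst_bucket:
            return
        self.copy(src_bucket, key, dst_bucket, key)
        self.delete(src_bucket, key)

    def update_metadata(self, bucket, key, metadata):
        """Update metadata by copying the object onto itself."""
        self.copy(bucket, key, bucket, key, metadata=metadata)
```

Note that in this model a rename or move is two steps, so a failure between the copy and the delete leaves both objects in place rather than losing data.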

Still on the drawing board is support for copying between US and Europe, and a possible conditional copy feature. Both of these items surfaced as a result of developer feedback.

Tool and library support for this new feature is already starting to appear; read more about that in this discussion board thread.

-- Jeff;

On Condor and Grids

There is lots of buzz about Hadoop and Amazon EC2, and of course there should be, given all the great projects such as the New York Times one, where they converted old articles into PDF files in short order at a very reasonable cost.

There’s a second environment you should know about, although the buzz level is a bit lower. (That might change.) Condor is a scheduling application that is commonly used in HPC and grid applications. It can also be used to manage Hadoop grids, and manages “jobs” in much the same manner as mainframes—that is, you submit a job to Condor, along with metadata that describes the job’s characteristics. Then Condor finds suitable resources to allocate for the job. Note that Condor and Hadoop are trying to solve things in independent ways--with the result that they overlap in some ways, while doing unrelated things in some cases.
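The "submit a job with metadata, let the scheduler find suitable resources" idea can be sketched as follows. This is a deliberately simplified Python illustration and not Condor's matchmaking mechanism; the `Machine` and `Job` classes, their fields, and the first-fit policy in `schedule` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    cpus: int
    memory_mb: int
    busy: bool = False


@dataclass
class Job:
    name: str
    min_cpus: int        # metadata describing what the job needs
    min_memory_mb: int


def schedule(jobs, machines):
    """Assign each job to the first idle machine that satisfies its metadata."""
    placements = {}
    for job in jobs:
        placements[job.name] = None  # stays None if nothing suitable is idle
        for machine in machines:
            suitable = (machine.cpus >= job.min_cpus
                        and machine.memory_mb >= job.min_memory_mb)
            if suitable and not machine.busy:
                machine.busy = True
                placements[job.name] = machine.name
                break
    return placements
```

A job that cannot be placed simply stays unassigned until a suitable machine frees up, which is where launching additional EC2 instances on demand becomes attractive.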

This week I attended Condor Week at the University of Wisconsin in Madison. Condor Week is an annual event that gives Condor collaborators and users the chance to exchange ideas and experiences, to learn about the latest research, to experience live demos, and to influence the project's short- and long-term research and development directions.

If you are interested in large-scale grid computing, this approach is worth a serious look. There are two active projects that implement Condor on Amazon EC2, and of course that’s why this blog entry is being posted.

Cycle Computing offers Amazon EC2 plus Condor as an integrated platform, in addition to supporting other underlying computing resources. Their software automates Condor grid management, including monitoring, configuration, version control, usage tracking, and more. At the conference Jason Stowe from Cycle Computing made a very strong case for using Amazon EC2 instead of a traditional grid environment. Jason’s presentation is available for download at http://www.cs.wisc.edu/condor/CondorWeek2008/condor_presentations/stowe_cycle.pdf.

Red Hat’s approach integrates EC2 directly into the Condor code base. The result is that an Amazon EC2 instance is the “Condor Job”, and in that manner they are able to manage the entire life cycle of an EC2 instance. In some cases the entire Condor pool is running on EC2, and in other cases EC2 augments an existing pool. All of this work was done through a collaboration between the University of Wisconsin (Jaeyoung Yoon, Fang Cao, and Jaime Frey) and Matt Farrellee from Red Hat. They plan to integrate Amazon S3 as a storage medium in the near future.

One thing seems certain: on-demand virtualization brightens the lights in Grid Computing City, because organizations that could not afford a grid suddenly find themselves with both affordable infrastructure and powerful tools to manage their new-found resources.

-- Mike
