We're starting to wire various AWS services to each other, with interesting and powerful results. Today I'd like to talk to you about a brand new connection between Amazon S3 and the Amazon Simple Notification Service.
When I introduced you to SNS earlier this year I noted that "SNS is also integrated with other AWS services" and said that you could arrange to deliver notifications to an SQS message queue.
We're now ready to take that integration to a new level. Various parts of AWS will now start to publish messages to an SNS topic to let your application know that a certain type of event has occurred. The first such integration is with Amazon S3, and more specifically, with S3's new Reduced Redundancy Storage option.
You can now configure any of your S3 buckets to publish a message to an SNS topic of your creation (permissions permitting) when S3 detects that it has lost an object that was stored in the bucket using the RRS option. Your application can subscribe to the topic and (when the event is triggered) respond by regenerating the object and storing it back in S3. The message will include the event, a timestamp, the name of the bucket, the object's key and version ID, and some internal identifiers.
Let's say that you are using S3 to store an original image and some derived images. You would use the STANDARD storage class for the original image and the REDUCED_REDUNDANCY storage class for the derived images. You would also need to store the information needed to regenerate a derived image from the original image. You could store this in SimpleDB or you could create a naming convention for your S3 object keys and then extract the needed information from the URL.
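As a rough sketch of the naming-convention approach, the snippet below encodes the transform in a key suffix and recovers it later. The `_w<width>` suffix, the function names, and the sample key are assumptions made for illustration, not a convention defined by S3.

```python
import re

# Hypothetical convention: "<stem>_w<width>.<ext>" marks a copy of
# "<stem>.<ext>" scaled to <width> pixels wide.
DERIVED_KEY = re.compile(r"^(?P<stem>.+)_w(?P<width>\d+)\.(?P<ext>[^./]+)$")


def derived_key(original_key, width):
    """Build the key for a scaled copy of original_key."""
    stem, dot, ext = original_key.rpartition(".")
    if not dot:
        raise ValueError("expected a key with a file extension")
    return f"{stem}_w{width}.{ext}"


def parse_derived_key(key):
    """Return (original_key, width), or None if key is not a derived key."""
    match = DERIVED_KEY.match(key)
    if match is None:
        return None
    original_key = f"{match['stem']}.{match['ext']}"
    return original_key, int(match["width"])
```

With this convention, `derived_key("people/jeff.jpg", 200)` yields `people/jeff_w200.jpg`, and parsing that key returns the original key together with the width.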
Consider this image:
It is the original image and would be stored with the STANDARD storage class. Derived images (scaled to a new size in this case) would use a suffix containing the needed information, and would be stored with REDUCED_REDUNDANCY:
A notification would be configured on the faces bucket and routed to a topic such as faces_web_app_errors. Your application need only await events on the topic and respond as follows:
- Confirm the event is of the expected type (s3:ReducedRedundancyLostObject)
- Extract the bucket and key name from the event
- Parse the key name to identify the key of the original object and the transform to be applied
- Fetch the original object
- Apply the transform (image scaling in this case)
- Store the derived object in S3 using the REDUCED_REDUNDANCY storage class
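The steps above might be sketched as follows. The message field names (`event`, `bucket`, `key`) are an assumed, simplified shape rather than the actual notification format, and `parse_key`, `fetch`, `transform`, and `store` are hypothetical callables that you would back with your own key convention, S3 client, and imaging code.

```python
LOST_OBJECT_EVENT = "s3:ReducedRedundancyLostObject"


def handle_lost_object(message, parse_key, fetch, transform, store):
    """Regenerate a lost derived object; return True if one was restored.

    message   -- dict with "event", "bucket", and "key" (assumed shape)
    parse_key -- key -> (original_key, params), or None if not derived
    fetch     -- (bucket, key) -> bytes of the original object
    transform -- (data, params) -> bytes of the derived object
    store     -- (bucket, key, data, storage_class) -> None
    """
    # 1. Confirm the event is of the expected type.
    if message.get("event") != LOST_OBJECT_EVENT:
        return False
    # 2. Extract the bucket and key name.
    bucket, key = message["bucket"], message["key"]
    # 3. Identify the original object and the transform to apply.
    parsed = parse_key(key)
    if parsed is None:
        return False  # Not a derived object, so it cannot be regenerated.
    original_key, params = parsed
    # 4-6. Fetch the original, apply the transform, store the result.
    original = fetch(bucket, original_key)
    derived = transform(original, params)
    store(bucket, key, derived, "REDUCED_REDUNDANCY")
    return True
```

Passing the storage and imaging operations in as arguments keeps the event logic easy to exercise without touching S3.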
Over time, we'll wire up additional events (for S3 and for other services) to SNS. You can prepare for this now by creating general purpose event handlers in your application, and by keeping your code properly factored so that it is easy to create an object when needed. For the case listed above, I would think about structuring my application so that the only way to create a derived object is in response to an event. I would then generate synthetic "lost" events and use them to materialize the derived objects for the first time.
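One way to picture this structure is a small dispatcher keyed by event type, fed by synthetic events for objects that do not exist yet. This is only a sketch: the `on` decorator, `dispatch`, `synthetic_lost_event`, and the message field names are all hypothetical, and the handler body is a stand-in for real regeneration code.

```python
HANDLERS = {}  # event type -> handler function


def on(event_type):
    """Decorator that registers a handler for one event type."""
    def register(func):
        HANDLERS[event_type] = func
        return func
    return register


def dispatch(message):
    """Route a message to its handler; ignore unknown event types."""
    handler = HANDLERS.get(message.get("event"))
    return handler(message) if handler else None


def synthetic_lost_event(bucket, key):
    """Build a message shaped like a lost-object event (assumed fields)."""
    return {"event": "s3:ReducedRedundancyLostObject",
            "bucket": bucket, "key": key}


@on("s3:ReducedRedundancyLostObject")
def regenerate(message):
    # Stand-in: a real handler would rebuild and store the object.
    return f"regenerate {message['bucket']}/{message['key']}"
```

Materializing a derived object for the first time then becomes `dispatch(synthetic_lost_event(bucket, key))`, which follows the same path a real loss would take.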
-- Jeff;


That's great!
We want EC2 Events next please :)
Regards,
Thomas
http://scalarium.com
Posted by: Thomas | July 15, 2010 at 12:34 AM
I second "EC2 events"! Might save AWS money to get rid of all the pollers out there, too :-)
Other events that would be really useful are an event per dollar spent per instance on bandwidth. Or EBS I/O operations (or even an API to get this information).
Thanks
Posted by: Tim Freeman | July 15, 2010 at 08:28 AM
Jeff, this is a really cool addition to the RRS service.
Can you comment on how soon after an object is really lost the SNS notification will be sent? I'm interested to know if that latency is small enough to practically ignore the possibility that someone else will request the object before I've been notified of its loss.
Posted by: Shlomo Swidler | July 21, 2010 at 03:10 PM
I would like to see a more full-blown events API for S3, along the lines of inotify (http://en.wikipedia.org/wiki/Inotify).
I'd like to know when any object in a given S3 bucket has been added, modified, deleted, etc.
Is this possible aside from polling?
Posted by: Dan Tenenbaum | September 27, 2010 at 10:09 PM