We want to make it very easy for you to store any amount of semistructured data and to read, write, and modify it quickly, efficiently, and with predictable performance. We don't want you to have to worry about servers, disks, replication, failover, monitoring, installing, configuring, or updating software, hardware upgrades, network bandwidth, free space, sharding, rearchitecting, or a host of other things that will jump up and bite you at the worst possible time.
We want you to think big, to dream big dreams, and to envision (and then build) data-intensive applications that can scale from zero users up to tens or hundreds of millions of users before you know it. We want you to succeed, and we don't want your database to get in the way. Focus on your app and on building a user base, and leave the driving to us.
Sound good?
Hello, DynamoDB
Today we are introducing Amazon DynamoDB, our Internet-scale NoSQL database service. Built from the ground up to be efficient, scalable, and highly reliable, DynamoDB will let you store as much data as you want and access it as often as you'd like, with predictable performance brought on by the use of Solid State Disks, better known as SSDs.
DynamoDB works on the basis of provisioned throughput. When you create a DynamoDB table, you simply tell us how much read and write throughput you need. Behind the scenes we'll set things up so that we can meet your needs, while maintaining latency that's in the single-digit milliseconds. Later, if your needs change, you can simply turn the provisioned throughput dial up (or down) and we'll adjust accordingly. You can do this online, with no downtime and with no impact on the overall throughput. In other words, you can scale up even when your database is handling requests.
We've made DynamoDB ridiculously easy to use. Newly created tables will usually be ready to use within a minute or two. Once the table is ready, you simply start storing data (as much as you want) into it, paying only for the storage that you use (there's no need to pre-provision storage). Again, behind the scenes, we'll take care of provisioning adequate storage for you.
Each table must have a primary index. In this release, you can choose between two types of primary keys: Simple Hash Keys and Composite Hash Keys with Range Keys.
- Simple Hash Keys give DynamoDB the Distributed Hash Table abstraction and are used to index on a unique key. The key is hashed over multiple processing and storage partitions to optimally distribute the workload.
- Composite Hash Keys with Range Keys give you the ability to create a primary key that is composed of two attributes -- a hash attribute and a range attribute. When you query against this type of key, the hash attribute must be uniquely matched but a range (low to high) can be specified for the range attribute. You can use this to run queries such as "all orders from Jeff in the last 24 hours."
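The two key types can be pictured with a small in-memory model. The sketch below is written in Python so that it runs without any SDK, and it is only an illustration, not a description of how DynamoDB is built: the `ToyTable` class, the partition count, and the MD5-based partition choice are all assumptions made for the example. It shows a hash attribute selecting a partition and a range attribute kept in sorted order, so that a query such as "all orders from Jeff in the last 24 hours" becomes one exact match plus a range lookup.

```python
import bisect
import hashlib
from collections import defaultdict


class ToyTable:
    """Illustrative model of a hash-and-range keyed table (not DynamoDB's real design)."""

    def __init__(self, partitions=4):
        # Each partition maps hash key -> list of (range key, item), sorted by range key.
        self.partitions = [defaultdict(list) for _ in range(partitions)]

    def _partition_for(self, hash_key):
        # Hash the key so that items spread across partitions (assumed scheme).
        digest = hashlib.md5(hash_key.encode("utf-8")).hexdigest()
        return self.partitions[int(digest, 16) % len(self.partitions)]

    def put(self, hash_key, range_key, item):
        rows = self._partition_for(hash_key)[hash_key]
        keys = [k for k, _ in rows]
        idx = bisect.bisect_left(keys, range_key)
        if idx < len(rows) and rows[idx][0] == range_key:
            rows[idx] = (range_key, item)  # same primary key: overwrite
        else:
            rows.insert(idx, (range_key, item))

    def query(self, hash_key, low, high):
        # Exact match on the hash attribute, inclusive range on the range attribute.
        rows = self._partition_for(hash_key).get(hash_key, [])
        keys = [k for k, _ in rows]
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return [item for _, item in rows[start:end]]


now = 1_000_000  # pretend "current time" in seconds
table = ToyTable()
table.put("Jeff", now - 90_000, {"order": "A"})  # 25 hours ago
table.put("Jeff", now - 3_600, {"order": "B"})   # 1 hour ago
table.put("Jeff", now - 60, {"order": "C"})      # 1 minute ago
table.put("Anna", now - 120, {"order": "D"})

recent = table.query("Jeff", now - 24 * 3600, now)
print(recent)
```

Only the rows stored under the matching hash key are examined, which is why the hash attribute must be matched exactly while the range attribute can span low to high.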
Each item in a DynamoDB table consists of a set of key/value pairs. Each value can be a string, a number, a string set, or a number set. When you choose to retrieve (get) an item, you can choose between a strongly consistent read and an eventually consistent read based on your needs. The eventually consistent reads consume half as many resources, so there's a throughput consideration to think about.
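To make the throughput consideration concrete, here is a small Python sketch of the arithmetic. It assumes only what is stated above, namely that an eventually consistent read consumes half as many resources as a strongly consistent one. The function name and the choice to express capacity in strongly consistent reads per second are illustrative, not part of any SDK.

```python
def max_reads_per_second(provisioned_reads, eventually_consistent=False):
    """Reads per second a table can serve at a given provisioned read level.

    provisioned_reads is expressed in strongly consistent reads per second.
    Eventually consistent reads are assumed to cost half as much, so twice
    as many fit into the same provisioned level.
    """
    if provisioned_reads < 0:
        raise ValueError("provisioned_reads must be non-negative")
    return provisioned_reads * 2 if eventually_consistent else provisioned_reads


print(max_reads_per_second(10))                              # strongly consistent
print(max_reads_per_second(10, eventually_consistent=True))  # eventually consistent
```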
Sounds great, you say, but what about reliability and data durability? Don't worry, we've got that covered too! When you create a DynamoDB table in a particular region, we'll synchronously replicate your data across servers in multiple zones. You'll never know about (or be affected by) hardware or facility failures. If something breaks, we'll get the data from another server.
I can't stress the operational performance of DynamoDB enough. You can start small (say 5 reads per second) and scale up to 50, 500, 5000, or even 50,000 reads per second. Again, online, and with no changes to your code. And (of course) you can do the same for writes. DynamoDB will grow with you, and it is not going to get between you and success.
As part of the AWS Free Usage Tier, you get 100 MB of free storage, 5 writes per second, and 10 strongly consistent reads per second (or 20 eventually consistent reads per second). Beyond that, pricing is based on how much throughput you provision and how much data you store. As is always the case with AWS, there's no charge for bandwidth between an EC2 instance and a DynamoDB table in the same Region.
You can create up to 256 tables, each provisioned for 10,000 reads and 10,000 writes per second. I cannot emphasize the next point strongly enough: We are ready, willing, and able to increase any of these values; simply click here and provide us with some additional information. Our early customers have, in several cases, already exceeded the default limits by an order of magnitude!
DynamoDB from the AWS Management Console
The AWS Management Console has a new DynamoDB tab. You can create a new table, provision the throughput, set up the index, and configure CloudWatch alarms with a few clicks:

You can enter your throughput requirements manually:

Or you can use the calculator embedded in the dialog:

You can easily set CloudWatch alarms that will fire when you are consuming more than a specified percentage of the throughput that you have provisioned for the table:

You can use the CloudWatch metrics to see when it is time to add additional read or write throughput:

You can easily increase or decrease the provisioned throughput:

Programming With DynamoDB
The AWS SDKs have been updated and now include complete support for DynamoDB. Here are some examples that I put together using the AWS SDK for PHP.
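The first step is to include the SDK and create a reference object:
// The path to the SDK is an assumption; adjust it to match your installation.
require_once('sdk.class.php');
$DDB = new AmazonDynamoDB();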
Creating a table requires three arguments: a table name, a key specification, and a throughput specification:
$Schema = array('HashKeyElement' =>
                array('AttributeName' => 'RecordId',
                      'AttributeType' => AmazonDynamoDB::TYPE_STRING));
$Throughput = array('ReadsPerSecond' => 5, 'WritesPerSecond' => 5);
$Res = $DDB->create_table(array('TableName'             => 'Sample',
                                'KeySchema'             => $Schema,
                                'ProvisionedThroughput' => $Throughput));
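After create_table returns, the table's status will be CREATING. It will transition to ACTIVE when the table is provisioned and ready to accept data. You can use the describe_table function to get the status and other information about the table:
// The response path (body->Table) is inferred from the output shown below.
$Res = $DDB->describe_table(array('TableName' => 'Sample'));
print_r($Res->body->Table);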
Here's the result as a PHP object:
(
    [CreationDateTime] => 1324673829.32
    [ItemCount] => 0
    [KeySchema] => CFSimpleXML Object
        (
            [HashKeyElement] => CFSimpleXML Object
                (
                    [AttributeName] => RecordId
                    [AttributeType] => S
                )
        )
    [ProvisionedThroughput] => CFSimpleXML Object
        (
            [ReadsPerSecond] => 5
            [WritesPerSecond] => 5
        )
    [TableName] => Sample
    [TableSizeBytes] => 0
    [TableStatus] => ACTIVE
)
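Because a new table starts in the CREATING state, code that creates a table usually waits for ACTIVE before writing to it. The following Python sketch shows one way to structure that wait. It does not call any AWS SDK: `get_status` is a hypothetical callable that you would implement with your SDK's describe-table call, and the timeout and polling interval are arbitrary example values.

```python
import time


def wait_until_active(get_status, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll get_status() until it returns "ACTIVE" or the timeout expires.

    get_status is a hypothetical zero-argument callable returning the table
    status string (for example "CREATING" or "ACTIVE").
    Returns True if the table became ACTIVE, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == "ACTIVE":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Usage with a stand-in status source that becomes ACTIVE on the third call.
statuses = iter(["CREATING", "CREATING", "ACTIVE"])
ready = wait_until_active(lambda: next(statuses), sleep=lambda seconds: None)
print(ready)
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays, as the stand-in usage shows.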
It is really easy to insert new items. You need to specify the data type of each attribute value; here's how you do that (the other data type constants are TYPE_ARRAY_OF_STRINGS and TYPE_ARRAY_OF_NUMBERS):
// The loop bounds are illustrative (assumed); any range of integers works.
for ($i = 1; $i < 100; $i++)
{
    print($i);
    $Item = array('RecordId' => array(AmazonDynamoDB::TYPE_STRING => (string) $i),
                  'Square'   => array(AmazonDynamoDB::TYPE_NUMBER => (string) ($i * $i)));
    $Res = $DDB->put_item(array('TableName' => 'Sample', 'Item' => $Item));
}
Retrieval by the RecordId key is equally easy:
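// The loop bounds are illustrative (assumed); use the same range as the inserts.
for ($i = 1; $i < 100; $i++)
{
    $Key = array('HashKeyElement' => array(AmazonDynamoDB::TYPE_STRING => (string) $i));
    $Item = $DDB->get_item(array('TableName' => 'Sample',
                                 'Key'       => $Key));
    print_r($Item->body->Item);
}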
Each returned item looks like this as a PHP object:
(
    [RecordId] => CFSimpleXML Object
        (
            [S] => 44
        )
    [Square] => CFSimpleXML Object
        (
            [N] => 1936
        )
)
The DynamoDB API also includes query and scan functions. The query function queries primary key attribute values and supports the use of comparison operators. The scan function scans the entire table with optional filtering of the results of the scan. Queries are generally more efficient than scans.
You can also update items, retrieve multiple items, delete items, or delete multiple items. DynamoDB includes conditional updates (to ensure that some other write hasn't occurred within a read/modify/write operation) as well as atomic increment and decrement operations. Read more in the Amazon DynamoDB Developer Guide.
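The read/modify/write problem that conditional updates address can be shown with a toy in-memory store. This Python sketch only illustrates the idea; `ToyStore`, `put_if`, and `add` are made-up names, not DynamoDB API calls. A conditional write succeeds only if the attribute still holds the value the caller read earlier, and an atomic increment changes a number without a separate read.

```python
import threading


class ToyStore:
    """Illustrative key/value store with conditional writes (not the DynamoDB API)."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._items.get(key)

    def put_if(self, key, new_value, expected):
        # Write only if the stored value still equals what the caller read.
        with self._lock:
            if self._items.get(key) != expected:
                return False
            self._items[key] = new_value
            return True

    def add(self, key, delta):
        # Atomic increment (or decrement, with a negative delta).
        with self._lock:
            self._items[key] = self._items.get(key, 0) + delta
            return self._items[key]


store = ToyStore()
store.put_if("stock", 10, expected=None)              # create: nothing stored yet

seen = store.get("stock")                             # writer A reads the value
store.put_if("stock", 7, expected=10)                 # writer B updates first
ok = store.put_if("stock", seen - 1, expected=seen)   # writer A's stale write
print(ok, store.get("stock"))

first = store.add("views", 1)
second = store.add("views", 1)
print(first, second)
```

Writer A's update is rejected because the value changed after it was read, so A can re-read and retry instead of silently overwriting B's change.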
And there you have it, our first big release of 2012. I would enjoy hearing more about how you plan to put DynamoDB to use in your application. Please feel free to leave a comment on the blog.
-- Jeff;


Interesting that you have not mentioned SimpleDB, as there seems to be some overlap.
Posted by: ChrisHF | January 18, 2012 at 10:02 AM
Thanks for this writeup! I'm having trouble understanding how to create a unique attribute id for use as the hash key.
For example, let's say I'm storing comments. Coming from the SQL world, I just let the database auto-increment the unique CommentID. With Dynamo, how would I create a unique CommentID attribute for use as the hash key? Do I need to do that in the application code using php's uniqid function? That doesn't seem right...what am I missing? Thanks!
Posted by: Chris | January 18, 2012 at 10:36 AM
Well done! Do the growable read and write throughput parameters apply to consistent read and conditional writes as well?
Posted by: Phil Smith | January 18, 2012 at 11:08 AM
Phil - The throughput parameters apply to all reads and all writes.
Posted by: Jeff Barr | January 18, 2012 at 11:10 AM
Well SimpleDB was always in beta so I presume this is the successor to it.
In terms of SDKs the documentation doesn't mention Ruby. Do we have a timeline for this??
Posted by: Alexander Dimitriyadi | January 18, 2012 at 11:28 AM
Alexander, the AWS SDK for Ruby (http://aws.amazon.com/sdkforruby/) supports DynamoDB!
Posted by: Jeff Barr | January 18, 2012 at 11:29 AM
Ahh! there's a cool Christmas present! Except it's not available in my region. Aww. :(
I didn't even bother touching SimpleDB in the past because of the 10GB domain limit thing. However, the unlimited and seamless growth of this new service is definitely appealing.
Thanks,
JB
Posted by: JB | January 18, 2012 at 11:34 AM
This looks excellent. When can we have it in the EU region? :)
Best,
Ismael
Posted by: Ismael Juma | January 18, 2012 at 11:38 AM
I hope boto gets support soon!
Posted by: Rossk | January 18, 2012 at 12:13 PM
I, for one, am really excited about this announcement. I was dreading having to deal with MongoDB replica sets and all that noise. This is a good fit for a couple of our apps. Keep 'em coming.
Posted by: Ruben Orduz | January 18, 2012 at 12:14 PM
Chris, there's no simple answer to this question. If you used PHP's uniqid function, you would have to add further randomness to ensure that inserting two comments in close succession, and/or from two different application servers, doesn't cause a collision.
You will, of course, want to choose a key that allows you to retrieve the comments later.
Posted by: Jeff Barr | January 18, 2012 at 12:26 PM
Does the free tier apply to every table created? Or is it 10 read units and 5 write units for the entire customer account?
Posted by: GK | January 18, 2012 at 01:01 PM
GK, those units are really per table. The DynamoDB detail page calls it out as follows:
"DynamoDB customers get 100 MB of free storage, as well 5 writes/second and 10 reads/second of ongoing throughput capacity."
Posted by: Jeff Barr | January 18, 2012 at 01:16 PM
Is DynamoDB open source or are there any plans to open source it?
Posted by: Ramon Salvadó | January 19, 2012 at 12:26 AM
Ramon - It is not open source, it is a web service. As far as I know there are no plans to open source it.
Posted by: Jeff Barr | January 19, 2012 at 10:00 AM
Rossk, support for DynamoDB in boto is currently in beta:
http://groups.google.com/group/boto-users/browse_thread/thread/ee30616da2d7d0a8
Posted by: Rayson Ho (Open Grid Scheduler) | January 19, 2012 at 11:14 AM
The DynamoDB looks great. Data storage management is something I want someone else to handle while I concentrate on the application. That said, the throughput pricing makes the initial storage costs costly on DynamoDB.
If you have 7 tables to start with, you must pay for 5 Reads and Writes per table. That is 35 Reads and 35 Writes. 10 Reads costs $0.01/hour and 5 Writes cost $0.01 per hour. Which makes it kinda costly since a baby application will not need that 5 Writes per second.
A lower throughput option (say 2) would be so much better for experimental development when you do not have clients paying you. Well you could argue to put all data in one table, but that becomes ugly, we lose even basic data schema. I am still trying to figure out a detailed pricing for our application. Your thoughts are welcome.
http://blog.brainless.in/2012/01/day-one-of-amazon-dynamodb.html
Posted by: Sumit Datta | January 19, 2012 at 09:12 PM
Sorry about my earlier comment. I had mentioned wrong Read and Write throughput rates. The actual rates are:
* Write Throughput: $0.01 per hour for every 10 units of Write Capacity
* Read Throughput: $0.01 per hour for every 50 units of Read Capacity
This pricing does look much better for experimental stuff. There are many libraries popping up on GitHub already. There will be frameworks supporting DynamoDB in a couple of months I guess. It seems to be an exciting way ahead. Congrats again!
Posted by: Sumit Datta | January 20, 2012 at 06:27 PM
Hi, As per the FAQs:
{{ How long does it take to change the provisioned throughput level of a table?
In general, decreases in throughput will take anywhere from a few seconds to a few minutes, while increases in throughput will typically take anywhere from a few minutes to a few hours. We strongly recommend that you do not try and schedule increases in throughput to occur at almost the same time when that extra throughput is needed. We recommend provisioning throughput capacity sufficiently far in advance to ensure that it is there when you need it. }}
But someone here (http://ow.ly/8BcJg) mentions that it takes "about 1.5 minutes per GB when scaling up." And that implies, it could take days to scale up a database that's TBs in size?! Is it true? Please clarify.
Thanks.
Posted by: Aahan Krish | January 20, 2012 at 06:30 PM
Aahan - The scale up factor that you quoted is not correct. We responded to this as follows in the DynamoDB forum (https://forums.aws.amazon.com/thread.jspa?threadID=85413):
"The overall time is not linear as the Google Groups poster suggests. In most cases it will be between a few minutes to a few hours regardless of total size. Larger data sets may take a bit longer than smaller data sets simply because there is often more data movement to perform and coordination to be made across a greater number of machines. Rest assured though, we make use of parallelism where we can so the curve is far from linear."
Posted by: Jeff Barr | January 21, 2012 at 08:00 AM
Hi
This is a very good tutorial and it helped me a lot.
I need a little help. When I tried to create a table, I got the following response in the body section:
UnrecognizedClientException
Could you please tell me more about this exception?
Regards
Posted by: Akbar Ali Butt | February 06, 2012 at 04:23 AM
Akbar, try the DynamoDB forum at https://forums.aws.amazon.com/forum.jspa?forumID=131&start=0 .
Posted by: Jeff Barr | February 06, 2012 at 05:28 AM