


Listed below are links to weblogs that reference New AWS Public Data Sets - Economics, DBpedia, Freebase, and Wikipedia:




Quarter of a petabyte? Don't you mean quarter of a terabyte?

Deva Rajan

How often are these databases updated by AWS?
Or is this a single point-in-time snapshot, with the user responsible for updating them from the original sources after the initial download to an EBS volume?



Thank you for offering these datasets. They have the potential to make a huge impact.

I recommend that there be public documentation on how often you plan to update these files. I asked how often they will be updated in the last blog post on this subject and also in the EC2 forum and have not received an answer.

I'll have to assume that you did not update the Bureau of Labor data. Given the decline in employment since the November dataset (a decline of ~600,000 employees in each of Nov, Dec, and Jan, or a 0.8% increase in the unemployment rate), it is a disservice to even offer this data if it will not be updated. I'm not using the data, but if I were, I would have to download the files myself since they are not updated. Suppose a student is using this data: will they really get passing grades if their data ignores the most recent huge decline in employment?

Michael E Driscoll

Jeff - As a former NCBI employee, it's great to see GenBank in the mix here. I've got lots of questions about how the data is made available -- for NCBI, is it ASCII GenBank files, or ASN.1 (their underlying format)?

One of the most valuable data sets that NCBI maintains is PubMed abstracts -- text abstracts for every article published in the life sciences for the last decade. It's data that isn't available anywhere else, and I hope it might be considered for inclusion.

I look forward to seeing some How-To's and working examples of using these data sets!


Account Deleted

Wouldn't it be great if all the medical record formats were also available as a public data set? S3 is the perfect place for a medical records hub, and it would support Obama's plan to automate medical records.

The comments to this entry are closed.
