Listed below are links to weblogs that reference New AWS Public Data Sets - Economics, DBpedia, Freebase, and Wikipedia:

Comments

AaronSw

Quarter of a petabyte? Don't you mean quarter of a terabyte?

Deva Rajan

How often are these databases updated by AWS?
Or is this a single point-in-time snapshot, with the user responsible for updating them from the original sources after the initial download to an EBS volume?

Thanks!

iolaire

Thank you for offering these datasets. They have the potential to make a huge impact.

I recommend that there be public documentation on how often you plan to update these files. I asked how often they will be updated in the last blog post on this subject and also in the EC2 forum and have not received an answer.

I'll have to assume that you did not update the Bureau of Labor data. Given the decline in employment since the November dataset (a decline of ~600,000 employees in each of Nov, Dec, and Jan, or a 0.8% increase in the unemployment rate), it is a disservice to even offer this data if it will not be updated. I'm not using the data, but if I were, I would have to download the files myself since they are not updated. Suppose a student is using this data: will they really get passing grades if their data ignores the most recent huge decline in employment?

Michael E Driscoll

Jeff - As a former NCBI employee, it's great to see Genbank in the mix here. I've got lots of questions about how the data is available -- for NCBI, is it ASCII Genbank files? or in ASN.1 (their underlying format)?

One of the most valuable data sets that NCBI maintains is Pubmed abstracts -- text abstracts for every article published in the life sciences for the last decade. It's data that isn't available anywhere else, and I hope it might be considered for inclusion.

I look forward to seeing some How-To's and working examples of using these data sets!

MD

Trudy

Wouldn't it be great if all the medical record formats were also available as a public data set? S3 is the perfect place for a medical records hub and supports Obama's plan to automate medical records.
