New York Times TimesMachine

Derek Gottfrid and his colleagues at the New York Times have obviously been having a lot of fun with Amazon EC2.

Their latest offering is the TimesMachine. Print subscribers can access any issue of the New York Times, dating back to Volume 1, Number 1 in 1851. Non-subscribers can take a peek at 6 different (and historically significant) issues, including the inaugural edition, the end of World War I, and the sinking of the Titanic.

As they explained in their blog post, they used EC2, Hadoop, and some of their own code to convert 405,000 large TIFF images, 3.3 million SGML files, and 405,000 XML files to 810,000 PNG images and 405,000 JavaScript files. This didn't take all that long:

"By leveraging the power of AWS and Hadoop, we were able to utilize hundreds of machines concurrently and process all the data in less than 36 hours."
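To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the counts and the 36-hour bound quoted above; the variable and function names are my own, and the ratios are plain arithmetic rather than anything the Times has described about how the job was structured. Because the run took "less than 36 hours," the hourly figure is a lower bound on the average rate.

```python
# Counts quoted in the post; the names are illustrative, not from the Times' code.
tiff_inputs = 405_000
sgml_inputs = 3_300_000
xml_inputs = 405_000
png_outputs = 810_000
js_outputs = 405_000
max_hours = 36  # "less than 36 hours", so treat this as an upper bound on run time


def pngs_per_tiff() -> float:
    """Ratio of PNG outputs to TIFF inputs (arithmetic only, not a claim about the pipeline)."""
    return png_outputs / tiff_inputs


def min_input_files_per_hour() -> float:
    """Lower bound on the average number of input files handled per hour."""
    total_inputs = tiff_inputs + sgml_inputs + xml_inputs
    return total_inputs / max_hours


print(pngs_per_tiff())                    # 2.0
print(round(min_input_files_per_hour()))  # 114167
```

In other words, the output works out to two PNGs for every TIFF, and the cluster as a whole averaged at least roughly 114,000 input files per hour.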

The content itself is really interesting, but I also enjoyed the fact that it was possible to see the articles in the context of the other issues of the day. The advertising is also interesting.

Robert Scoble has more coverage, including a video interview with Derek.

-- Jeff;

Comments

While it would be no trivial task, can you imagine how completely awesome it would be if the New York Times could now convert their consolidated set of information back into a format that they could store within the content management service that drives their site? That would be one incredible source of historic information.
