The new version of ScienceResearch.com was officially launched today. Earlier this year I spoke with company president Abe Lederman and learned that this deep web science search engine provides a single point of access to over 400 publicly accessible science and technology collections.
The new version is hosted on EC2 and includes advanced search features such as relevance ranking, clustering (by topic, author, publication, or date), and export of search results in popular citation formats. The site indexes content in 15 topic areas, including agricultural sciences, astronomy & space, biology & nature, chemistry, computers & technology, defense technologies, earth & environmental sciences, energy, health & medicine, materials science, physics, and mathematics. It also indexes patents and science news.
Abe will be presenting a paper titled "Journey to Ten Thousand Sources" at the SLA (Special Libraries Association) conference this week. The paper describes the architecture of this highly scalable federated search engine, including details of the ways in which queries are partitioned into batches and then spread across multiple crawlers.
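
To make the batching idea a little more concrete, here is a minimal Python sketch of one way a federated query could be partitioned into batches of sources and spread across several workers. It is only an illustration under my own assumptions, not the design described in the paper: the names `partition`, `fake_crawler`, and `federated_search` are hypothetical, the "crawler" is a stand-in that makes no network calls, and a thread pool is simply the easiest way to show the fan-out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def partition(items: List[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def fake_crawler(query: str, sources: List[str]) -> Dict[str, str]:
    """Hypothetical stand-in for a crawler; it performs no network calls."""
    return {source: f"results for {query!r} from {source}" for source in sources}


def federated_search(
    query: str,
    sources: List[str],
    batch_size: int,
    crawler: Callable[[str, List[str]], Dict[str, str]] = fake_crawler,
    max_workers: int = 4,
) -> Dict[str, str]:
    """Send one query to every source, one batch of sources per worker task."""
    batches = partition(sources, batch_size)
    merged: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each batch is handled by one worker; partial results are merged here.
        for partial in pool.map(lambda batch: crawler(query, batch), batches):
            merged.update(partial)
    return merged


if __name__ == "__main__":
    hits = federated_search("graphene", ["source-a", "source-b", "source-c"], batch_size=2)
    for source, summary in hits.items():
        print(source, "->", summary)
```

A real system would also need per-source timeouts, retries, and result ranking, none of which are shown here.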