[XML4Lib] OAIster reaches 10 million records

Perry Willett pwillett at umich.edu
Thu Jan 25 12:55:41 EST 2007

ANN ARBOR, Mich. - OAIster Reaches 10 Million Records. 
We live in an information-driven world, one in which access to good 
information defines success. OAIster's growth to 10 million records 
takes us one step closer to that goal.

Developed at the University of Michigan's Library, OAIster is a 
collection of digital scholarly resources. OAIster is also a service 
that continually gathers these digital resources to remain complete 
and fresh. As global digital repositories grow, so do OAIster's holdings.

Popular search engines don't have the holdings OAIster does. They
crawl web pages and index the words on those pages. It's an 
outstanding technique for fast, broad information from public 
websites. But scholarly information, the kind researchers use to 
enrich their work, is generally hidden from these search engines.

OAIster retrieves these otherwise elusive resources by tapping 
directly into the collections of a variety of institutions using 
harvesting technology based on the Open Archives Initiative (OAI) 
Protocol for Metadata Harvesting (OAI-PMH). These resources can be 
images, academic papers, movies and audio files, technical reports, 
and books, as well as preprints (unpublished works that have not yet 
been peer reviewed). 
By aggregating these resources, OAIster makes it possible to search 
across all of them and return the results of a thorough investigation 
of complete, up-to-date resources.
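
The harvesting step described above can be sketched in Python. This is a 
minimal illustration of an OAI-PMH ListRecords loop, not OAIster's actual 
harvester. The function names (build_list_records_url, parse_list_records, 
harvest) and the fetch callable are assumptions made for this example; the 
verb, metadataPrefix, and resumptionToken arguments and the XML namespaces 
come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository's base URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) for one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        titles = [t.text for t in record.iter(DC_NS + "title")]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    # A missing or empty token means the list is complete.
    return records, (token or "").strip() or None


def harvest(base_url, fetch):
    """Yield every record from one repository, following resumption tokens.

    `fetch` is a hypothetical callable: it takes a URL and returns the
    response body as text, so the loop can be run without network access.
    """
    token = None
    while True:
        url = build_list_records_url(base_url, resumption_token=token)
        records, token = parse_list_records(fetch(url))
        yield from records
        if token is None:
            break
```

Passing fetch in as a parameter keeps the sketch testable offline. For a 
live repository, a function that wraps urllib.request.urlopen and decodes 
the response body would serve. An aggregator would run this loop once per 
contributing repository and index the combined records.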

Ann Devenish, Publication Services Project Manager at Woods Hole 
Oceanographic Institution, notes that "Harvesting by OAIster is a 
primary 'selling point' when we talk to scientists and researchers 
about the visibility, accessibility, and impact of their 
contributions in an institutional repository. From their own 
experiences they know that a search using one of the popular search 
engines can bring back thousands (if not millions) of results which 
will require careful and time-consuming screening, with no guarantee 
that they will ever get to the content they seek. A search of 
OAIster, across hundreds of open and scholarly archives and millions 
of records, brings back results with the key metadata elements that 
allow for quick identification of, and easy navigation to, the 
content they seek."

OAIster is good news for the digital archives that contribute 
material to open-access repositories. "[OAIster has demonstrated 
that]...OAI interoperability can scale. This is good news for the 
technology, since the proliferation is bound to continue and even 
accelerate," says Peter Suber, author of the SPARC Open Access 
Newsletter. As open-access repositories proliferate, they will be 
supported by a single, well-managed, comprehensive, and useful tool.

Scholars will find that searching in OAIster can provide better 
results than searching in web search engines. Roy Tennant, User
Services Architect at the California Digital Library, offers an
example: "In OAIster I searched 'roma' and 'world war,' then sorted
by weighted relevance. The first hit nailed my topic-- the
persecution of the Roma in World War II. Trying 'roma world war'
in Google fails miserably because Google apparently searches 'Rome'
as well as 'Roma.' The ranking then makes anything about the Roma
people drop significantly, and there is nothing in the first few
screens of results that includes the word in the title, unlike the
OAIster hit."

OAIster currently harvests 730 repositories from 49 countries on 6 
continents. In three years, it has more than quadrupled in size, 
growing from 6.2 million to 10 million records in the past year. 
OAIster is a project of the University of Michigan Digital Library 
Production Service.

For more information about the University of Michigan's OAIster project, 
visit http://www.oaister.org/, or contact Kat Hagedorn at 
khage at umich.edu.

Perry Willett
Head, Digital Library Production Service
300 Hatcher North
University of Michigan
Ann Arbor MI 48109-1205
Ph: 734-764-8074
Fax: 734-647-6897
Email: pwillett at umich.edu
