[XML4Lib] JOB: Web Archive Programmer, CDL

Roy Tennant roy.tennant at ucop.edu
Tue Jun 21 11:48:08 EDT 2005

Web Archive Programmer
California Digital Library
Oakland, CA
Job # 2323-05
CLOSE DATE:  June 30, 2005
3-year contract position

RESPONSIBILITIES:  The Web Archive Programmer will work on a 
national project developing capacity to capture and preserve web sites. 
The focus will be on identifying, analyzing, reporting on, and 
developing tools for the harvest and analysis of web-based content: 
crawlers, creation of digital objects from the crawled sites, 
analysis, and presentation (viewing) of harvested content. This 
includes recommending the purchase and/or modification of third-party 
software or the development of new software using the Java, Perl and/or 
Python programming languages.

• Bachelor's degree or equivalent in an appropriate area such as library 
and information science or computer science.
• 3 years' experience developing software in production environments 
and working with web protocols (HTTP, SOAP, LDAP, etc.) and common web 
formats (HTML, PDF, GIF, etc.).
• Experience working and communicating with diverse staff, including 
technical and non-technical teams.
• High-level proficiency and at least 2 years' experience in the Java, 
Perl and/or Python programming languages.
• Proficiency in XML, XSLT and CSS.
• Demonstrated ability to review, assess, and communicate findings 
related to software evaluation (evaluate reasonable alternatives, 
translate findings into recommended changes, actions or strategies).
• Excellent analytical, written and oral communication skills.
• Demonstrated ability to track, organize and prioritize workload and 
request resources and information needed to do the job.
• Demonstrated flexibility in accommodating changing priorities.

• Knowledge of the Internet Archive’s web archive format, and knowledge 
of web harvesters (e.g. HTTrack, Heritrix, Curl).
• Knowledge of web browser plug-in and Mozilla/Firefox programming.
• Knowledge of digital repository architectures (e.g. OAIS) and 
association of content with persistent identifiers (e.g. PURLs, ARKs).
• Knowledge of digital library standards for description, transmission 
and access (e.g., METS, OAI, HTML, Dublin Core, TEI, SGML, MARC).

ABOUT:  The California Digital Library (CDL) is the 11th university 
library of the University of California. It was established in 1997 to 
build the university’s digital library, to encourage campus libraries 
to share their resources and holdings more effectively, and to provide 
leadership in the application of information technology to the 
development of UC’s library collections and services. Its mission is to 
harness technology and innovation, and leverage the intellectual and 
cultural resources of the University of California to support the 
assembly and creative use of the world’s scholarship and knowledge for 
the University of California Libraries and the communities they serve.

To apply, go to:
Search for job # 2323-05

For further information about the CDL:
