Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant
Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.
The Role of Open Source Software
The open source software (OSS) movement has garnered headlines in a variety of computing journals, and Linux -- an open source operating system -- has even been touted as a competitor to Microsoft Windows. Also, in articles and speeches, librarians such as Yale's Daniel Chudnov have exhorted colleagues to embrace OSS. So what is all the fuss about?

To understand OSS, you must first understand how software is created. Programs are written in a "high-level" language -- easy for humans to understand but not machines. For a computer to run a program, the program must first be translated into "machine code," which lays out the steps the computer must execute. Some programming languages, such as Perl, are interpreted into machine code at the moment they are executed. Other languages must be compiled into binary or machine form before being executed. The compiling process also renders the program effectively unreadable (and unchangeable) by people. To alter a compiled program, you must return to the source, or high-level version of it.

Open source software, then, is freely distributed in uncompiled form and can be easily read and altered. Commercial software is typically distributed only in compiled form, thereby preventing people from seeing how it was created. Distributing the source code of programs, then, is something of a revolutionary act, or at the very least an altruistic one.

For a more complete definition of open source software, see http://www.opensource.org. For more information on the OSS movement and librarians, see Chudnov's article "Open Source Software: The Future of Library Systems?"

OSS and digital libraries

Prototyping. Developing digital library collections and services often means creating new kinds of tools and services. Prototypes are an important part of the development process. Open source software contributes to prototype development by being free, as well as alterable to different specifications.
This enables digital library developers to prototype new systems for very little up-front cost, which helps persuade funding sources to back a project.

Production services. Open source software can also be used to run production digital library collections and services, many of which already depend on it. The popular Apache web server is but one example. Beginning with the National Center for Supercomputing Applications web server (NCSA also produced Mosaic, the first widely used graphical web browser), the Apache project allowed programmers everywhere to embellish and enhance that free software package.

Cooperative software development. The OSS model allows digital library developers to codevelop software solutions easily and openly. Commercial companies, by contrast, must by necessity be secretive about their projects. Tools and techniques developed by the open source movement (such as version control software) readily support collaboration among a diverse group of developers who may be physically far from one another.

Cost. As OSS is distributed for free, if a particular application turns out not to solve your problem, you have wasted only your time. Also, since cost is not a factor, it is easy to install and test software before committing to use it for a particular project.

Projects. The most ambitious library-based OSS development project is the Open Source Digital Library System, an effort to create a library automation system from scratch. As one might expect from a volunteer project that began less than a year ago, it has not progressed very far -- but that could change.

Most OSS projects are not nearly as ambitious. But small applications can be combined with other applications to create full-featured services.
For example, some services on the Berkeley Digital Library SunSITE, including our very popular indexes to Internet resources, Librarians' Index to the Internet and KidsClick!, are based on an open source web site indexing package (SWISH-E) and in-house Perl scripts. We also use Hypermail, an application that receives incoming mail and creates a web-accessible archive, together with SWISH-E to provide full-featured browsing and searching of the large electronic discussion lists Web4Lib and PubLib.

But unless the OSS application is well developed and stand-alone, as the Apache web server is, use of OSS will mostly occur in large libraries (of all types), which are more likely to have staff who can install and maintain the software.

Resources. The best resource for library-specific OSS is the web site and listserv hosted by Chudnov, "oss4lib." More general resources are available from http://www.opensource.org and the open source site from O'Reilly & Associates, a computer book publisher.

LINK LIST

Apache Web Server http://www.apache.org/
Hypermail http://www.landfield.com/hypermail
Open Source Definition http://www.opensource.org/osd.html
Open Source Digital Library System http://osdls.library.arizona.edu/
"Open Source Software: The Future of Library Systems?" LJ 8/99, p. 40-43
O'Reilly & Associates http://opensource.oreilly.com
SWISH-E http://sunsite.berkeley.edu/SWISH-E/