Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant
Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.
MARC Must Die
When MARC was created, the Beatles were a hot new group and those of us alive at the time wore really embarrassing clothes and hairstyles. Computers were so large, complex, and expensive that it was ludicrous to think that you would one day have one in your home, let alone hold one in the palm of your hand. Although age by itself is not necessarily a sign of technological obsolescence (how much has the wooden pencil improved in the last 40 years?), when it comes to computer standards it is generally not a good thing.
The very nature of the MARC (machine-readable cataloging) record is, to some degree, an anachronism. It was developed in an age when memory, storage, and processing power were all rare and expensive commodities. Now they are ubiquitous and cheap.
Just look at a raw MARC record to see what I mean. But don't get too much of a headache. There are only two kinds of people who believe themselves able to read a MARC record without referring to a stack of manuals: a handful of our top catalogers and those on serious drugs. In MARC, fields are not explicitly labeled but coded with a numbering scheme that cannot be read by someone unfamiliar with the complicated syntax. But needless obfuscation, as annoying as it may be, is not the real nature of our emerging difficulty with MARC.
The problems with MARC are serious and extensive, which is why a number of us are increasingly convinced that MARC has outlived its usefulness. See, for example, Dick Miller's slides, 'XML and MARC: A Choice or Replacement?' 'The rigidity and internal irregularities of MARC are beginning to create problems for catalogers and users,' says David Fiander in the article 'Applying XML to the Bibliographic Description.' He continues, 'MARC is beginning to lag behind current research into bibliographic description standards.'
When I refer to MARC in this column, I am conflating several interrelated things. There are the MARC syntax, the MARC data elements, and the Anglo-American Cataloging Rules (AACR). These pieces are so intertwined that teasing out which must be jettisoned and which can be kept will be at least as difficult as starting from scratch. In next month's column, I will take a closer look at this issue by addressing MARC exit strategies. Meanwhile, let's examine some specific problems with this confluence of standards that I'm calling MARC.
Although MARC is a complex standard, it lacks essential checks and balances to ensure that appropriate granularity (how finely the individual elements are chopped) is achieved when coding a record. For example, the editor of a book should be encoded in a 700 field, with a $e subfield that specifies the person is the editor. But the $e subfield is frequently not encoded, thus leaving one to guess the role of the person encoded in the 700 field.
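The granularity problem can be made concrete with a small sketch. It models a 700 field as a plain Python dict, which is a simplification assumed here for illustration and not a real MARC parser; the helper name `role_of` and the sample name are hypothetical.

```python
# Simplified stand-in for a MARC field: a tag plus a dict of subfield codes.
# This structure is an assumption for illustration, not a real MARC library.

def role_of(field):
    """Return the role recorded in subfield $e, or None if it was never coded."""
    if field["tag"] != "700":
        raise ValueError("expected a 700 field")
    return field["subfields"].get("e")

# Editor coded as described above: name in $a, role in $e.
coded = {"tag": "700", "subfields": {"a": "Doe, Jane", "e": "editor"}}

# Same person, but the $e subfield was left out of the record.
uncoded = {"tag": "700", "subfields": {"a": "Doe, Jane"}}

print(role_of(coded))    # editor
print(role_of(uncoded))  # None
```

Nothing in the second record is invalid, which is the point: software receives a name with no indication of what that person did.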
In many cases, it is only by reference to the title field, of all things, that one can discover the identity of the editor. This can happen because MARC and AACR2 are largely focused on capturing the paper catalog card in computer form. In the title field you would think you would find only the title of the book. But there are strange appendages, such as 'edited and with an introduction by Peter Green' stuffed into one of a series of subfields. Peter Green's having edited the book should not be buried in a text string in a subfield of the title field. A more egregious example is the ambiguous encoding of respective parts of a personal name (last name, first name, etc.).
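To see why burying the editor in the title field is a problem, consider the sketch below. It assumes the statement sits in subfield $c of the 245 title field (the text above says only "one of a series of subfields"), and the dict layout, the sample title, the regular expression, and the function name `guess_editor` are all illustrative. Software has to pattern-match free text to recover what a labeled element would state outright.

```python
import re

# Simplified 245 (title) field; the dict layout is an assumption for illustration.
title_field = {
    "tag": "245",
    "subfields": {
        "a": "Collected essays /",  # hypothetical title
        "c": "edited and with an introduction by Peter Green.",
    },
}

def guess_editor(field):
    """Guess the editor's name from free text in $c; returns None on no match."""
    text = field["subfields"].get("c", "")
    # Fragile heuristic: look for 'edited ... by <Name>' and hope the
    # statement was phrased that way.
    match = re.search(
        r"\bedited\b.*?\bby\s+([A-Z][\w.'-]*(?:\s+[A-Z][\w.'-]*)*)", text
    )
    return match.group(1).rstrip(".") if match else None

print(guess_editor(title_field))  # Peter Green
```

The heuristic works for this one phrasing and fails silently for others (for instance, "selected by ..." returns `None`), which is why the role belongs in its own labeled element.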
Extensibility and language
Migrating our catalogs from printed cards to computers was a massive job, now largely completed. A number of libraries are now enriching those records with additional information, such as the table of contents. Although it is possible to smash the table of contents into a MARC record (see Blackwell's description of its Tables of Contents Enrichment Service for the gory details), it's not pretty. By its very nature, MARC is flat, whereas a table of contents is hierarchical. This would be a breeze in XML. (For more information on translating MARC into XML, see the XMLMARC web site.)
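A small sketch of the flat-versus-hierarchical contrast follows. The chapter titles, the XML element names (`contents`, `part`, `chapter`), and the " -- " separator are assumptions for illustration and not a published schema; the flat string only imitates how a table of contents gets squeezed into a single field.

```python
import xml.etree.ElementTree as ET

# Hypothetical table of contents: parts containing chapters.
toc = {
    "Part 1. Origins": ["The card catalog", "Early automation"],
    "Part 2. Futures": ["Markup languages"],
}

# Flat form: everything joined into one string, so the part/chapter
# nesting survives only as punctuation a program must guess at.
flat = " -- ".join(
    title for part, chapters in toc.items() for title in [part, *chapters]
)

# Hierarchical form: nesting is explicit in the element structure.
root = ET.Element("contents")
for part, chapters in toc.items():
    part_el = ET.SubElement(root, "part", title=part)
    for chapter in chapters:
        ET.SubElement(part_el, "chapter").text = chapter

print(flat)
print(ET.tostring(root, encoding="unicode"))

# With XML, "chapters of Part 2" is a simple path query.
part2 = [c.text for c in root.findall("./part[@title='Part 2. Futures']/chapter")]
print(part2)  # ['Markup languages']
```

The same question asked of the flat string would require knowing which segments are part headings and which are chapters, information the string itself does not carry.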
I cannot even imagine where in the MARC record we would put a book cover graphic (or book jacket information or reviews) in a way that would make this sort of information both easily available to those who need it and easily ignored by those who don't.
While MARC offers at least some facility for dealing with multiple scripts (such as a book title in Chinese and a translated or transliterated title), it handles them in such a way that it can be difficult for them to be properly processed by software. For example, relationships among related titles are problematic in MARC. More information about these problems can be found at the Moving from MARC to XML site.
MARC has always been an arcane standard. No other profession uses MARC or anything like it. When we shop around for software to handle such records, we are limited to the niche market of library vendors. For their part, vendors must design systems that can both take in and output records in MARC format.
Meanwhile, the wider information technology industry is moving wholesale to XML as a means to encode and transfer information. Such a movement doesn't mean we abandon our existing systems for any XML-aware search tool. But if we redesign our bibliographic record standard to use XML, the vendors will likely find it both easier and cheaper to produce the products we require.
The real reason
Libraries exist to serve the present and future needs of a community of users. To do this well, they need to use the very best that technology has to offer. With the advent of the web, XML, portable computing, and other technological advances, libraries can become flexible, responsive organizations that serve their users in exciting new ways. Or not. If libraries cling to outdated standards, they will find it increasingly difficult to serve their clients as they expect and deserve.
To create standards that are both adequate for present needs and flexible enough to offer new opportunities, we should begin with the requirements of bibliographic description (see Functional Requirements for Bibliographic Records, for example) and devise an encoding standard that provides power and flexibility. This is clearly a huge undertaking and one that will take commitment from organizations such as the Library of Congress and OCLC. We did it once over 30 years ago, and we can do it again. MARC may have been born in the Beatles era, but it is time to show it the long and winding road.
David Fiander, 'Applying XML to the Bibliographic Description,' Cataloging & Classification Quarterly, 33 (2) 2001, p. 17-28.
Functional Requirements for Bibliographic Records www.ifla.org/VII/s13/frbr/frbr.htm
MARC Standards www.loc.gov/marc
Moving from MARC to XML ihome.ust.hk/~lblkt/xml/marc2xml.html
Tables of Contents Enrichment Service www.blackwell.com/pdf/TOCEnrichment.pdf
XML and MARC: A Choice or Replacement? elane.stanford.edu/laneauth/ALAChicago2000.html
XMLMARC xmlmarc.stanford.edu