[XML4Lib] argg, namespaces

Erik Hetzner erik.hetzner at ucop.edu
Sat Jul 14 13:55:55 EDT 2007


At Sat, 14 Jul 2007 13:39:55 -0400,
Eric Lease Morgan <emorgan at nd.edu> wrote:
>
>
> Argg! Namespaces, a necessary evil.
>
> Seriously, how do I write a set of XPath statements to parse an XML
> file that contains a namespace?
>
> I have the following MODS file, and it includes an un-prefixed
> namespace (http://www.loc.gov/mods/v3):
>
> […]

> Then, using Perl, I create a LibXML parser, parse the file ($input)
> creating an object, and try to loop through all of the mods elements:
>
>    $parser = XML::LibXML->new;
>    $collection = $parser->parse_file( $input );
>    foreach my $mods ( $collection->findnodes( '//mods' )) {
>
>      my $titles = '';
>      foreach $node ( $mods->findnodes( './/titleInfo/title' )) {
>        $titles .= $node->textContent . '|'
>      }
>
>      # do more cool stuff here
>
>    }
>
> As written my Perl script never enters the foreach loop, but as soon
> as I remove the namespace declaration from the MODS file the script
> works just fine.
>
> How do I specify the namespace in my findnodes method?

Writing the XPath is easy:

.//mods:titleInfo/mods:title

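The trick is binding the mods prefix to <http://www.loc.gov/mods/v3>
in the XPath ‘context’. In XPath 1.0 an un-prefixed name such as
mods only matches elements in no namespace, which is why '//mods'
finds nothing once the file declares a default namespace. It looks
like with libxml you have to create a context object, register the
prefix on it, and run your queries through it. I haven’t tried this
code; for more info, see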
<http://search.cpan.org/dist/XML-LibXML/lib/XML/LibXML/XPathContext.pod>

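use XML::LibXML;

my $parser = XML::LibXML->new;
my $collection = $parser->parse_file( $input );
my $xc = XML::LibXML::XPathContext->new($collection);
$xc->registerNs('mods', 'http://www.loc.gov/mods/v3');
foreach my $mods ( $xc->findnodes( '//mods:mods' )) {
  # relative queries go through $xc too, with $mods as the context node:
  #   $xc->findnodes( './/mods:titleInfo/mods:title', $mods )
  […]
}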
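
For completeness, here is a rough, self-contained sketch of the whole loop from the question rewritten against the context object. The inline MODS record is a made-up stand-in (the real file is not shown in the thread), and the @records array is only there for illustration. The one assumption that matters is that the inner, relative query also goes through $xc, with the current $mods node passed as the second argument, so that the registered prefix is visible to it.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical sample data: a tiny MODS collection using the same
# un-prefixed (default) namespace as the file described in the question.
my $xml = <<'XML';
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods>
    <titleInfo><title>First title</title></titleInfo>
    <titleInfo><title>Second title</title></titleInfo>
  </mods>
</modsCollection>
XML

my $parser     = XML::LibXML->new;
my $collection = $parser->parse_string($xml);

# Bind the (arbitrary) prefix "mods" to the MODS namespace URI.
my $xc = XML::LibXML::XPathContext->new($collection);
$xc->registerNs( 'mods', 'http://www.loc.gov/mods/v3' );

my @records;    # illustrative only: one pipe-delimited title string per record
foreach my $mods ( $xc->findnodes('//mods:mods') ) {
    my $titles = '';

    # Second argument = context node, so './/' is relative to this record.
    foreach my $node ( $xc->findnodes( './/mods:titleInfo/mods:title', $mods ) ) {
        $titles .= $node->textContent . '|';
    }
    push @records, $titles;
}

print "$_\n" for @records;
```

The prefix used in the XPath does not have to match anything in the file; only the namespace URI passed to registerNs has to agree with the document.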

best,
Erik Hetzner
;; Erik Hetzner, California Digital Library
;; gnupg key id: 1024D/01DB07E3