<?xml version="1.0" encoding="UTF-8"?> <book lang="en-us"> <title>XML::LibXML</title> <bookinfo> <authorgroup> <author> <firstname>Matt</firstname> <surname>Sergeant</surname> </author> <author> <firstname>Christian</firstname> <surname>Glahn</surname> </author> </authorgroup> <edition>1.58</edition> <copyright> <year>2001-2004</year> <holder>AxKit.com Ltd; 2002-2004 Christian Glahn</holder> </copyright> </bookinfo> <chapter> <title>Introduction</title> <titleabbrev>README</titleabbrev> <para>This module implements a Perl interface to the Gnome libxml2 library. The libxml2 library provides interfaces for parsing and manipulating XML files. This module allows Perl programmers to make use of the highly capable validating XML parser and the high performance DOM implementation.</para> <sect1> <title>Important Notes</title> <para>XML::LibXML was almost entirely reimplemented between version 1.40 and version 1.49. This may cause problems on some production machines. Version 1.50 applied many compatibility fixes, so programs written for XML::LibXML 1.40 or earlier should run with version 1.50 again.</para> </sect1> <sect1> <title>Dependencies</title> <para>Prior to installation you MUST have installed the libxml2 library. You can get the latest libxml2 version from</para> <para>http://xmlsoft.org</para> <para>Without libxml2 installed this module will neither build nor run.</para> <para>XML::LibXML also requires the following packages:</para> <itemizedlist> <listitem> <para>XML::LibXML::Common - general functions used by various XML::LibXML modules</para> </listitem> <listitem> <para>XML::SAX - DOM building support from SAX</para> </listitem> <listitem> <para>XML::NamespaceSupport - DOM building support from SAX</para> </listitem> </itemizedlist> <para>These packages are required. If one is missing, some tests will fail.</para> <para>Again, libxml2 is required to make XML::LibXML work. 
The library is not just required to build XML::LibXML, it must be accessible at runtime as well. Because of this you need to make sure libxml2 is installed properly. To test this, run the xmllint program on your system. xmllint is shipped with libxml2 and therefore should be available.</para> </sect1> <sect1> <title>Installation</title> <para>To install XML::LibXML just follow the standard installation routine for Perl modules:</para> <orderedlist> <listitem> <para>perl Makefile.PL</para> </listitem> <listitem> <para>make</para> </listitem> <listitem> <para>make test</para> </listitem> <listitem> <para>make install # as superuser</para> </listitem> </orderedlist> <para>Note that you have to rebuild XML::LibXML once you upgrade libxml2. This avoids problems with binary incompatibilities between releases of the library.</para> <sect2> <title>Notes On libxml2 Versions</title> <para>libxml2 claims binary compatibility between its patch levels. This is not entirely true:</para> <para>First of all, XML::LibXML requires at least libxml2 2.4.25. Since most distributors ship older libxml2 versions, on most operating systems this means that the prebuilt packages have to be upgraded.</para> <para>If you already run an older version of XML::LibXML and wish to upgrade to a bug-fixed version of libxml2, note that libxml2 2.4.25 and the 2.5.x versions are not 100% binary compatible. If you intend to upgrade to such a version you will need to rebuild XML::LibXML (and XML::LibXML::Common) as well.</para> <para>Users of perl 5.005_03 and perl 5.6.1 with thread support should also avoid libxml2 version 2.4.25 and use later versions instead.</para> <para>If your libxml2 installation is not within your $PATH, 
you can set the environment variable XMLPREFIX=$YOURLIBXMLPREFIX to make XML::LibXML recognize the correct libxml2 version in use.</para> <para>e.g.</para> <programlisting> perl Makefile.PL XMLPREFIX=/usr/brand-new </programlisting> <para>will ask '/usr/brand-new/bin/xml2-config' about your real libxml2 configuration.</para> <para>Try to avoid setting INC and LIBS on the command line. Doing so skips the configuration tests, so there will be no report if the given installation is known to be broken.</para> </sect2> <sect2> <title>Which Version of libxml2 should be used?</title> <para>XML::LibXML is tested against many versions of libxml2 before it is released. As a result, some versions of libxml2 are known not to work properly with XML::LibXML. The Makefile.PL keeps a blacklist of these broken libxml2 versions.</para> <para>If you have one of these versions installed, you will be notified during installation. You may find that XML::LibXML builds and passes its tests in a particular environment. 
However, if XML::LibXML is run in such an environment, there will be no support at all!</para> <para>The following versions are tested:</para> <itemizedlist> <listitem> <para>past 2.4.20: tested; working</para> </listitem> <listitem> <para>2.4.25: tested; not working</para> </listitem> <listitem> <para>past 2.4.25: tested; working</para> </listitem> <listitem> <para>past 2.5.0: tested; broken attribute handling</para> </listitem> <listitem> <para>version 2.5.5: tested; tests pass, but known as broken</para> </listitem> <listitem> <para>up to version 2.5.11: tested; working</para> </listitem> <listitem> <para>version 2.6.0: tested; not working</para> </listitem> <listitem> <para>to version 2.6.2: tested; working</para> </listitem> <listitem> <para>version 2.6.3: tested; not working</para> </listitem> <listitem> <para>version 2.6.4: tested; not working (XML Schema errors)</para> </listitem> <listitem> <para>version 2.6.5: tested; not working (broken XIncludes)</para> </listitem> <listitem> <para>up to version 2.6.8: tested; working</para> </listitem> </itemizedlist> <para>An older version of libxml2 may pass all tests under certain conditions. This is no reason to assume that the version works on all platforms. Versions of libxml2 are marked as not working for good reasons.</para> </sect2> <sect2> <title>Notes for Microsoft Windows</title> <para>Thanks to Randy Kobes there is a precompiled PPM package available on</para> <para>http://theoryx5.uwinnipeg.ca/ppmpackages/</para> <para>Usually it takes a little time to build the package for the latest release.</para> </sect2> <sect2> <title>Notes for Mac OS X</title> <para>Due to the refactoring of the module, XML::LibXML will not run on Mac OS X anymore. It appears this is related to special linker options for that OS prior to version 10.2.2. 
Since I don't have full access to this OS, help and patches from OS X gurus are highly appreciated.</para> <para>It is confirmed that XML::LibXML builds and runs without problems on Mac OS X 10.2.6 and later.</para> </sect2> <sect2> <title>Notes for HPUX</title> <para>XML::LibXML requires libxml2 2.4.25 or later. That means there may not exist a usable binary libxml2 package for HPUX and XML::LibXML. For some reason the HPUX cc will not compile libxml2 correctly, which will force you to recompile perl with gcc (if you haven't already done that).</para> <para>Additionally I received the following note from Rozi Kovesdi:</para> <programlisting>Here is my report if someone else runs into the same problem:

Finally I am done with installing all the libraries and XML Perl modules

The combination that worked best for me was:
gcc
GNU make

Most importantly - before trying to install Perl modules that depend on
libxml2:

must set SHLIB_PATH to include the path to libxml2 shared library

assuming that you used the default:

export SHLIB=/usr/local/lib

also, make sure that the config files have execute permission:

/usr/local/bin/xml2-config
/usr/local/bin/xslt-config

they did not have +x after they were installed by 'make install'
and it took me a while to realize that this was my problem

or one can use:

perl Makefile.PL LIBS='-L/path/to/lib' INC='-I/path/to/include'</programlisting> </sect2> </sect1> <sect1> <title>Contact</title> <para>For suggestions etc. you may contact the maintainer directly at <email>christian.glahn@uibk.ac.at</email></para> <para>For bug reports, please use the CPAN request tracker on http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-LibXML</para> <para>XML::LibXML issues are also discussed, among other things, on the perl XML mailing list (<email>perl-xml@listserv.ActiveState.com</email>). In case of problems you should check the archives of that list first. Many problems have already been discussed there. 
You can find the list's archives at http://mailarchive.activestate.com/browse/perl-xml/</para> </sect1> <sect1> <title>Package History</title> <para>Versions &lt; 0.98 were maintained by Matt Sergeant</para> <para>Versions &gt;= 0.98 and &lt; 1.49 were maintained by Matt Sergeant and Christian Glahn</para> <para>Versions &gt;= 1.49 are maintained by Christian Glahn</para> <para>Versions &gt; 1.56 are co-maintained by Petr Pajas</para> </sect1> <sect1> <title>Patches and Developer Version</title> <para>As XML::LibXML is open source software, help and patches are appreciated. If you find a bug in the current release, make sure this bug still exists in the developer version of XML::LibXML. This version can be checked out from CVS via</para> <para>cvs -d:pserver:anonymous@axkit.org:/home/cvs -z3 co XML-LibXML</para> <para>Note that this account does not allow direct commits.</para> <para>Please consider the tests as correct. If any test fails, it is almost certainly related to a bug.</para> <para>If you find documentation bugs, please fix them in the libxml.dbk file, stored in the docs directory.</para> </sect1> <sect1> <title>Known Issues</title> <para>The push-parser implementation causes memory leaks.</para> </sect1> </chapter> <chapter> <title>License</title> <titleabbrev>LICENSE</titleabbrev> <para>This is free software; you may use it and distribute it under the same terms as Perl itself.</para> <para>Copyright 2001-2003 AxKit.com Ltd, All rights reserved.</para> <sect1> <title>Disclaimer</title> <para>THIS PROGRAM IS DISTRIBUTED IN THE HOPE THAT IT WILL BE USEFUL, BUT WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.</para> </sect1> </chapter> <chapter> <title>Perl Binding for libxml2</title> <titleabbrev>XML::LibXML</titleabbrev> <sect1> <title>Synopsis</title> <programlisting>use XML::LibXML;
my $parser = XML::LibXML->new();

my $doc = $parser->parse_string(&lt;&lt;'EOT');
&lt;some-xml/>
EOT</programlisting> </sect1> <sect1> <title>Description</title> <para>This module is an interface to the GNOME libxml2 DOM and SAX parsers and the DOM tree. It also provides an XML::XPath-like findnodes() interface, providing access to the XPath API in libxml2. The module is split into several packages which are not described in this section.</para> <para>For further information, please check the following documentation:</para> <variablelist> <varlistentry> <term>XML::LibXML::Parser</term> <listitem> <para>Parsing XML Files with XML::LibXML</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::DOM</term> <listitem> <para>XML::LibXML DOM Implementation</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::SAX</term> <listitem> <para>XML::LibXML direct SAX parser</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Document</term> <listitem> <para>XML::LibXML DOM Document Class</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Node</term> <listitem> <para>Abstract Base Class of XML::LibXML Nodes</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Element</term> <listitem> <para>XML::LibXML Class for Element Nodes</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Text</term> <listitem> <para>XML::LibXML Class for Text Nodes</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Comment</term> <listitem> <para>XML::LibXML Comment Nodes</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::CDATASection</term> <listitem> <para>XML::LibXML Class for CDATA Sections</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Attr</term> <listitem> <para>XML::LibXML Attribute Class</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::DocumentFragment</term> <listitem> <para>XML::LibXML's DOM L2 Document Fragment Implementation</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Namespace</term> <listitem> 
<para>XML::LibXML Namespace Implementation</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::PI</term> <listitem> <para>XML::LibXML Processing Instructions</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Dtd</term> <listitem> <para>XML::LibXML DTD Support</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::RelaxNG</term> <listitem> <para>XML::LibXML frontend for RelaxNG schema validation</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXMLguts</term> <listitem> <para>Internals of the Perl Layer for libxml2 (not done yet)</para> </listitem> </varlistentry> </variablelist> </sect1> <sect1> <title>Version Information</title> <para>Sometimes it is useful to find out which version of libxml2 XML::LibXML was compiled for. In most cases this is for debugging, or to check whether a given installation provides all the functionality the package needs. The functions XML::LibXML::LIBXML_DOTTED_VERSION and XML::LibXML::LIBXML_VERSION provide this version information. Both functions simply pass through the values of the similarly named macros of libxml2.</para> <variablelist> <varlistentry> <term>XML::LibXML::LIBXML_DOTTED_VERSION</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$Version_String = XML::LibXML::LIBXML_DOTTED_VERSION;</funcsynopsisinfo> </funcsynopsis> <para>Returns the version string of the libxml2 version XML::LibXML was compiled for. This will be "2.6.2" for "libxml2 2.6.2".</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::LIBXML_VERSION</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$Version_ID = XML::LibXML::LIBXML_VERSION;</funcsynopsisinfo> </funcsynopsis> <para>Returns the version id of the libxml2 version XML::LibXML was compiled for. This will be "20602" for "libxml2 2.6.2". Do not confuse this version id with $XML::LibXML::VERSION. 
The latter contains the version of XML::LibXML itself, while the former contains the version of libxml2 that XML::LibXML was compiled for.</para> </listitem> </varlistentry> </variablelist> </sect1> <sect1> <title>Related Modules</title> <para>The modules described in this section are not part of the XML::LibXML package itself. As they support some additional features, they are mentioned here.</para> <variablelist> <varlistentry> <term>XML::LibXSLT</term> <listitem> <para>XSLT Processor using libxslt and XML::LibXML</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Common</term> <listitem> <para>Common functions for XML::LibXML related Classes</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::Iterator</term> <listitem> <para>XML::LibXML Implementation of the DOM Traversal Specification</para> </listitem> </varlistentry> <varlistentry> <term>XML::LibXML::XPathContext</term> <listitem> <para>Advanced XPath processing using libxml2 and XML::LibXML</para> </listitem> </varlistentry> </variablelist> </sect1> <sect1> <title>XML::LibXML and XML::GDOME</title> <para>Note: <emphasis>THE FUNCTIONS DESCRIBED HERE ARE STILL EXPERIMENTAL</emphasis></para> <para>Although both modules make use of libxml2's XML capabilities, the DOM implementations of the two modules are not compatible. It is still possible, however, to exchange nodes from one DOM to the other. The concept of this exchange is similar to the function cloneNode(): the particular node is copied at a low level to the opposite DOM implementation.</para> <para>Since the DOM implementations cannot coexist within one document, each node that should be used has to be copied. Because you are always keeping two nodes, this may have a considerable impact on a machine's memory usage.</para> <para>XML::LibXML provides two functions to export or import GDOME nodes: import_GDOME() and export_GDOME(). Both functions take two parameters: the node and a flag for recursive import. 
The flag works as in cloneNode().</para> <para>The two functions allow XML::GDOME nodes to be exported and imported explicitly. However, XML::LibXML also allows the transparent import of XML::GDOME nodes in functions such as appendChild(), insertAfter() and so on. While native nodes are automatically adopted in most functions, XML::GDOME nodes are always cloned in advance. Thus if the original node is modified after the operation, the node in the XML::LibXML document will not reflect the change.</para> <variablelist> <varlistentry> <term>import_GDOME</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$libxmlnode = XML::LibXML->import_GDOME( $node, $deep );</funcsynopsisinfo> </funcsynopsis> <para>This clones an XML::GDOME node to an XML::LibXML node explicitly.</para> </listitem> </varlistentry> <varlistentry> <term>export_GDOME</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$gdomenode = XML::LibXML->export_GDOME( $node, $deep );</funcsynopsisinfo> </funcsynopsis> <para>Clones an XML::LibXML node into an XML::GDOME node.</para> </listitem> </varlistentry> </variablelist> </sect1> </chapter> <chapter> <title>Parsing XML Data with XML::LibXML</title> <titleabbrev>XML::LibXML::Parser</titleabbrev> <sect1> <title>Synopsis</title> <programlisting>use XML::LibXML;
my $parser = XML::LibXML->new();

my $doc = $parser->parse_string(&lt;&lt;'EOT');
&lt;some-xml/>
EOT
my $fdoc = $parser->parse_file( $xmlfile );

my $fhdoc = $parser->parse_fh( $xmlstream );

my $fragment = $parser->parse_xml_chunk( $xml_wb_chunk );</programlisting> </sect1> <sect1> <title>Parsing</title> <para>An XML document is read into a data structure such as a DOM tree by a piece of software called a parser. 
XML::LibXML currently provides four different parser interfaces:</para> <itemizedlist> <listitem> <para>A DOM Pull-Parser</para> </listitem> <listitem> <para>A DOM Push-Parser</para> </listitem> <listitem> <para>A SAX Parser</para> </listitem> <listitem> <para>A DOM based SAX Parser.</para> </listitem> </itemizedlist> <sect2> <title>Creating a Parser Instance</title> <para>XML::LibXML provides an OO interface to the libxml2 parser functions. Thus you have to create a parser instance before you can parse any XML data.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser = XML::LibXML->new();</funcsynopsisinfo> </funcsynopsis> <para>There is not much to say about the constructor. It simply creates a new parser instance.</para> <para>Although libxml2 mainly uses global flags to alter the behaviour of the parser, each XML::LibXML parser instance has its own flags and callbacks and does not interfere with other instances.</para> </listitem> </varlistentry> </variablelist> </sect2> <sect2> <title>DOM Parser</title> <para>One of the common parser interfaces of XML::LibXML is the DOM parser. This parser reads XML data into a DOM-like data structure, so each tag can be accessed and transformed.</para> <para>XML::LibXML's DOM parser is capable of parsing not only XML data, but also (strict) HTML and SGML files. There are three ways to parse documents - as a string, as a Perl filehandle, or as a filename. The return value from each is an XML::LibXML::Document object, which is a DOM object.</para> <para>All of the functions listed below will throw an exception if the document is invalid. To prevent this from causing your program to exit, wrap the call in an eval{} block.</para> <variablelist> <varlistentry> <term>parse_file</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_file( $xmlfilename );</funcsynopsisinfo> </funcsynopsis> <para>This function reads the file with the given absolute filename into memory. 
It causes XML::LibXML to use libxml2's file parser instead of letting perl read the file, as parse_fh() does. If you need to parse files directly, this function is the faster choice: it is about 6-8 times faster than parse_fh().</para> </listitem> </varlistentry> <varlistentry> <term>parse_fh</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_fh( $io_fh );</funcsynopsisinfo> </funcsynopsis> <para>parse_fh() parses an IOREF or a subclass of IO::Handle.</para> <para>Because the data comes from an open handle, libxml2's parser does not know the base URI of the document. To set the base URI, call parse_fh() as follows:</para> <programlisting>my $doc = $parser->parse_fh( $io_fh, $baseuri );</programlisting> </listitem> </varlistentry> <varlistentry> <term>parse_string</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_string( $xmlstring );</funcsynopsisinfo> </funcsynopsis> <para>This function is similar to parse_fh(), but it parses an XML document that is available as a single string in memory. 
Again, you can pass an optional base URI to the function.</para> <programlisting>my $doc = $parser->parse_string( $xmlstring, $baseuri );</programlisting> </listitem> </varlistentry> <varlistentry> <term>parse_html_file</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_html_file( $htmlfile );</funcsynopsisinfo> </funcsynopsis> <para>Similar to parse_file() but parses HTML (strict) documents.</para> </listitem> </varlistentry> <varlistentry> <term>parse_html_fh</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_html_fh( $io_fh );</funcsynopsisinfo> </funcsynopsis> <para>Similar to parse_fh() but parses HTML (strict) streams.</para> </listitem> </varlistentry> <varlistentry> <term>parse_html_string</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_html_string( $htmlstring );</funcsynopsisinfo> </funcsynopsis> <para>Similar to parse_string() but parses HTML (strict) strings.</para> </listitem> </varlistentry> <varlistentry> <term>parse_sgml_file</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_sgml_file( $sgmlfile );</funcsynopsisinfo> </funcsynopsis> <para>Similar to parse_file() but parses SGML documents.</para> </listitem> </varlistentry> <varlistentry> <term>parse_sgml_fh</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_sgml_fh( $io_fh );</funcsynopsisinfo> </funcsynopsis> <para>Similar to parse_fh() but parses SGML streams.</para> </listitem> </varlistentry> <varlistentry> <term>parse_sgml_string</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->parse_sgml_string( $sgmlstring );</funcsynopsisinfo> </funcsynopsis> <para>Similar to parse_string() but parses SGML strings.</para> </listitem> </varlistentry> </variablelist> <para>Parsing HTML may cause problems, especially if the ampersand ('&amp;') is used. This is a common problem if the HTML code being parsed contains links to CGI scripts. Such links cause the parser to throw errors. 
In such cases libxml2 still parses the entire document as if there was no error, but the error causes XML::LibXML to stop the parsing process. However, the document is not lost. Such HTML documents should be parsed using the <emphasis>recover</emphasis> flag. By default recovering is deactivated.</para> <para>The functions described above are implemented to parse well formed documents. In some cases a program gets well balanced XML instead of well formed documents (e.g. an XML fragment from a database). With XML::LibXML it is not necessary to wrap such fragments in your code, because XML::LibXML can parse well balanced XML fragments directly.</para> <variablelist> <varlistentry> <term>parse_balanced_chunk</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$fragment = $parser->parse_balanced_chunk( $wbxmlstring );</funcsynopsisinfo> </funcsynopsis> <para>This function parses a well balanced XML string into an XML::LibXML::DocumentFragment.</para> </listitem> </varlistentry> <varlistentry> <term>parse_xml_chunk</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$fragment = $parser->parse_xml_chunk( $wbxmlstring );</funcsynopsisinfo> </funcsynopsis> <para>This is the old name of parse_balanced_chunk(). Because it may cause confusion with the push parser interface, this function should not be used anymore.</para> </listitem> </varlistentry> </variablelist> <para>By default XML::LibXML does not process XInclude tags within an XML document (see the options section below). XML::LibXML allows a document to be post-processed to expand XInclude tags.</para> <variablelist> <varlistentry> <term>process_xincludes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->process_xincludes( $doc );</funcsynopsisinfo> </funcsynopsis> <para>After a document is parsed into a DOM structure, you may want to expand the document's XInclude tags. 
This function processes the given document structure and expands all XInclude tags (or throws an error) by using the flags and callbacks of the given parser instance.</para> <para>Note that the resulting tree contains some extra nodes (of type XML_XINCLUDE_START and XML_XINCLUDE_END) after successfully processing the document. These nodes indicate where data was included into the original tree. If the document is serialized, these extra nodes will not show up.</para> <para>Remember: a document with processed XIncludes differs from the original document after serialization, because the original XInclude tags will not be restored!</para> <para>If the parser flag "expand_xinclude" is set to 1, you do not need to post-process the parsed document.</para> </listitem> </varlistentry> <varlistentry> <term>processXIncludes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->processXIncludes( $doc );</funcsynopsisinfo> </funcsynopsis> <para>This is an alias for process_xincludes, with a Java-like function name.</para> </listitem> </varlistentry> </variablelist> </sect2> <sect2> <title>Push Parser</title> <para>XML::LibXML provides a push parser interface. Rather than pulling the data from a given source, the push parser waits for the data to be pushed into it.</para> <para>This allows one to parse large documents without waiting for the parser to finish. The interface is especially useful if a program needs to preprocess the incoming pieces of XML (e.g. to detect document boundaries).</para> <para>While the XML::LibXML parse_*() functions require the data to be well formed XML, the push parser will take any arbitrary string that contains some XML data. The only requirement is that all the pushed strings together form a well formed document. 
With the push parser interface a program can interrupt the parsing process as required, whereas the parse_*() functions do not offer enough flexibility.</para> <para>Unlike the pull parser implemented in parse_fh() or parse_file(), the push parser is not able to find the end of the document by itself. Thus the calling program needs to indicate explicitly when the parsing is done.</para> <para>In XML::LibXML this is done by a single function:</para> <variablelist> <varlistentry> <term>parse_chunk</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->parse_chunk($string, $terminate);</funcsynopsisinfo> </funcsynopsis> <para>parse_chunk() tries to parse a given chunk of data, which isn't necessarily well balanced data. The function takes two parameters: the chunk of data as a string and, optionally, a termination flag. If the termination flag is set to a true value (e.g. 1), the parsing will be stopped and the resulting document will be returned, as the following example shows:</para> <programlisting>my $parser = XML::LibXML->new;
for my $string ( "&lt;", "foo", ' bar="hello world"', "/>") {
    $parser->parse_chunk( $string );
}
my $doc = $parser->parse_chunk("", 1); # terminate the parsing</programlisting> </listitem> </varlistentry> </variablelist> <para>Internally XML::LibXML provides three functions that control the push parser process:</para> <variablelist> <varlistentry> <term>start_push</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->start_push();</funcsynopsisinfo> </funcsynopsis> <para>Initializes the push parser.</para> </listitem> </varlistentry> <varlistentry> <term>push</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->push(@data);</funcsynopsisinfo> </funcsynopsis> <para>This function pushes the data stored inside the array to libxml2's parser. 
Each entry in @data must be a normal scalar!</para> </listitem> </varlistentry> <varlistentry> <term>finish_push</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc = $parser->finish_push( $recover );</funcsynopsisinfo> </funcsynopsis> <para>This function returns the result of the parsing process. If this function is called without a parameter, it will complain about documents that are not well formed. If $recover is 1, the push parser can be used to restore broken or non well formed (XML) documents, as the following example shows:</para> <programlisting>eval {
    $parser->push( "&lt;foo>", "bar" );
    $doc = $parser->finish_push();    # will report broken XML
};
if ( $@ ) {
    # ...
}</programlisting> <para>This can be annoying if the closing tag is missed by accident. The following code will restore the document:</para> <programlisting>eval {
    $parser->push( "&lt;foo>", "bar" );
    $doc = $parser->finish_push(1);   # will return the data parsed
                                      # unless an error happened
};

print $doc->toString(); # returns "&lt;foo>bar&lt;/foo>"</programlisting> <para>Of course finish_push() will return nothing if no data was pushed to the parser before.</para> </listitem> </varlistentry> </variablelist> </sect2> <sect2> <title>DOM based SAX Parser</title> <para>XML::LibXML provides a DOM based SAX parser. The SAX parser is defined in XML::LibXML::SAX::Parser. As it is not a stream based parser, it parses documents into a DOM and traverses the DOM tree instead.</para> <para>The API of this parser is exactly the same as any other Perl SAX2 parser. See XML::SAX::Intro for details.</para> <para>Aside from the regular parsing methods, you can access the DOM tree traverser directly, using the generate() method:</para> <programlisting>my $doc = build_yourself_a_document();
my $saxparser = XML::LibXML::SAX::Parser->new( ... );
$saxparser->generate( $doc );</programlisting> <para>This is useful for serializing DOM trees, for example ones that you have already processed, or that are the result of XSLT processing.</para> <para><emphasis>WARNING</emphasis></para> <para>This is NOT a streaming SAX parser. As I said above, this parser reads the entire document into a DOM and serialises it. Some people couldn't read that in the paragraph above so I've added this warning.</para> <para>If you want a streaming SAX parser look at the XML::LibXML::SAX man page</para> </sect2> </sect1> <sect1> <title>Serialization</title> <para>XML::LibXML provides some functions to serialize nodes and documents. The serialization functions are described on the XML::LibXML::Node manpage or the XML::LibXML::Document manpage. XML::LibXML checks three global flags that alter the serialization process:</para> <itemizedlist> <listitem> <para>skipXMLDeclaration</para> </listitem> <listitem> <para>skipDTD</para> </listitem> <listitem> <para>setTagCompression</para> </listitem> </itemizedlist> <para>Of these three flags only setTagCompression is available for all serialization functions.</para> <para>Because XML::LibXML does not set these flags itself, one has to define them locally as the following example shows:</para> <programlisting>local $XML::LibXML::skipXMLDeclaration = 1;
local $XML::LibXML::skipDTD = 1;
local $XML::LibXML::setTagCompression = 1;</programlisting> <para>If skipXMLDeclaration is defined and not '0', the XML declaration is omitted during serialization.</para> <para>If skipDTD is defined and not '0', an existing DTD will not be serialized with the document.</para> <para>If setTagCompression is defined and not '0', empty tags are displayed as opening and closing tags rather than the shortcut. 
For example the empty tag <emphasis>foo</emphasis> will be rendered as <emphasis><foo></foo></emphasis> rather than <emphasis><foo/></emphasis>.</para> </sect1> <sect1> <title>Parser Options</title> <para>LibXML options are global (unfortunately this is a limitation of the underlying implementation, not this interface). They can be set using either $parser->option(...) or XML::LibXML->option(...); both are treated in the same manner. Note that even two parser processes will share some of the same options, so be careful out there!</para> <para>Every option returns the previous value, and can be called without parameters to get the current value.</para> <variablelist> <varlistentry> <term>validation</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->validation(1);</funcsynopsisinfo> </funcsynopsis> <para>Turn validation on (or off). Defaults to off.</para> </listitem> </varlistentry> <varlistentry> <term>recover</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->recover(1);</funcsynopsisinfo> </funcsynopsis> <para>Turn the parser's recover mode on (or off). Defaults to off.</para> <para>This allows one to parse broken XML data into memory. This switch will only work with XML data rather than HTML data. Also the validation will be switched off automatically.</para> <para>The recover mode very efficiently helps to recover documents that are almost well-formed, for example a document that forgets to close the document tag (or any other tag inside the document). The recover mode of XML::LibXML has problems restoring documents that are more like well balanced chunks.</para> <para>XML::LibXML will only parse until the first fatal error occurs.</para> </listitem> </varlistentry> <varlistentry> <term>expand_entities</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->expand_entities(0);</funcsynopsisinfo> </funcsynopsis> <para>Turn entity expansion on or off, enabled by default. 
If entity expansion is off, any external parsed entities in the document are left as entities. Probably not very useful for most purposes.</para> </listitem> </varlistentry> <varlistentry> <term>keep_blanks</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->keep_blanks(0);</funcsynopsisinfo> </funcsynopsis> <para>Allows you to turn off XML::LibXML's default behaviour of maintaining whitespace in the document.</para> </listitem> </varlistentry> <varlistentry> <term>pedantic_parser</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->pedantic_parser(1);</funcsynopsisinfo> </funcsynopsis> <para>You can make XML::LibXML more pedantic if you want to.</para> </listitem> </varlistentry> <varlistentry> <term>line_numbers</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->line_numbers(1);</funcsynopsisinfo> </funcsynopsis> <para>If this option is activated XML::LibXML will store the line number of a node. This gives more information about where a validation error occurred. It can also be used to find out about the position of a node after parsing (see also XML::LibXML::Node::line_number()).</para> <para>By default line numbering is switched off (0).</para> </listitem> </varlistentry> <varlistentry> <term>load_ext_dtd</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->load_ext_dtd(1);</funcsynopsisinfo> </funcsynopsis> <para>Load external DTD subsets while parsing.</para> </listitem> </varlistentry> <varlistentry> <term>complete_attributes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->complete_attributes(1);</funcsynopsisinfo> </funcsynopsis> <para>Complete the element's attribute lists with the ones defaulted from the DTDs. By default, this option is enabled.</para> </listitem> </varlistentry> <varlistentry> <term>expand_xinclude</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->expand_xinclude(1);</funcsynopsisinfo> </funcsynopsis> <para>Expands XInclude tags immediately while parsing the document. 
This flag ensures that the parser callbacks are used while parsing the included document.</para> </listitem> </varlistentry> <varlistentry> <term>load_catalog</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->load_catalog( $catalog_file );</funcsynopsisinfo> </funcsynopsis> <para>Will use $catalog_file as a catalog during all parsing processes. Using a catalog will significantly speed up parsing processes if many external resources are loaded into the parsed documents (such as DTDs or XIncludes).</para> <para>Note that catalogs will not be available if an external entity handler was specified. At present it is not possible to make use of both types of resolving systems at the same time.</para> </listitem> </varlistentry> <varlistentry> <term>base_uri</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->base_uri( $your_base_uri );</funcsynopsisinfo> </funcsynopsis> <para>In case of parsing strings or file handles, XML::LibXML doesn't know about the base URI of the document. To make relative references such as XIncludes work, one has to set a separate base URI, which is then used for the parsed documents.</para> </listitem> </varlistentry> <varlistentry> <term>gdome_dom</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->gdome_dom(1);</funcsynopsisinfo> </funcsynopsis> <para>THIS FLAG IS EXPERIMENTAL!</para> <para>Although quite powerful, XML::LibXML's DOM implementation is limited if one needs or wants full DOM level 2 or level 3 support. XML::GDOME is based on libxml2 as well but provides a rather complete DOM implementation by wrapping libgdome. This allows you to make use of XML::LibXML's full parser options and XML::GDOME's DOM implementation at the same time.</para> <para>To make use of this function, one has to install libgdome and configure XML::LibXML to use this library. 
For this you need to rebuild XML::LibXML!</para> </listitem> </varlistentry> <varlistentry> <term>clean_namespaces</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->clean_namespaces( 1 );</funcsynopsisinfo> </funcsynopsis> <para>libxml2 2.6.0 and later allow stripping redundant namespace declarations from the DOM tree. To do this, one has to set clean_namespaces() to 1 (TRUE). By default no namespace cleanup is done.</para> </listitem> </varlistentry> </variablelist> <sect2> <title>Input Callbacks</title> <para>If libxml2 has to load external documents during parsing, this may cause strange results if the location is not an HTTP, FTP or relative location. To get around this limitation, one may add one's own input handler to open, read and close particular locations or URI classes.</para> <para>The input callbacks are used whenever LibXML has to get something other than external parsed entities from somewhere. The input callbacks in LibXML are stacked on top of the original input callbacks within the libxml library. This means that if you decide not to use your own callbacks (see match_callback()), then you can revert to the default way of handling input. This makes it possible, for example, to handle only certain URI schemes.</para> <para>Callbacks are only used on files, but not on strings or filehandles. This is because LibXML requires the match event to find out which callback set shall be used for the current input stream. LibXML can decide this only before the stream is open. 
For LibXML, strings and filehandles are already opened streams.</para> <para>The following callbacks are defined:</para> <variablelist> <varlistentry> <term>match_callback</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->match_callback($subref);</funcsynopsisinfo> </funcsynopsis> <para>If you want to handle the URI, simply return a true value from this callback.</para> </listitem> </varlistentry> <varlistentry> <term>open_callback</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->open_callback($subref);</funcsynopsisinfo> </funcsynopsis> <para>Open something and return it to handle that resource.</para> </listitem> </varlistentry> <varlistentry> <term>read_callback</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->read_callback($subref);</funcsynopsisinfo> </funcsynopsis> <para>Read a certain number of bytes from the resource. This callback is called even if the entire document has already been read. This callback has to return a string which will be parsed by the libxml2 parser.</para> </listitem> </varlistentry> <varlistentry> <term>close_callback</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parser->close_callback($subref);</funcsynopsisinfo> </funcsynopsis> <para>Close the handle associated with the resource.</para> </listitem> </varlistentry> </variablelist> <para>It is important not to create a new parser instance and parse some XML data from within any callback. This is forbidden, because the new parser will override the existing callbacks and will leave the calling parser in an undefined state. Most likely memory violations will follow and break the running parsing process without returning control to the perl layer.</para> <para>The following example explains the concept a bit. 
It is a purely fictitious example that uses a MyScheme::Handler object that responds to methods similar to an IO::Handle.</para> <programlisting>$parser->match_callback(\&match_uri);
$parser->open_callback(\&open_uri);
$parser->read_callback(\&read_uri);
$parser->close_callback(\&close_uri);

sub match_uri {
    my $uri = shift;
    return $uri =~ /^myscheme:/;
}

sub open_uri {
    my $uri = shift;
    return MyScheme::Handler->new($uri);
}

sub read_uri {
    my $handler = shift;
    my $length = shift;
    my $buffer;
    read($handler, $buffer, $length);
    return $buffer;
}

sub close_uri {
    my $handler = shift;
    close($handler);
}</programlisting> <para>A more realistic example can be found in the "example" directory.</para> <para>Since the parser requires all callbacks to be defined, it is also possible to set all callbacks with a single call of callbacks(). This would simplify the example code to:</para> <programlisting>$parser->callbacks( \&match_uri, \&open_uri, \&read_uri, \&close_uri);</programlisting> <para>All functions that are used to set the callbacks can also be used to retrieve the callbacks from the parser.</para> <para>Optionally it is possible to apply global callbacks on the XML::LibXML class level. This allows multiple parsers to share the same callbacks. To set these global callbacks one can use the callback access functions directly on the class.</para> <programlisting>XML::LibXML->callbacks( \&match_uri, \&open_uri, \&read_uri, \&close_uri);</programlisting> <para>The previous code snippet will set the callbacks from the first example as global callbacks.</para> </sect2> </sect1> <sect1> <title>Error Reporting</title> <para>XML::LibXML throws exceptions during parsing, validation or XPath processing (and on some other occasions). These errors can be caught by using <emphasis>eval</emphasis> blocks. The error will then be stored in <emphasis>$@</emphasis>. Alternatively one can use the get_last_error() function of XML::LibXML. 
It will return the same string that is stored in <emphasis>$@</emphasis>. Using get_last_error() still makes it necessary to eval the statement, since these function groups will die() on errors.</para> <para>Note that the use of get_last_error() still requires eval blocks. XML::LibXML throws errors as they occur and does not wait for the user to test for them. This is a very common misunderstanding in the use of XML::LibXML. If the eval is omitted, XML::LibXML will always halt your script by "croaking" (see Carp man page for details).</para> <para>Also note that an increasing number of functions throw errors if bad data is passed. If you cannot ensure that valid data is passed to XML::LibXML, you should eval these functions.</para> <para>get_last_error() can be called either by the class itself or by a parser instance:</para> <programlisting>$errstring = XML::LibXML->get_last_error();
$errstring = $parser->get_last_error();</programlisting> <para>However, XML::LibXML exceptions are global. That means if get_last_error() is called on a parser instance, the last <emphasis>global</emphasis> error will be returned. This is not necessarily the error caused by the parser instance itself.</para> </sect1> </chapter> <chapter> <title>XML::LibXML direct SAX parser</title> <titleabbrev>XML::LibXML::SAX</titleabbrev> <sect1> <title>Description</title> <para>XML::LibXML provides an interface to libxml2's direct SAX interface. Through this interface it is possible to generate SAX events directly while parsing a document. While using the SAX parser XML::LibXML will not create a DOM Document tree.</para> <para>Such an interface is useful if very large XML documents have to be processed and no DOM functions are required. By using this interface it is possible to read data stored within an XML document directly into the application data structures without loading the document into memory.</para> <para>The SAX interface of XML::LibXML is based on the famous XML::SAX interface. 
It uses the generic interface as provided by XML::SAX::Base.</para> <para>In addition to the generic functions, which are only able to process entire documents, XML::LibXML::SAX provides <emphasis>parse_chunk()</emphasis>. This method generates SAX events from well balanced data such as is often provided by databases.</para> <para><emphasis>NOTE:</emphasis> At the moment XML::LibXML provides only an incomplete interface to libxml2's native SAX implementation. The current implementation has not been tested in production environments. It may cause significant memory problems or show wrong behaviour. If you run into specific problems using this part of XML::LibXML, let me know.</para> </sect1> </chapter> <chapter> <title>Building DOM trees from SAX events.</title> <titleabbrev>XML::LibXML::SAX::Builder</titleabbrev> <sect1> <title>Synopsis</title> <programlisting>my $builder = XML::LibXML::SAX::Builder->new();
my $gen = XML::Generator::DBI->new(Handler => $builder, dbh => $dbh);
$gen->execute("SELECT * FROM Users");
my $doc = $builder->result();</programlisting> </sect1> <sect1> <title>Description</title> <para>This is a SAX handler that generates a DOM tree from SAX events. Usage is as above. Input is accepted from any SAX1 or SAX2 event generator.</para> <para>Building DOM trees from SAX events is quite easy with XML::LibXML::SAX::Builder. The class is designed as a SAX2 final handler, not as a filter!</para> <para>Since SAX is strictly stream oriented, you should not expect anything to return from a generator. Instead you have to ask the builder instance directly to get the document built. 
XML::LibXML::SAX::Builder's result() function holds the document generated from the last SAX stream.</para> </sect1> </chapter> <chapter> <title>XML::LibXML DOM Implementation</title> <titleabbrev>XML::LibXML::DOM</titleabbrev> <sect1> <title>Description</title> <para>XML::LibXML provides a lightweight interface to <emphasis>modify</emphasis> a node of the document tree generated by the XML::LibXML parser. This interface follows as far as possible the DOM Level 3 specification. In addition to the specified functions, XML::LibXML supports some functions that are more handy to use in the perl environment.</para> <para>One also has to remember that XML::LibXML is an interface to libxml2 nodes which actually reside on the C-Level of XML::LibXML. This means each node is a reference to a structure different from a perl hash or array. The only way to access these structures' values is through the DOM interface provided by XML::LibXML. This also means that one <emphasis>can't</emphasis> simply inherit from an XML::LibXML node and add new member variables as if they were hash keys.</para> <para>The DOM interface of XML::LibXML does not intend to implement a full DOM interface as it is done by XML::GDOME and used for full featured applications. Rather, it offers a simple way to build or modify documents that are created by XML::LibXML's parser.</para> <para>Another target of the XML::LibXML interface is to make the interfaces of libxml2 available to the perl community. This also includes workarounds for some features where libxml2 assumes more control over the C-Level than most perl users have.</para> <para>One of the most important parts of the XML::LibXML DOM interface is that the interfaces try to follow the DOM Level 3 specification rather strictly. This means the interface functions are named as the DOM specification says and not what widespread Java interfaces claim to be standard. 
Although there are several functions that have only a singular interface that conforms to the DOM spec, XML::LibXML provides an additional Java style alias interface.</para> <para>Also there are some function interfaces left over from early stages of XML::LibXML for compatibility reasons. These interfaces are for compatibility reasons <emphasis>only</emphasis>. They might disappear in one of the future versions of XML::LibXML, so a user is requested to switch over to the official functions.</para> <para>More recent versions of perl (e.g. 5.6.1 or higher) support special flags to distinguish between UTF8 and so called binary data. For these versions XML::LibXML provides functionality to make efficient use of these flags: if a document has set an encoding other than UTF8, all strings that are not already in UTF8 are implicitly encoded from the document encoding to UTF8. On output these strings are commonly returned as UTF8 unless a user explicitly requests the original (aka. document) encoding.</para> <para>Older versions of perl (such as 5.00503 or less) do not support these flags. If XML::LibXML is built for these versions, all strings have to be encoded to UTF8 manually before they are passed to any DOM functions.</para> <para><emphasis>NOTE:</emphasis> XML::LibXML's magic encoding may not work on all platforms. Some platforms are known to have a broken iconv(), which is partly used by libxml2. To test if your platform works correctly with your language encoding, build a simple document in the particular encoding and try to parse it with XML::LibXML. If your document gets parsed without causing any segmentation faults, bus errors or whatever your OS throws, the encoding support should work on your platform. An example for such a test can be found in test 19encoding.t of the distribution.</para> <para><emphasis>Namespaces and XML::LibXML's DOM implementation</emphasis></para> <para>XML::LibXML's DOM implementation follows the DOM implementation of libxml2. This is important to know if namespaces are used. 
Namespaces cannot be declared on a document node. This is basically because XPath doesn't know about document nodes. Therefore namespaces have to be declared on element nodes. This can happen explicitly by using XML::LibXML::Element's setNamespace() function or more or less implicitly by using XML::LibXML::Document's createElementNS() or createAttributeNS() function. If a namespace is not declared on the documentElement, the namespace will be locally declared for the newly created node. In case of attributes this may look a bit confusing, since these nodes cannot have namespace declarations themselves. In this case the namespace is internally applied to the attribute and later declared on the node the attribute is appended to.</para> <para>The following example may explain this a bit:</para> <programlisting>my $doc = XML::LibXML->createDocument;
my $root = $doc->createElementNS( "", "foo" );
$doc->setDocumentElement( $root );
my $attr = $doc->createAttributeNS( "bar", "bar:foo", "test" );
$root->setAttributeNodeNS( $attr );</programlisting> <para>This piece of code will result in the following document:</para> <programlisting><?xml version="1.0"?>
<foo xmlns:bar="bar" bar:foo="test"/></programlisting> <para>Note that the namespace is declared on the document element during the setAttributeNodeNS() call.</para> <para>Here it is important to repeat the specification: While working with namespaces you should use the namespace aware functions instead of the simplified versions. For example you should <emphasis>never</emphasis> use setAttributeNode() but setAttributeNodeNS().</para> </sect1> </chapter> <chapter> <title>XML::LibXML DOM Document Class</title> <titleabbrev>XML::LibXML::Document</titleabbrev> <para>The Document Class is in most cases the result of a parsing process. But sometimes it is necessary to create a Document from scratch. 
The DOM Document Class provides functions that conform to the DOM Core naming style.</para> <para>It inherits all functions from <function>XML::LibXML::Node</function> as specified in the DOM specification. This enables access to the nodes besides the root element on document level - a <function>DTD</function> for example. The support for these nodes is limited at the moment.</para> <para>While nodes are generally bound to a document in the DOM concept, it is suggested that one should always create a node not bound to any document. There is no need to actually include the node in the document, but once the node is bound to a document, it is quite safe that all strings have the correct encoding. If an unbound text node with an iso encoded string is created (e.g. with $CLASS->new()), the <function>toString</function> function may not return the expected result.</para> <para>All this seems like a limitation as long as UTF8 encoding is assured. If iso encoded strings come into play it is much safer to use the node creation functions of <emphasis>XML::LibXML::Document</emphasis>.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dom = XML::LibXML::Document->new( $version, $encoding );</funcsynopsisinfo> </funcsynopsis> <para>alias for createDocument()</para> </listitem> </varlistentry> <varlistentry> <term>createDocument</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dom = XML::LibXML::Document->createDocument( $version, $encoding );</funcsynopsisinfo> </funcsynopsis> <para>The constructor for the document class. As parameters it takes the version string and (optionally) the encoding string. Simply calling <emphasis>createDocument</emphasis>() will create the document:</para> <programlisting><?xml version="your version" encoding="your encoding"?></programlisting> <para>Both parameters are optional. The default value for <emphasis>$version</emphasis> is <function>1.0</function>, of course. 
If the <emphasis>$encoding</emphasis> parameter is not set, the encoding will be left unset, which means UTF8 is implied.</para> <para>The call of <emphasis>createDocument</emphasis>() without any parameter will result in the following code:</para> <programlisting><?xml version="1.0"?></programlisting> <para>Alternatively one can call this constructor directly from the XML::LibXML class level, to avoid some typing. This will not have any effect on the class instance, which is always XML::LibXML::Document.</para> <programlisting>my $document = XML::LibXML->createDocument( "1.0", "UTF8" );</programlisting> <para>is therefore a shortcut for</para> <programlisting>my $document = XML::LibXML::Document->createDocument( "1.0", "UTF8" );</programlisting> </listitem> </varlistentry> <varlistentry> <term>encoding</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$strEncoding = $doc->encoding();</funcsynopsisinfo> </funcsynopsis> <para>returns the encoding string of the document.</para> <programlisting>my $doc = XML::LibXML->createDocument( "1.0", "ISO-8859-15" );
print $doc->encoding; # prints ISO-8859-15</programlisting> <para>Optionally this function can be accessed by <emphasis>actualEncoding</emphasis> or <emphasis>getEncoding</emphasis>.</para> </listitem> </varlistentry> <varlistentry> <term>setEncoding</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc->setEncoding($new_encoding);</funcsynopsisinfo> </funcsynopsis> <para>From time to time it is useful to change the effective encoding of a document. This method provides the interface to manipulate the encoding of a document.</para> <para>Note that this function has to be used very carefully: you can't simply convert one encoding into any other, since some (or even all) characters may not exist in the new encoding. XML::LibXML will not test if the operation is allowed or possible for the given document. 
The only switching assured to work is to UTF8.</para> </listitem> </varlistentry> <varlistentry> <term>version</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$strVersion = $doc->version();</funcsynopsisinfo> </funcsynopsis> <para>returns the version string of the document.</para> <para><emphasis>getVersion()</emphasis> is an alternative form of this function.</para> </listitem> </varlistentry> <varlistentry> <term>standalone</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc->standalone</funcsynopsisinfo> </funcsynopsis> <para>This function returns the numerical value of the standalone attribute in a document's XML declaration. It returns <emphasis>1</emphasis> if standalone="yes" was found, <emphasis>0</emphasis> if standalone="no" was found and <emphasis>-1</emphasis> if standalone was not specified (default on creation).</para> </listitem> </varlistentry> <varlistentry> <term>setStandalone</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc->setStandalone($numvalue);</funcsynopsisinfo> </funcsynopsis> <para>Through this method it is possible to alter the value of a document's standalone attribute. Set it to <emphasis>1</emphasis> to set standalone="yes", to <emphasis>0</emphasis> to set standalone="no" or set it to <emphasis>-1</emphasis> to remove the standalone attribute from the XML declaration.</para> </listitem> </varlistentry> <varlistentry> <term>compression</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $compression = $doc->compression;</funcsynopsisinfo> </funcsynopsis> <para>libxml2 allows reading of documents directly from gzipped files. In this case the compression variable is set to the compression level of that file (0-8). 
If XML::LibXML parsed a different source or the file wasn't compressed, the returned value will be <emphasis>-1</emphasis>.</para> </listitem> </varlistentry> <varlistentry> <term>setCompression</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc->setCompression($ziplevel);</funcsynopsisinfo> </funcsynopsis> <para>If one intends to write the document directly to a file, it is possible to set the compression level for a given document. This level can be in the range from 0 to 8. If XML::LibXML should not try to compress, use <emphasis>-1</emphasis> (default).</para> <para>Note that this feature will <emphasis>only</emphasis> work if libxml2 is compiled with zlib support and toFile() is used for output.</para> </listitem> </varlistentry> <varlistentry> <term>toString</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$docstring = $dom->toString($format);</funcsynopsisinfo> </funcsynopsis> <para><emphasis>toString</emphasis> is a deparsing function, so the DOM Tree can be translated into a string, ready for output.</para> <para>The optional <emphasis>$format</emphasis> parameter sets the indenting of the output. This parameter is expected to be an <function>integer</function> value that specifies how the output should be indented. The format parameter can have three different values if it is used:</para> <para>If $format is 0, then the document is dumped as it was originally parsed.</para> <para>If $format is 1, libxml2 will add ignorable whitespace, so the nodes' content is easier to read. Existing text nodes will not be altered.</para> <para>If $format is 2 (or higher), libxml2 will act as for $format == 1, but it adds a leading and a trailing linebreak to each text node.</para> <para>libxml2 uses a hardcoded indentation of 2 space characters per indentation level. This value cannot be altered at runtime.</para> <para><emphasis>NOTE</emphasis>: XML::LibXML::Document::toString returns the data in the document encoding rather than UTF8! 
If you want UTF8 encoded XML, you have to change the encoding by using <function>setEncoding()</function>.</para> </listitem> </varlistentry> <varlistentry> <term>toStringC14N</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$c14nstr = $doc->toStringC14N($comment_flag,$xpath);</funcsynopsisinfo> </funcsynopsis> <para>A variation of toString that returns the canonical form of the given document.</para> </listitem> </varlistentry> <varlistentry> <term>serialize</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$str = $doc->serialize($format);</funcsynopsisinfo> </funcsynopsis> <para>Alternative form of toString(). This function name was added to be more conformant with libxml2's examples.</para> </listitem> </varlistentry> <varlistentry> <term>serialize_c14n</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$c14nstr = $doc->serialize_c14n($comment_flag,$xpath);</funcsynopsisinfo> </funcsynopsis> <para>Alternative form of toStringC14N().</para> </listitem> </varlistentry> <varlistentry> <term>toFile</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$state = $doc->toFile($filename, $format);</funcsynopsisinfo> </funcsynopsis> <para>This function is similar to toString(), but it writes the document directly to a file. 
This function is very useful if one needs to store large documents.</para> <para>The format parameter has the same behaviour as in toString().</para> </listitem> </varlistentry> <varlistentry> <term>toFH</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$state = $doc->toFH($fh, $format);</funcsynopsisinfo> </funcsynopsis> <para>This function is similar to toString(), but it writes the document directly to a filehandle or a stream.</para> <para>The format parameter has the same behaviour as in toString().</para> </listitem> </varlistentry> <varlistentry> <term>toStringHTML</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$str = $document->toStringHTML();</funcsynopsisinfo> </funcsynopsis> <para><emphasis>toStringHTML</emphasis> deparses the tree to a string as HTML. With this method indenting is automatic and managed by libxml2 internally.</para> </listitem> </varlistentry> <varlistentry> <term>serialize_html</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$str = $document->serialize_html();</funcsynopsisinfo> </funcsynopsis> <para>Alternative form of toStringHTML().</para> </listitem> </varlistentry> <varlistentry> <term>is_valid</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$bool = $dom->is_valid();</funcsynopsisinfo> </funcsynopsis> <para>Returns either TRUE or FALSE depending on whether the DOM Tree is a valid Document or not.</para> <para>You may also pass in an XML::LibXML::Dtd object, to validate against an external DTD:</para> <programlisting>if (!$dom->is_valid($dtd)) {
    warn("document is not valid!");
}</programlisting> </listitem> </varlistentry> <varlistentry> <term>validate</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dom->validate();</funcsynopsisinfo> </funcsynopsis> <para>This is an exception throwing equivalent of is_valid. If the document is not valid it will throw an exception containing the error. 
This allows much better error reporting than the simple true or false result of is_valid.</para> <para>Again, you may pass in a DTD object.</para> </listitem> </varlistentry> <varlistentry> <term>documentElement</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$root = $dom->documentElement();</funcsynopsisinfo> </funcsynopsis> <para>Returns the root element of the Document. A document can have just one root element to contain the document's data.</para> <para>Optionally one can use <emphasis>getDocumentElement</emphasis>.</para> </listitem> </varlistentry> <varlistentry> <term>setDocumentElement</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dom->setDocumentElement( $root );</funcsynopsisinfo> </funcsynopsis> <para>This function enables you to set the root element for a document. The function supports the import of a node from a different document tree.</para> </listitem> </varlistentry> <varlistentry> <term>createElement</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$element = $dom->createElement( $nodename );</funcsynopsisinfo> </funcsynopsis> <para>This function creates a new Element Node bound to the DOM with the name <function>$nodename</function>.</para> </listitem> </varlistentry> <varlistentry> <term>createElementNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$element = $dom->createElementNS( $namespaceURI, $qname );</funcsynopsisinfo> </funcsynopsis> <para>This function creates a new Element Node bound to the DOM with the name <function>$qname</function> and placed in the given namespace.</para> </listitem> </varlistentry> <varlistentry> <term>createTextNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text = $dom->createTextNode( $content_text );</funcsynopsisinfo> </funcsynopsis> <para>An equivalent of <emphasis>createElement</emphasis>, but it creates a <emphasis>Text Node</emphasis> bound to the DOM.</para> </listitem> </varlistentry> <varlistentry> <term>createComment</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$comment = 
$dom->createComment( $comment_text );</funcsynopsisinfo> </funcsynopsis> <para>The equivalent of <emphasis>createElement</emphasis>, but it creates a <emphasis>Comment Node</emphasis> bound to the DOM.</para> </listitem> </varlistentry> <varlistentry> <term>createAttribute</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$attrnode = $doc->createAttribute($name [,$value]);</funcsynopsisinfo> </funcsynopsis> <para>Creates a new Attribute node.</para> </listitem> </varlistentry> <varlistentry> <term>createAttributeNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$attrnode = $doc->createAttributeNS( $namespaceURI, $name [,$value] );</funcsynopsisinfo> </funcsynopsis> <para>Creates an Attribute bound to a namespace.</para> </listitem> </varlistentry> <varlistentry> <term>createDocumentFragment</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$fragment = $doc->createDocumentFragment();</funcsynopsisinfo> </funcsynopsis> <para>This function creates a DocumentFragment.</para> </listitem> </varlistentry> <varlistentry> <term>createCDATASection</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$cdata = $dom->createCDATASection( $cdata_content );</funcsynopsisinfo> </funcsynopsis> <para>Similar to createTextNode and createComment, this function creates a CDataSection bound to the current DOM.</para> </listitem> </varlistentry> <varlistentry> <term>createProcessingInstruction</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $pi = $doc->createProcessingInstruction( $target, $data );</funcsynopsisinfo> </funcsynopsis> <para>Creates a processing instruction node.</para> <para>Since this method name is quite long, one may use its short form <emphasis>createPI()</emphasis>.</para> </listitem> </varlistentry> <varlistentry> <term>createEntityReference</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $entref = $doc->createEntityReference($refname);</funcsynopsisinfo> </funcsynopsis> <para>If a document has a DTD specified, one can create entity references by using this function. 
If one wants to add a entity reference to the document, this reference has to be created by this function.</para> <para>An entity reference is unique to a document and cannot be passed to other documents as other nodes can be passed.</para> <para><emphasis>NOTE:</emphasis> A text content containing something that looks like an entity reference, will not be expanded to a real entity reference unless it is a predefined entity</para> <programlisting> my $string = "&foo;"; $some_element->appendText( $string ); print $some_element->textContent; # prints "&amp;foo;"</programlisting> </listitem> </varlistentry> <varlistentry> <term>createInternalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dtd = $document->createInternalSubset( $rootnode, $public, $system);</funcsynopsisinfo> </funcsynopsis> <para>This function creates and adds an internal subset to the given document. Because the function automaticly adds the DTD to the document there is no need to add the created node explicitly to the document.</para> <programlisting> my $document = XML::LibXML::Document->new(); my $dtd = $document->createInternalSubset( "foo", undef, "foo.dtd" );</programlisting> <para>will result in the following XML document:</para> <programlisting><?xml version="1.0"?> <!DOCTYPE foo SYSTEM "foo.dtd"> </programlisting> <para>By setting the public parameter it is possible to set PUBLIC dtds to a given document. 
So</para> <programlisting>my $document = XML::LibXML::Document->new(); my $dtd = $document->createInternalSubset( "foo", "-//FOO//DTD FOO 0.1//EN", undef ); </programlisting> <para>will cause the following declaration to be created on the document:</para> <programlisting><?xml version="1.0"?> <!DOCTYPE foo PUBLIC "-//FOO//DTD FOO 0.1//EN"></programlisting> </listitem> </varlistentry> <varlistentry> <term>createExternalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dtd = $document->createExternalSubset( $rootnode, $public, $system);</funcsynopsisinfo> </funcsynopsis> <para>This function is similar to <function>createInternalSubset()</function> but this DTD is considered to be external and is therefore not added to the document itself. Nevertheless it can be used for validation purposes.</para> </listitem> </varlistentry> <varlistentry> <term>importNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$document->importNode( $node );</funcsynopsisinfo> </funcsynopsis> <para>If a node is not part of a document, it can be imported to another document. As specified in DOM Level 2 Specification the Node will not be altered or removed from its original document (<function>$node->cloneNode(1)</function> will get called implicitly).</para> <para><emphasis>NOTE:</emphasis> Don't try to use importNode() to import subtrees that contain an entity reference - even if the entity reference is the root node of the subtree. This will cause serious problems to your program. This is a limitation of libxml2 and not of XML::LibXML itself.</para> </listitem> </varlistentry> <varlistentry> <term>adoptNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$document->adoptNode( $node );</funcsynopsisinfo> </funcsynopsis> <para>If a node is not part of a document, it can be imported to another document. 
As specified in the DOM Level 3 Specification, the node will not be altered but it will be removed from its original document.</para> <para>After a document has adopted a node, the node, its attributes and all its descendants belong to the new document. Because the node no longer belongs to the old document, it will be unlinked from its old location first.</para> <para><emphasis>NOTE:</emphasis> Don't try to use adoptNode() to import subtrees that contain entity references - even if the entity reference is the root node of the subtree. This will cause serious problems to your program. This is a limitation of libxml2 and not of XML::LibXML itself.</para> </listitem> </varlistentry> <varlistentry> <term>externalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $dtd = $doc->externalSubset;</funcsynopsisinfo> </funcsynopsis> <para>If a document has an external subset defined it will be returned by this function.</para> <para><emphasis>NOTE</emphasis> Dtd nodes are no ordinary nodes in libxml2. The support for these nodes in XML::LibXML is still limited. In particular one may not want to use common node functions on doctype declaration nodes!</para> </listitem> </varlistentry> <varlistentry> <term>internalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $dtd = $doc->internalSubset;</funcsynopsisinfo> </funcsynopsis> <para>If a document has an internal subset defined it will be returned by this function.</para> <para><emphasis>NOTE</emphasis> Dtd nodes are no ordinary nodes in libxml2. The support for these nodes in XML::LibXML is still limited. 
In particular one may not want use common node function on doctype declaration nodes!</para> </listitem> </varlistentry> <varlistentry> <term>setExternalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc->setExternalSubset($dtd);</funcsynopsisinfo> </funcsynopsis> <para><emphasis>EXPERIMENTAL!</emphasis></para> <para>This method sets a DTD node as an external subset of the given document.</para> </listitem> </varlistentry> <varlistentry> <term>setInternalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$doc->setInternalSubset($dtd);</funcsynopsisinfo> </funcsynopsis> <para><emphasis>EXPERIMENTAL!</emphasis></para> <para>This method sets a DTD node as an internal subset of the given document.</para> </listitem> </varlistentry> <varlistentry> <term>removeExternalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $dtd = $doc->removeExternalSubset();</funcsynopsisinfo> </funcsynopsis> <para><emphasis>EXPERIMENTAL!</emphasis></para> <para>If a document has an external subset defined it can be removed from the document by using this function. The removed dtd node will be returned.</para> </listitem> </varlistentry> <varlistentry> <term>removeInternalSubset</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $dtd = $doc->removeInternalSubset();</funcsynopsisinfo> </funcsynopsis> <para><emphasis>EXPERIMENTAL!</emphasis></para> <para>If a document has an internal subset defined it can be removed from the document by using this function. 
The removed DTD node will be returned.</para> </listitem> </varlistentry> <varlistentry> <term>getElementsByTagName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my @nodelist = $doc->getElementsByTagName($tagname);</funcsynopsisinfo> </funcsynopsis> <para>Implements the DOM Level 2 function.</para> <para>In SCALAR context this function returns a <function>XML::LibXML::NodeList</function> object.</para> </listitem> </varlistentry> <varlistentry> <term>getElementsByTagNameNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my @nodelist = $doc->getElementsByTagNameNS($nsURI,$tagname);</funcsynopsisinfo> </funcsynopsis> <para>Implements the DOM Level 2 function.</para> <para>In SCALAR context this function returns a <function>XML::LibXML::NodeList</function> object.</para> </listitem> </varlistentry> <varlistentry> <term>getElementsByLocalName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my @nodelist = $doc->getElementsByLocalName($localname);</funcsynopsisinfo> </funcsynopsis> <para>This allows the fetching of all nodes from a given document with the given Localname.</para> <para>In SCALAR context this function returns a <function>XML::LibXML::NodeList</function> object.</para> </listitem> </varlistentry> <varlistentry> <term>getElementsById</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $node = $doc->getElementsById($id);</funcsynopsisinfo> </funcsynopsis> <para>This allows the fetching of the node at a given position in the DOM.</para> <para>Note: The Id of a node might change while manipulating the document.</para> </listitem> </varlistentry> <varlistentry> <term>indexElements</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dom->indexElements();</funcsynopsisinfo> </funcsynopsis> <para>This function causes libxml2 to stamp all elements in a document with their document position index which considerably speeds up XPath queries for large documents. 
It should only be used with static documents that won't be further changed by any DOM methods, because once a document is indexed, XPath will always prefer the index to other methods of determining the document order of nodes. XPath could therefore return improperly ordered node-lists when applied on a document that has been changed after being indexed. It is of course possible to use this method to re-index a modified document before using it with XPath again. This function is not a part of the DOM specification.</para> <para>This function returns the number of elements indexed, -1 if an error occurred, or -2 if this feature is not available in the running libxml2.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>Abstract Base Class of XML::LibXML Nodes</title> <titleabbrev>XML::LibXML::Node</titleabbrev> <para>XML::LibXML::Node defines functions that are common to all Node Types. A LibXML::Node should never be created standalone, but as an instance of a high level class such as LibXML::Element or LibXML::Text. The class itself should provide only common functionality. In XML::LibXML each node is part of either a document or a document-fragment. Because of this there is no node without a parent. This may cause confusion with "unbound" nodes.</para> <variablelist> <varlistentry> <term>nodeName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$name = $node->nodeName;</funcsynopsisinfo> </funcsynopsis> <para>Returns the node's name. This function is aware of namespaces and returns the full name of the current node (<function>prefix:localname</function>).</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setNodeName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->setNodeName( $newName );</funcsynopsisinfo> </funcsynopsis> <para>In very limited situations, it is useful to change a node's name. In the DOM specification this should throw an error. 
This function is aware of namespaces.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>isSameNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$bool = $node->isSameNode( $other_node );</funcsynopsisinfo> </funcsynopsis> <para>Returns TRUE (1) if the given nodes refer to the same node structure, otherwise FALSE (0) is returned.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>isEqual</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$bool = $node->isEqual( $other_node );</funcsynopsisinfo> </funcsynopsis> <para>Deprecated version of isSameNode().</para> <para><emphasis>NOTE</emphasis> isEqual will change behaviour to follow the DOM specification.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>nodeValue</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$content = $node->nodeValue;</funcsynopsisinfo> </funcsynopsis> <para>If the node has any content (such as stored in a <function>text node</function>) it can be requested through this function.</para> <para><emphasis>NOTE:</emphasis> Element Nodes have no content per definition. To get the text value of an Element use textContent() instead!</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>textContent</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$content = $node->textContent;</funcsynopsisinfo> </funcsynopsis> <para>This function returns the content of all text nodes in the descendants of the given node as specified in DOM.</para> </listitem> </varlistentry> <varlistentry> <term>line_number</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$lineno = $node->line_number();</funcsynopsisinfo> </funcsynopsis> <para>This function returns the line number where the tag was found during parsing. If a node is added to the document the line number is 0. 
Problems may occur if a node from one document is passed to another one.</para> <para>Note: line_number() is special to XML::LibXML and not part of the DOM specification.</para> <para>If the line_numbers flag of the parser was not activated before parsing, line_number() will always return 0.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>nodeType</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$type = $node->nodeType;</funcsynopsisinfo> </funcsynopsis> <para>Returns the node's type. The possible types are described in the libxml2 <emphasis>tree.h</emphasis> documentation. The return value of this function is a numeric value. Therefore it differs from the result of Perl's ref function.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>unbindNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->unbindNode()</funcsynopsisinfo> </funcsynopsis> <para>Unbinds the Node from its siblings and Parent, but not from the Document it belongs to. If the node is not inserted into the DOM afterwards it will be lost after the program terminates. 
From a low level view, the unbound node is stripped from the context it is in and inserted into a (hidden) document-fragment.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>removeChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$childnode = $node->removeChild( $childnode )</funcsynopsisinfo> </funcsynopsis> <para>This will unbind the Child Node from its parent <function>$node</function>. The function returns the unbound node. If <function>$childnode</function> is not a child of the given Node the function will fail.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>replaceChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$oldnode = $node->replaceChild( $newNode, $oldNode )</funcsynopsisinfo> </funcsynopsis> <para>Replaces the <function>$oldNode</function> with the <function>$newNode</function>. The <function>$oldNode</function> will be unbound from the Node. This function differs from the DOM L2 specification: if the new node is not part of the document, the node will be imported first.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>replaceNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->replaceNode($newNode);</funcsynopsisinfo> </funcsynopsis> <para>This function is very similar to replaceChild(), but it replaces the node itself rather than a childnode. This is useful if a node found by any XPath function should be replaced.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>appendChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$childnode = $node->appendChild( $childnode );</funcsynopsisinfo> </funcsynopsis> <para>The function will add the <function>$childnode</function> to the end of <function>$node</function>'s children. The function should fail if the new childnode is already a child of <function>$node</function>. 
This function differs from the DOM L2 specification: if the new node is not part of the document, the node will be imported first.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>addChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$childnode = $node->addChild( $childnode );</funcsynopsisinfo> </funcsynopsis> <para>As an alternative to appendChild() one can use the addChild() function. This function is a bit faster, because it avoids all DOM conformity checks. Therefore this function is quite useful if one builds XML documents in memory where the order and ownership (<function>ownerDocument</function>) is assured.</para> <para>addChild() uses libxml2's own xmlAddChild() function. Thus it has to be used with extra care: If a text node is added to a node and the node itself or its last childnode is also a text node, the node to add will be merged with the one already available. The current node will be removed from memory after this action. Because perl is not aware of this action, the perl instance is still available. XML::LibXML will catch the loss of a node and refuse to run any function called on that node.</para> <programlisting> my $t1 = $doc->createTextNode( "foo" ); my $t2 = $doc->createTextNode( "bar" ); $t1->addChild( $t2 ); # is ok my $val = $t2->nodeValue(); # will fail, script dies</programlisting> <para>Also addChild() will not check if the added node belongs to the same document as the node it will be added to. 
This could lead to inconsistent documents and in worse cases even to memory violations, if one does not keep track of this issue.</para> <para>Although this sounds like a lot of trouble, addChild() is useful if a document is built from a stream, such as happens sometimes in SAX handlers or filters.</para> <para>If you are not sure about the source of your nodes, you had better stay with appendChild(), because this function is more user friendly in the sense of being more error tolerant.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>addNewChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node = $parent->addNewChild( $nsURI, $name );</funcsynopsisinfo> </funcsynopsis> <para>Similar to <function>addChild()</function>, this function uses low level libxml2 functionality to provide a faster interface for DOM building. <emphasis>addNewChild()</emphasis> uses <function>xmlNewChild()</function> to create a new node on a given parent element.</para> <para>addNewChild() has two parameters $nsURI and $name, where $nsURI is an (optional) namespace URI. $name is the fully qualified element name; addNewChild() will determine the correct prefix if necessary.</para> <para>The function returns the newly created node.</para> <para>This function is very useful for DOM building, where a created node can be directly associated with its parent. 
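</para> <para>A minimal sketch of building a small tree with addNewChild(), assuming no namespace is wanted (so <function>undef</function> is passed for $nsURI). The element names <function>list</function> and <function>item</function> are purely illustrative.</para>
<programlisting>
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root = $doc->createElement( 'list' );
$doc->setDocumentElement( $root );

# Each call creates the element and attaches it to $root in one step.
for my $label ( 'first', 'second' ) {
    my $item = $root->addNewChild( undef, 'item' );   # undef: no namespace URI
    $item->appendText( $label );
}

print $doc->toString( 1 );
</programlisting>
<para>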
<emphasis>NOTE</emphasis> this function is not part of the DOM specification and its use will limit your code to XML::LibXML.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>addSibling</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->addSibling($newNode);</funcsynopsisinfo> </funcsynopsis> <para>addSibling() allows adding an additional node to the end of a nodelist, defined by the given node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>cloneNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$newnode = $node->cloneNode( $deep )</funcsynopsisinfo> </funcsynopsis> <para><emphasis>cloneNode</emphasis> creates a copy of <function>$node</function>. When $deep is set to 1 (true) the function will copy all childnodes as well. If $deep is 0 only the current node will be copied.</para> <para><emphasis>cloneNode</emphasis> will not copy any namespace information if it is not run recursively.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>parentNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$parentnode = $node->parentNode;</funcsynopsisinfo> </funcsynopsis> <para>Returns the parent node of the current node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>nextSibling</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$nextnode = $node->nextSibling()</funcsynopsisinfo> </funcsynopsis> <para>Returns the next sibling, if any.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>previousSibling</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$prevnode = $node->previousSibling()</funcsynopsisinfo> </funcsynopsis> <para>Analogous to <emphasis>nextSibling</emphasis>, the function returns the previous sibling, if any.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>hasChildNodes</term> <listitem> <funcsynopsis> 
<funcsynopsisinfo>$boolean = $node->hasChildNodes();</funcsynopsisinfo> </funcsynopsis> <para>If the current node has Childnodes this function returns TRUE (1), otherwise it returns FALSE (0, not undef).</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>firstChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$childnode = $node->firstChild;</funcsynopsisinfo> </funcsynopsis> <para>If a node has childnodes this function will return the first node in the childlist.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>lastChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$childnode = $node->lastChild;</funcsynopsisinfo> </funcsynopsis> <para>If the <function>$node</function> has childnodes this function returns the last child node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>ownerDocument</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$documentnode = $node->ownerDocument;</funcsynopsisinfo> </funcsynopsis> <para>Through this function it is always possible to access the document the current node is bound to.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getOwner</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node = $node->getOwner;</funcsynopsisinfo> </funcsynopsis> <para>This function returns the node the current node is associated with. In most cases this will be a document node or a document fragment node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setOwnerDocument</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->setOwnerDocument( $doc );</funcsynopsisinfo> </funcsynopsis> <para>This function binds a node to another DOM. This method unbinds the node first, if it is already bound to another document.</para> <para>This function is the counterpart of XML::LibXML::Document's adoptNode() function, called on the node instead of on the document. 
Because of this it has the same limitations with Entity References as adoptNode().</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>insertBefore</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->insertBefore( $newNode, $refNode )</funcsynopsisinfo> </funcsynopsis> <para>The method inserts <function>$newNode</function> before <function>$refNode</function>. If <function>$refNode</function> is undefined, the newNode will be set as the new last child of the parent node. This function differs from the DOM L2 specification, in the case, if the new node is not part of the document, the node will be imported first, automatically.</para> <para>$refNode has to be passed to the function even if it is undefined:</para> <programlisting> $node->insertBefore( $newNode, undef ); # the same as $node->appendChild( $newNode ); $node->insertBefore( $newNode ); # wrong</programlisting> <para>Note, that the reference node has to be a direct child of the node the function is called on. Also, $newChild is not allowed to be an ancestor of the new parent node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>insertAfter</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->insertAfter( $newNode, $refNode )</funcsynopsisinfo> </funcsynopsis> <para>The method inserts <function>$newNode</function> after <function>$refNode</function>. If <function>$refNode</function> is undefined, the newNode will be set as the new last child of the parent node.</para> <para>Note, that $refNode has to be passed explicitly even if it is undef.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>findnodes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@nodes = $node->findnodes( $xpath_statement );</funcsynopsisinfo> </funcsynopsis> <para><emphasis>findnodes</emphasis> performs the xpath statement on the current node and returns the result as an array. 
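</para> <para>A minimal sketch of calling findnodes() in both contexts. The small in-memory document and the XPath expression <function>./item</function> are illustrative assumptions.</para>
<programlisting>
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root = $doc->createElement( 'list' );
$doc->setDocumentElement( $root );
for my $label ( 'a', 'b', 'c' ) {
    my $item = $doc->createElement( 'item' );
    $item->appendText( $label );
    $root->appendChild( $item );
}

# List context: an ordinary Perl array of nodes.
my @items = $root->findnodes( './item' );
print scalar( @items ), " nodes in list context\n";

# Scalar context: an XML::LibXML::NodeList object.
my $list = $root->findnodes( './item' );
print $list->size, " nodes in the NodeList\n";
</programlisting>
<para>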
In scalar context returns a <function>XML::LibXML::NodeList</function> object.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>find</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$result = $node->find( $xpath );</funcsynopsisinfo> </funcsynopsis> <para><emphasis>find</emphasis> performs the xpath expression using the current node as the context of the expression, and returns the result depending on what type of result the XPath expression had. For example, the XPath "1 * 3 + 52" results in a <function>XML::LibXML::Number</function> object being returned. Other expressions might return a <function>XML::LibXML::Boolean</function> object, or a <function>XML::LibXML::Literal</function> object (a string). Each of those objects uses Perl's overload feature to "do the right thing" in different contexts.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>findvalue</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print $node->findvalue( $xpath );</funcsynopsisinfo> </funcsynopsis> <para><emphasis>findvalue</emphasis> is exactly equivalent to:</para> <programlisting> $node->find( $xpath )->to_literal; </programlisting> <para>That is, it returns the literal value of the results. This enables you to ensure that you get a string back from your search, allowing certain shortcuts. This could be used as the equivalent of XSLT's <xsl:value-of select="some_xpath"/>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>childNodes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@childnodes = $node->childNodes;</funcsynopsisinfo> </funcsynopsis> <para><emphasis>getChildnodes</emphasis> implements a more intuitive interface to the childnodes of the current node. It enables you to pass all children directly to a <function>map</function> or <function>grep</function>. 
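</para> <para>A minimal sketch of feeding childNodes into <function>grep</function> and <function>map</function> to collect the names of all element children. The document content is an illustrative assumption; the class test via <function>isa</function> is just one way to filter by node type.</para>
<programlisting>
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root = $doc->createElement( 'root' );
$doc->setDocumentElement( $root );
$root->appendChild( $doc->createElement( $_ ) ) for qw( alpha beta );
$root->appendChild( $doc->createComment( 'not an element' ) );

# List context: the children go straight into grep and map.
my @names = map  { $_->nodeName }
            grep { $_->isa( 'XML::LibXML::Element' ) }
            $root->childNodes;

print join( ', ', @names ), "\n";
</programlisting>
<para>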
If this function is called in scalar context, a <function>XML::LibXML::NodeList</function> object will be returned.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>toString</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$xmlstring = $node->toString($format,$docencoding);</funcsynopsisinfo> </funcsynopsis> <para>This is the equivalent to <function>XML::LibXML::Document::toString</function> for a single node. This means a node and all its childnodes will be dumped into the result string.</para> <para>In addition to the $format flag of XML::LibXML::Document, this version accepts the optional $docencoding flag. If this flag is set this function returns the string in its original encoding (the encoding of the document) rather than UTF-8.</para> </listitem> </varlistentry> <varlistentry> <term>toStringC14N</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$c14nstring = $node->toStringC14N($with_comments, $xpath_expression);</funcsynopsisinfo> </funcsynopsis> <para>The function is similar to toString(). Instead of simply serializing the document tree, it transforms it as specified in the XML-C14N Specification. Such a transformation is known as canonicalization.</para> <para>If $with_comments is 0 or not defined, the result-document will not contain any comments that exist in the original document. To include comments in the canonicalized document, $with_comments has to be set to 1.</para> <para>The parameter $xpath_expression defines the nodeset of nodes that should be visible in the resulting document. This can be used to filter out some nodes. Note that only the nodes that are part of the nodeset will be included in the result-document. 
Their child-nodes will not exist in the resulting document, unless they are part of the nodeset defined by the xpath expression.</para> <para>If $xpath_expression is omitted or empty, toStringC14N() will include all nodes in the given sub-tree.</para> <para>No serializing flags will be recognized by this function!</para> </listitem> </varlistentry> <varlistentry> <term>serialize</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$str = $doc->serialize($format); </funcsynopsisinfo> </funcsynopsis> <para>Alternative form of toString(). This function name was added to conform more closely to libxml2's examples.</para> </listitem> </varlistentry> <varlistentry> <term>serialize_c14n</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$c14nstr = $doc->serialize_c14n($comment_flag,$xpath); </funcsynopsisinfo> </funcsynopsis> <para>Alternative form of toStringC14N().</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>localname</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$localname = $node->localname;</funcsynopsisinfo> </funcsynopsis> <para>Returns the local name of a tag. This is the part after the colon.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>prefix</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$nameprefix = $node->prefix;</funcsynopsisinfo> </funcsynopsis> <para>Returns the prefix of a tag. 
This is the part before the colon.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>namespaceURI</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$uri = $node->namespaceURI()</funcsynopsisinfo> </funcsynopsis> <para>returns the URI of the current namespace.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>hasAttributes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$boolean = $node->hasAttributes();</funcsynopsisinfo> </funcsynopsis> <para>returns 1 (TRUE) if the current node has any attributes set, otherwise 0 (FALSE) is returned.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>attributes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@attributelist = $node->attributes();</funcsynopsisinfo> </funcsynopsis> <para>This function returns all attributes and namespace declarations assigned to the given node.</para> <para>Because XML::LibXML does not implement namespace declarations and attributes the same way, it is required to test what kind of node is handled while accessing the functions result.</para> <para>If this function is called in array context the attribute nodes are returned as an array. 
In scalar context the function will return an <function>XML::LibXML::NamedNodeMap</function> object.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>lookupNamespaceURI</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$URI = $node->lookupNamespaceURI( $prefix );</funcsynopsisinfo> </funcsynopsis> <para>Find a namespace URI by its prefix starting at the current node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>lookupNamespacePrefix</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$prefix = $node->lookupNamespacePrefix( $URI );</funcsynopsisinfo> </funcsynopsis> <para>Find a namespace prefix by its URI starting at the current node.</para> <para><emphasis>NOTE</emphasis> Only the namespace URIs are meant to be unique. The prefix is only meaningful within a document. Also, the document might have more than a single prefix defined for a namespace.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>iterator</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$iter = $node->iterator;</funcsynopsisinfo> </funcsynopsis> <para>This function has been deprecated since XML::LibXML 1.54. It is only a dummy function that will be removed entirely in one of the next versions.</para> <para>To use iterator functions, use the XML::LibXML::Iterator module available on CPAN.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>normalize</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->normalize;</funcsynopsisinfo> </funcsynopsis> <para>This function normalizes adjacent text nodes. 
This function is not as strict as libxml2's xmlTextMerge() function, since it will not free a node that is still referenced by the Perl layer.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getNamespaces</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@nslist = $node->getNamespaces;</funcsynopsisinfo> </funcsynopsis> <para>If a node has any namespaces defined, this function will return these namespaces. Note that this will not return all namespaces that are in scope, but only the ones declared explicitly for that node.</para> <para>Although getNamespaces is available for all nodes, it only makes sense if used with element nodes.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>removeChildNodes</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->removeChildNodes();</funcsynopsisinfo> </funcsynopsis> <para>This function is not specified for any DOM level: it removes all child nodes from a node in a single step. Unlike the libxml2 function itself (xmlFreeNodeList), this function will not immediately remove the nodes from memory. 
This avoids memory violations if there are nodes still referred to from the Perl level.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML Class for Element Nodes</title> <titleabbrev>XML::LibXML::Element</titleabbrev> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node = XML::LibXML::Element->new( $name )</funcsynopsisinfo> </funcsynopsis> <para>This function creates a new node unbound to any DOM.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setAttribute</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->setAttribute( $aname, $avalue );</funcsynopsisinfo> </funcsynopsis> <para>This method sets or replaces the node's attribute <function>$aname</function> to the value <function>$avalue</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setAttributeNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->setAttributeNS( $nsURI, $aname, $avalue );</funcsynopsisinfo> </funcsynopsis> <para>Namespace version of <function>setAttribute</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getAttribute</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$avalue = $node->getAttribute( $aname );</funcsynopsisinfo> </funcsynopsis> <para>If <function>$node</function> has an attribute with the name <function>$aname</function>, the value of this attribute will be returned.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getAttributeNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$avalue = $node->getAttributeNS( $nsURI, $aname );</funcsynopsisinfo> </funcsynopsis> <para>Namespace version of <function>getAttribute</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getAttributeNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$attrnode = 
$node->getAttributeNode( $aname );</funcsynopsisinfo> </funcsynopsis> <para>Returns the attribute as a node if the attribute exists. If the attribute does not exist, <function>undef</function> will be returned.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getAttributeNodeNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$attrnode = $node->getAttributeNodeNS( $namespaceURI, $aname );</funcsynopsisinfo> </funcsynopsis> <para>Namespace version of <function>getAttributeNode</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>removeAttribute</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->removeAttribute( $aname );</funcsynopsisinfo> </funcsynopsis> <para>The method removes the attribute <function>$aname</function> from the node's attribute list, if the attribute can be found.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>removeAttributeNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->removeAttributeNS( $nsURI, $aname );</funcsynopsisinfo> </funcsynopsis> <para>Namespace version of <function>removeAttribute</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>hasAttribute</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$boolean = $node->hasAttribute( $aname );</funcsynopsisinfo> </funcsynopsis> <para>This function tests if the named attribute is set for the node. 
If the attribute is specified, TRUE (1) will be returned, otherwise the return value is FALSE (0).</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>hasAttributeNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$boolean = $node->hasAttributeNS( $nsURI, $aname );</funcsynopsisinfo> </funcsynopsis> <para>Namespace version of <function>hasAttribute</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getChildrenByTagName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@nodes = $node->getChildrenByTagName($tagname);</funcsynopsisinfo> </funcsynopsis> <para>The function gives direct access to all child nodes of the current node with the given tag name. It makes things a lot easier if you need to handle big datasets.</para> <para>If this function is called in SCALAR context, it returns the number of elements found.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getChildrenByTagNameNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@nodes = $node->getChildrenByTagNameNS($nsURI,$tagname);</funcsynopsisinfo> </funcsynopsis> <para>Namespace version of <function>getChildrenByTagName</function>.</para> <para>If this function is called in SCALAR context, it returns the number of elements found.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getElementsByTagName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@nodes = $node->getElementsByTagName($tagname);</funcsynopsisinfo> </funcsynopsis> <para>This function is part of the DOM spec; it fetches all descendants of a node with a given tagname. 
If you are as confused by <function>tagname</function> as I was: tagname is a qualified tag name, which, when namespaces are in use, consists of the prefix and the local name.</para> <para>In SCALAR context this function returns an <function>XML::LibXML::NodeList</function> object.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getElementsByTagNameNS</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@nodes = $node->getElementsByTagNameNS($nsURI,$localname);</funcsynopsisinfo> </funcsynopsis> <para>Namespace version of <function>getElementsByTagName</function> as found in the DOM spec.</para> <para>In SCALAR context this function returns an <function>XML::LibXML::NodeList</function> object.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getElementsByLocalName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>@nodes = $node->getElementsByLocalName($localname);</funcsynopsisinfo> </funcsynopsis> <para>This function is not found in the DOM specification. It is a mix of getElementsByTagName and getElementsByTagNameNS. It will fetch all tags matching the given local name. This allows one to select tags with the same local name across namespace borders.</para> <para>In SCALAR context this function returns an <function>XML::LibXML::NodeList</function> object.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>appendWellBalancedChunk</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->appendWellBalancedChunk( $chunk )</funcsynopsisinfo> </funcsynopsis> <para>Sometimes it is necessary to append an XML tree that is encoded as a string to a node. <emphasis>appendWellBalancedChunk</emphasis> will do the trick for you. But this is only done if the string is <function>well-balanced</function>.</para> <para><emphasis>Note that appendWellBalancedChunk() is only left for compatibility reasons</emphasis>. 
Implicitly it uses</para> <programlisting>
my $fragment = $parser->parse_xml_chunk( $chunk );
$node->appendChild( $fragment );</programlisting> <para>This form is more explicit and makes it easier to control the flow of a script.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>appendText</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->appendText( $PCDATA );</funcsynopsisinfo> </funcsynopsis> <para>Alias for appendTextNode().</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>appendTextNode</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->appendTextNode( $PCDATA );</funcsynopsisinfo> </funcsynopsis> <para>This wrapper function lets you add a string directly to an element node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>appendTextChild</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->appendTextChild( $childname , $PCDATA )</funcsynopsisinfo> </funcsynopsis> <para>Somewhat similar to <function>appendTextNode</function>: it lets you append an element that contains only a <function>text node</function>, directly by specifying the element name and the text content.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setNamespace</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node->setNamespace( $nsURI , $nsPrefix, $activate )</funcsynopsisinfo> </funcsynopsis> <para>setNamespace() allows one to apply a namespace to an element. The function takes three parameters: the namespace URI, which is required, and two optional values: the prefix, which is the namespace prefix as it should be used in child elements or attributes, and the activate parameter.</para> <para>The activate parameter is most useful: if this parameter is set to FALSE (0), the namespace is simply added to the namespace list of the node, while the element's namespace itself is not altered. 
Nevertheless, activate is set to TRUE (1) by default. In this case the namespace is automatically used as the node's effective namespace. This means the namespace prefix is added to the node name, and if there was a namespace already active for the node, it will be replaced (but not removed from the global namespace list).</para> <para>The following example may clarify this:</para> <programlisting>
my $e1 = $doc->createElement("bar");
$e1->setNamespace("http://foobar.org", "foo");</programlisting> <para>results in</para> <programlisting> <foo:bar xmlns:foo="http://foobar.org"/></programlisting> <para>while</para> <programlisting>
my $e2 = $doc->createElement("bar");
$e2->setNamespace("http://foobar.org", "foo", 0);</programlisting> <para>results only in</para> <programlisting> <bar xmlns:foo="http://foobar.org"/></programlisting> <para>By using $activate == 0 it is possible to apply multiple namespace declarations to a single element.</para> <para>Alternatively you can simply call setAttribute() to declare a new namespace for a node, without activating it:</para> <programlisting> $e2->setAttribute( "xmlns:foo", "http://bar.org" );</programlisting> <para>has the same result as</para> <programlisting> $e2->setNamespace( "http://bar.org", "foo", 0 );</programlisting> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML Class for Text Nodes</title> <titleabbrev>XML::LibXML::Text</titleabbrev> <para>Unlike the DOM specification, XML::LibXML implements the text node as the base class of all character data nodes. Therefore there is no CharacterData class. This allows one to use all methods that are available for text nodes for comments and CDATA sections as well.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text = XML::LibXML::Text->new( $content ); </funcsynopsisinfo> </funcsynopsis> <para>The constructor of the class. 
It creates an unbound text node.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>data</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$nodedata = $text->data;</funcsynopsisinfo> </funcsynopsis> <para>Although the <function>nodeValue</function> attribute exists in the Node class, the DOM specification defines data as a separate attribute. <function>XML::LibXML</function> implements these two attributes not as different attributes, but as aliases, as <function>libxml2</function> does. Therefore</para> <programlisting> $text->data;</programlisting> <para>and</para> <programlisting> $text->nodeValue;</programlisting> <para>will have the same result and are not different entities.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setData($string)</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->setData( $text_content );</funcsynopsisinfo> </funcsynopsis> <para>This function sets or replaces the text content of a node. The node has to be of the type "text", "cdata" or "comment".</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>substringData($offset,$length)</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->substringData($offset, $length);</funcsynopsisinfo> </funcsynopsis> <para>Extracts a range of data from the node. (DOM Spec) This function takes the two parameters $offset and $length and returns the substring, if available.</para> <para>If the node contains no data or $offset refers to a nonexistent string index, this function will return <emphasis>undef</emphasis>. 
If $length is out of range, <function>substringData</function> will return the data starting at $offset instead of causing an error.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>appendData($string)</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->appendData( $somedata );</funcsynopsisinfo> </funcsynopsis> <para>Appends a string to the end of the existing data. If the current text node contains no data, this function has the same effect as <function>setData</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>insertData($offset,$string)</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->insertData($offset, $string);</funcsynopsisinfo> </funcsynopsis> <para>Inserts the parameter $string at the given $offset of the existing data of the node. This operation will not remove existing data, but change the order of the existing data.</para> <para>The $offset has to be a positive value. If $offset is out of range, <function>insertData</function> will have the same behaviour as <function>appendData</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>deleteData($offset, $length)</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->deleteData($offset, $length);</funcsynopsisinfo> </funcsynopsis> <para>This method removes a chunk from the existing node data at the given offset. The $length parameter specifies how many characters should be removed from the string.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>deleteDataString($string, [$all])</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->deleteDataString($remstring, $all);</funcsynopsisinfo> </funcsynopsis> <para>This method removes a chunk from the existing node data. 
Since the DOM spec is quite unhandy if you already know <function>which</function> string to remove from a text node, this method allows more perlish code :)</para> <para>The function takes two parameters: <emphasis>$string</emphasis> and, optionally, the <emphasis>$all</emphasis> flag. If $all is not set, <emphasis>undef</emphasis> or <emphasis>0</emphasis>, <function>deleteDataString</function> will remove only the first occurrence of $string. If $all is <emphasis>TRUE</emphasis>, <function>deleteDataString</function> will remove all occurrences of <emphasis>$string</emphasis> from the node data.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>replaceData($offset, $length, $string)</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->replaceData($offset, $length, $string);</funcsynopsisinfo> </funcsynopsis> <para>The DOM style version to replace node data.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>replaceDataString($oldstring, $newstring, [$all])</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->replaceDataString($old, $new, $flag);</funcsynopsisinfo> </funcsynopsis> <para>The more programmer friendly version of replaceData() :)</para> <para>Instead of giving offsets and length one can specify the exact string (<emphasis>$oldstring</emphasis>) to be replaced. Additionally, the <emphasis>$all</emphasis> flag allows replacing all occurrences of <emphasis>$oldstring</emphasis>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>replaceDataRegEx( $search_cond, $replace_cond, $reflags )</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$text->replaceDataRegEx( $search_cond, $replace_cond, $reflags );</funcsynopsisinfo> </funcsynopsis> <para>This method replaces the node's data by a <function>simple</function> regular expression. 
Optionally, this function accepts flags that are added to the substitution statement.</para> <para><emphasis>NOTE:</emphasis> This is a shortcut for</para> <programlisting>
my $datastr = $node->getData();
$datastr =~ s/somecond/replacement/g; # 'g' is just an example for any flag
$node->setData( $datastr );</programlisting> <para>This function can make things easier to read for simple replacements. For more complex variants it is recommended to use the code snippet above.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML Comment Class</title> <titleabbrev>XML::LibXML::Comment</titleabbrev> <para>This class provides all functions of <emphasis>XML::LibXML::Text</emphasis>, but for comment nodes. This can be done, since only the output of the node types is different, but not the data structure. :-)</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node = XML::LibXML::Comment->new( $content );</funcsynopsisinfo> </funcsynopsis> <para>The constructor is the only provided function for this package. It is required, because <emphasis>libxml2</emphasis> treats text nodes and comment nodes slightly differently.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML Class for CDATA Sections</title> <titleabbrev>XML::LibXML::CDATASection</titleabbrev> <para>This class provides all functions of <emphasis>XML::LibXML::Text</emphasis>, but for CDATA nodes.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node = XML::LibXML::CDATASection->new( $content );</funcsynopsisinfo> </funcsynopsis> <para>The constructor is the only provided function for this package. 
It is required, because <emphasis>libxml2</emphasis> treats the different text node types slightly differently.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML Attribute Class</title> <titleabbrev>XML::LibXML::Attr</titleabbrev> <para>This is the interface to handle attributes like ordinary nodes. The naming of the class follows the W3C DOM documentation.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$attr = XML::LibXML::Attr->new($name [,$value]);</funcsynopsisinfo> </funcsynopsis> <para>Class constructor. If you need to work with ISO-encoded strings, you should <emphasis>always</emphasis> use the <function>createAttribute</function> method of <emphasis>XML::LibXML::Document</emphasis>.</para> </listitem> </varlistentry> <varlistentry> <term>getValue</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$string = $attr->getValue();</funcsynopsisinfo> </funcsynopsis> <para>Returns the value stored for the attribute. If undef is returned, the attribute has no value, which is different from being <function>not specified</function>.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>value</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$value = $attr->value;</funcsynopsisinfo> </funcsynopsis> <para>Alias for <emphasis>getValue()</emphasis></para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setValue</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$attr->setValue( $string );</funcsynopsisinfo> </funcsynopsis> <para>This is needed to set a new attribute value. 
If ISO-encoded strings are passed as parameter, the node has to be bound to a document, otherwise the encoding might be done incorrectly.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>getOwnerElement</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$node = $attr->getOwnerElement();</funcsynopsisinfo> </funcsynopsis> <para>Returns the node the attribute belongs to. If the attribute is not bound to a node, undef will be returned. Overriding the underlying implementation, the <emphasis>parentNode</emphasis> function will return undef, instead of the owner element.</para> </listitem> </varlistentry> </variablelist> <variablelist> <varlistentry> <term>setNamespace</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$attr->setNamespace($nsURI, $prefix);</funcsynopsisinfo> </funcsynopsis> <para>This function activates a namespace for the given attribute. If the namespace was not previously declared in the context of the attribute, this function will be silently ignored. In this case you may wish to call setNamespace() on the ownerElement.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML's DOM L2 Document Fragment Implementation</title> <titleabbrev>XML::LibXML::DocumentFragment</titleabbrev> <para>This class is a helper class as described in the DOM Level 2 Specification. It is implemented as a node without name. All adding, inserting or replacing functions are aware of document fragments now.</para> <para>In addition, <emphasis>all</emphasis> unbound nodes (all nodes that do not belong to any document subtree) are implicit members of document fragments.</para> </chapter> <chapter> <title>XML::LibXML Namespace Implementation</title> <titleabbrev>XML::LibXML::Namespace</titleabbrev> <para>Namespace nodes are returned by both $element->findnodes('namespace::foo') and $node->getNamespaces().</para> <para>The namespace node API is not part of any current DOM API, and so it is quite minimal. 
It should be noted that namespace nodes are <emphasis>not</emphasis> a subclass of XML::LibXML::Node; however, namespace nodes act a lot like attribute nodes, and similarly named methods will return what you would expect if you treated the namespace node as an attribute.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>my $ns = XML::LibXML::Namespace->new($nsURI);</funcsynopsisinfo> </funcsynopsis> <para>Creates a new Namespace node. Note that this is not a 'node' in the way an attribute or an element node is. Therefore you cannot call all XML::LibXML::Node functions. All functions available for this node are listed below.</para> <para>Optionally you can pass the prefix to the namespace constructor. If this second parameter is omitted you will create a so-called default namespace. Note that the newly created namespace is not bound to any document or node, therefore you should not expect it to be available in an existing document.</para> </listitem> </varlistentry> <varlistentry> <term>getName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print $ns->getName()</funcsynopsisinfo> </funcsynopsis> <para>Returns "xmlns:prefix", where prefix is the prefix for this namespace.</para> </listitem> </varlistentry> <varlistentry> <term>name</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print $ns->name()</funcsynopsisinfo> </funcsynopsis> <para>Alias for getName()</para> </listitem> </varlistentry> <varlistentry> <term>prefix</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print $ns->prefix()</funcsynopsisinfo> </funcsynopsis> <para>Returns the prefix bound to this namespace declaration.</para> </listitem> </varlistentry> <varlistentry> <term>getLocalName</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$localname = $ns->getLocalName()</funcsynopsisinfo> </funcsynopsis> <para>Alias for prefix()</para> </listitem> </varlistentry> <varlistentry> <term>getData</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print 
$ns->getData()</funcsynopsisinfo> </funcsynopsis> <para>Returns the URI of the namespace.</para> </listitem> </varlistentry> <varlistentry> <term>getValue</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print $ns->getValue()</funcsynopsisinfo> </funcsynopsis> <para>Alias for getData()</para> </listitem> </varlistentry> <varlistentry> <term>value</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print $ns->value()</funcsynopsisinfo> </funcsynopsis> <para>Alias for getData()</para> </listitem> </varlistentry> <varlistentry> <term>uri</term> <listitem> <funcsynopsis> <funcsynopsisinfo>print $ns->uri()</funcsynopsisinfo> </funcsynopsis> <para>Alias for getData()</para> </listitem> </varlistentry> <varlistentry> <term>getNamespaceURI</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$known_uri = $ns->getNamespaceURI()</funcsynopsisinfo> </funcsynopsis> <para>Returns the string "http://www.w3.org/2000/xmlns/"</para> </listitem> </varlistentry> <varlistentry> <term>getPrefix</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$known_prefix = $ns->getPrefix()</funcsynopsisinfo> </funcsynopsis> <para>Returns the string "xmlns"</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML Processing Instructions</title> <titleabbrev>XML::LibXML::PI</titleabbrev> <para>Processing instructions are implemented in XML::LibXML with read and write access. The PI data is the PI without the PI target (as specified in XML 1.0 [17]) as a string. This string can be accessed with getData as implemented in XML::LibXML::Node.</para> <para>The write access is aware of the fact that many processing instructions have attribute-like data. Therefore setData() provides, besides the interface that conforms to the DOM spec, a way to pass a set of named parameters. 
So the code segment</para> <programlisting>my $pi = $dom->createProcessingInstruction("abc");
$pi->setData(foo=>'bar', foobar=>'foobar');
$dom->appendChild( $pi );</programlisting> <para>will result in the following PI in the DOM:</para> <programlisting><?abc foo="bar" foobar="foobar"?></programlisting> <para>This is how it is specified in the DOM specification. This three-step interface temporarily creates a node in Perl space. This can be avoided by using the insertProcessingInstruction() method. Instead of the three calls described above, the call</para> <programlisting>$dom->insertProcessingInstruction("abc",'foo="bar" foobar="foobar"');</programlisting> <para>will have the same result as above.</para> <para>XML::LibXML::PI's implementation of setData() differs a bit from the standard version as available in XML::LibXML::Node():</para> <variablelist> <varlistentry> <term>setData</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$pinode->setData( $data_string );
$pinode->setData( name=>string_value [...] );</funcsynopsisinfo> </funcsynopsis> <para>This method allows one to change the content data of a PI. In addition to the interface specified for DOM Level 2, the method provides a named parameter interface to set the data. This parameter list is converted into a string before it is appended to the PI.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML::LibXML DTD Handling</title> <titleabbrev>XML::LibXML::Dtd</titleabbrev> <para>This class holds a DTD. 
You may parse a DTD from either a string, or from an external SYSTEM identifier.</para> <para>No support is available as yet for parsing from a filehandle.</para> <para>XML::LibXML::Dtd is a subclass of Node, so all the methods available to nodes (particularly toString()) are available to Dtd objects.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dtd = XML::LibXML::Dtd->new($public_id, $system_id)</funcsynopsisinfo> </funcsynopsis> <para>Parse a DTD from the system identifier, and return a DTD object that you can pass to $doc->is_valid() or $doc->validate().</para> <programlisting>
my $dtd = XML::LibXML::Dtd->new(
    "SOME // Public / ID / 1.0",
    "test.dtd"
);
my $doc = XML::LibXML->new->parse_file("test.xml");
$doc->validate($dtd);</programlisting> </listitem> </varlistentry> <varlistentry> <term>parse_string</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$dtd = XML::LibXML::Dtd->parse_string($dtd_str)</funcsynopsisinfo> </funcsynopsis> <para>The same as new() above, except you can parse a DTD from a string.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>RelaxNG Schema Validation</title> <titleabbrev>XML::LibXML::RelaxNG</titleabbrev> <para>The XML::LibXML::RelaxNG class is a tiny frontend to libxml2's RelaxNG implementation. Currently it supports only schema parsing and document validation.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$rngschema = XML::LibXML::RelaxNG->new( location => $filename_or_url );
$rngschema = XML::LibXML::RelaxNG->new( string => $xmlschemastring );
$rngschema = XML::LibXML::RelaxNG->new( DOM => $doc );</funcsynopsisinfo> </funcsynopsis> <para>The constructor of XML::LibXML::RelaxNG may be called with one of three parameters. The parameter tells the class from which source it should generate a validation schema. 
It is important that each schema has only a single source.</para> <para>The location parameter allows parsing a schema from the filesystem or a URL.</para> <para>The string parameter will parse the schema from the given XML string.</para> <para>The DOM parameter allows parsing the schema from a preparsed XML::LibXML::Document.</para> <para>Note that the constructor will die() if the schema does not meet the constraints of the RelaxNG specification.</para> </listitem> </varlistentry> <varlistentry> <term>validate</term> <listitem> <funcsynopsis> <funcsynopsisinfo>eval { $rngschema->validate( $doc ); };</funcsynopsisinfo> </funcsynopsis> <para>This function validates a document against the given RelaxNG schema. If this function succeeds, it will return 0, otherwise it will die() and report the errors found. Because of this, validate() should always be called inside an eval block.</para> </listitem> </varlistentry> </variablelist> </chapter> <chapter> <title>XML Schema Validation</title> <titleabbrev>XML::LibXML::Schema</titleabbrev> <para>The XML::LibXML::Schema class is a tiny frontend to libxml2's XML Schema implementation. Currently it supports only schema parsing and document validation.</para> <variablelist> <varlistentry> <term>new</term> <listitem> <funcsynopsis> <funcsynopsisinfo>$xmlschema = XML::LibXML::Schema->new( location => $filename_or_url );
$xmlschema = XML::LibXML::Schema->new( string => $xmlschemastring );</funcsynopsisinfo> </funcsynopsis> <para>The constructor of XML::LibXML::Schema may be called with one of two parameters. The parameter tells the class from which source it should generate a validation schema. 
It is important that each schema has only a single source.</para> <para>The location parameter allows parsing a schema from the filesystem or a URL.</para> <para>The string parameter will parse the schema from the given XML string.</para> <para>Note that the constructor will die() if the schema does not meet the constraints of the XML Schema specification.</para> </listitem> </varlistentry> <varlistentry> <term>validate</term> <listitem> <funcsynopsis> <funcsynopsisinfo>eval { $xmlschema->validate( $doc ); };</funcsynopsisinfo> </funcsynopsis> <para>This function validates a document against the given XML Schema. If this function succeeds, it will return 0, otherwise it will die() and report the errors found. Because of this, validate() should always be called inside an eval block.</para> </listitem> </varlistentry> </variablelist> </chapter> </book>