Provided by: libxml-easy-perl_0.009-3build1_amd64

NAME

       XML::Easy::Text - XML parsing and serialisation

SYNOPSIS

               use XML::Easy::Text qw(
                       xml10_read_content_object xml10_read_element
                       xml10_read_document xml10_read_extparsedent_object
               );

               $content = xml10_read_content_object($text);
               $element = xml10_read_element($text);
               $element = xml10_read_document($text);
               $content = xml10_read_extparsedent_object($text);

               use XML::Easy::Text qw(
                       xml10_write_content xml10_write_element
                       xml10_write_document xml10_write_extparsedent
               );

               $text = xml10_write_content($content);
               $text = xml10_write_element($element);
               $text = xml10_write_document($element, "UTF-8");
               $text = xml10_write_extparsedent($content, "UTF-8");

DESCRIPTION

       This module supplies functions that parse and serialise XML data according to the XML 1.0
       specification.

       This module is oriented towards the use of XML to represent data for interchange purposes,
       rather than the use of XML as markup of principally textual data.  It does not perform any
       schema processing, and does not interpret DTDs or any other kind of schema.  It adheres
       strictly to the XML specification, in all its awkward details, except for the
       aforementioned DTDs.

       XML data in memory is represented using a tree of XML::Easy::Content and
       XML::Easy::Element objects.  Such a tree encapsulates all the structure and data content
       of an XML element or document, without any irrelevant detail resulting from the textual
       syntax.  These node trees are readily manipulated by the functions in
       XML::Easy::NodeBasics.

       The functions of this module are implemented in C for performance, with a pure Perl backup
       version (which has good performance compared to other pure Perl parsers) for systems that
       can't handle XS modules.

FUNCTIONS

       All functions "die" on error.
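
       Because errors are signalled with "die", a caller that wants to survive bad input has to
       trap the exception.  The following is a minimal sketch using a plain "eval" block; the
       sample text and variable names are illustrative only, and the wording of the error
       message is not specified here, so the sketch merely reports whatever it receives.

```perl
use strict;
use warnings;
use XML::Easy::Text qw(xml10_read_document);

# Deliberately ill-formed sample: the end-tag does not match the start-tag.
my $bad_text = "<greeting>hello</salutation>";

# eval traps the exception raised by the parser; $@ then holds the message.
my $root = eval { xml10_read_document($bad_text) };
my $error = $@;

if(!defined $root) {
	print "parse failed: $error";
}
```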

   Parsing
       These functions take textual XML and extract the abstract XML content.  In the terminology
       of the XML specification, they constitute a non-validating processor: they check for well-
       formedness of the XML, but not for adherence of the content to any schema.

       The inputs (to be parsed) for these functions are always character strings.  XML text is
       frequently encoded using UTF-8, or some other Unicode encoding, so that it can contain
       characters from the full Unicode repertoire.  In that case, something must perform UTF-8
       decoding (or decoding of some other character encoding) to convert the octets of a file to
       the characters on which these functions operate.  A Perl I/O layer can do the job (see
       perlio), or it can be performed explicitly using the "decode" function in the Encode
       module.
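
       As a sketch of the explicit route, the octets can be passed through "decode" before
       parsing.  The sample octet string below is made up for illustration, and the
       "content_twine" accessor belongs to XML::Easy::Element rather than to this module.

```perl
use strict;
use warnings;
use Encode qw(decode);
use XML::Easy::Text qw(xml10_read_document);

# Octets as they might arrive from a UTF-8 file: U+00E9 is stored as
# the two-octet sequence C3 A9.
my $octets = "<name>Ren\xc3\xa9e</name>";

# Decode to characters first.  FB_CROAK makes malformed UTF-8 die
# rather than being silently replaced.
my $text = decode("UTF-8", $octets, Encode::FB_CROAK);

my $root = xml10_read_document($text);

# An element containing only character data has a one-member twine.
my $name = $root->content_twine->[0];
printf "%d characters\n", length($name);
```

       Parsing the undecoded octets would not fail in this case, but each octet would be
       treated as a separate character, so the name would come out six characters long
       instead of five.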

       xml10_read_content_object(TEXT)
           TEXT must be a character string.  It is parsed against the content production of the
           XML 1.0 grammar; i.e., as a sequence of the kind of matter that can appear between the
           start-tag and end-tag of an element.  Returns a reference to an XML::Easy::Content
           object.

           Normally one would not want to use this function directly, but prefer the higher-level
           "xml10_read_document" function.  This function exists for the construction of custom
           XML parsers in situations that don't match the full XML grammar.

       xml10_read_content_twine(TEXT)
           Performs the same parsing job as "xml10_read_content_object", but returns the
           resulting content chunk in the form of twine (see "Twine" in XML::Easy::NodeBasics)
           rather than a content object.

           The returned array must not be subsequently modified.  If possible, it will be marked
           as read-only in order to prevent modification.
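
           A short sketch of walking the returned twine follows.  It relies on the twine
           layout described in XML::Easy::NodeBasics (an array with an odd number of members,
           alternating character strings and element references), and on the "type_name"
           method of XML::Easy::Element; the sample input is illustrative.

```perl
use strict;
use warnings;
use XML::Easy::Text qw(xml10_read_content_twine);

my $twine = xml10_read_content_twine("one<br/>two<b>bold</b>three");

# Even indices hold character data; odd indices hold element objects.
for my $i (0 .. $#$twine) {
	if($i % 2 == 0) {
		print "text: '$twine->[$i]'\n";
	} else {
		print "element: ", $twine->[$i]->type_name, "\n";
	}
}
```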

       xml10_read_content(TEXT)
           Deprecated alias for "xml10_read_content_twine".

       xml10_read_element(TEXT)
           TEXT must be a character string.  It is parsed against the element production of the
           XML 1.0 grammar; i.e., as an item bracketed by tags and containing content that may
           recursively include other elements.  Returns a reference to an XML::Easy::Element
           object.

           Normally one would not want to use this function directly, but prefer the higher-level
           "xml10_read_document" function.  This function exists for the construction of custom
           XML parsers in situations that don't match the full XML grammar.

       xml10_read_document(TEXT)
           TEXT must be a character string.  It is parsed against the document production of the
           XML 1.0 grammar; i.e., as a root element (possibly containing subelements) optionally
           preceded and followed by non-content matter, possibly headed by an XML declaration.
           (A document type declaration is not accepted; this module does not process schemata.)
           Returns a reference to an XML::Easy::Element object which represents the root element.
           Nothing is returned relating to the XML declaration or other non-content matter.

           This is the most likely function to use to process incoming XML data.  Beware that the
           encoding declaration in the XML declaration, if any, does not affect the
           interpretation of the input as a sequence of characters.
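
           The point about the encoding declaration can be illustrated with a small sketch.
           The input below declares ISO-8859-1 yet contains U+20AC, a character that encoding
           cannot represent; since the input is already a character string, the declaration
           is checked for syntax only.  The sample document is made up, and the accessor
           methods used come from XML::Easy::Element.

```perl
use strict;
use warnings;
use XML::Easy::Text qw(xml10_read_document);

# A character string, not octets.  The declared encoding is not
# applied to it.
my $text = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n}.
	qq{<price currency="EUR">\x{20ac}5</price>\n};

my $root = xml10_read_document($text);

print $root->type_name, "\n";
print $root->attribute("currency"), "\n";

# The first character of the content is still U+20AC.
my $value = $root->content_twine->[0];
printf "U+%04X\n", ord($value);
```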

       xml10_read_extparsedent_object(TEXT)
           TEXT must be a character string.  It is parsed against the extParsedEnt production of
           the XML 1.0 grammar; i.e., as a sequence of content (containing character data and
           subelements), possibly headed by a text declaration (which is similar to, but not the
           same as, an XML declaration).  Returns a reference to an XML::Easy::Content object.

           This is a relatively obscure part of the XML grammar, used when a subpart of a
           document is stored in a separate file.  You're more likely to require the
           "xml10_read_document" function.

       xml10_read_extparsedent_twine(TEXT)
           Performs the same parsing job as "xml10_read_extparsedent_object", but returns the
           resulting content chunk in the form of twine (see "Twine" in XML::Easy::NodeBasics)
           rather than a content object.

           The returned array must not be subsequently modified.  If possible, it will be marked
           as read-only in order to prevent modification.
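
           As an illustrative sketch, an external parsed entity may hold several sibling
           elements with no single root, which "xml10_read_document" would reject.  The
           sample text is made up; the twine layout assumed is the one described in
           XML::Easy::NodeBasics.

```perl
use strict;
use warnings;
use XML::Easy::Text qw(xml10_read_extparsedent_twine);

# A text declaration followed by two sibling elements and no root.
my $text = qq{<?xml version="1.0" encoding="UTF-8"?>}.
	qq{<item>a</item><item>b</item>};

my $twine = xml10_read_extparsedent_twine($text);

# Element references are the only non-string members of a twine.
my @items = grep { ref $_ } @$twine;
print scalar(@items), " elements\n";
```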

       xml10_read_extparsedent(TEXT)
           Deprecated alias for "xml10_read_extparsedent_twine".

   Serialisation
       These functions take abstract XML data and serialise it as textual XML.  They do not
       perform indentation, default attribute suppression, or any other schema-dependent
       processing.

       The outputs of these functions are always character strings.  XML text is frequently
       encoded using UTF-8, or some other Unicode encoding, so that it can contain characters
       from the full Unicode repertoire.  In that case, something must perform UTF-8 encoding (or
       encoding in some other character encoding) to convert the characters generated by these
       functions to the octets of a file.  A Perl I/O layer can do the job (see perlio), or it
       can be performed explicitly using the "encode" function in the Encode module.
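
       The I/O layer route might look like the following sketch.  It writes to an in-memory
       filehandle so that the resulting octets can be inspected; a real program would
       normally open a file instead.  The element is built with the XML::Easy::Element
       constructor, and the data is illustrative.

```perl
use strict;
use warnings;
use XML::Easy::Element;
use XML::Easy::Text qw(xml10_write_element);

# A small element whose content includes a non-ASCII character.
my $root = XML::Easy::Element->new("name", {}, ["Ren\x{e9}e"]);

# The serialiser returns characters, not octets.
my $text = xml10_write_element($root);

# The :encoding layer converts characters to UTF-8 octets on output.
my $buffer = "";
open(my $fh, ">:encoding(UTF-8)", \$buffer) or die "open failed: $!";
print $fh $text;
close($fh) or die "close failed: $!";

printf "%d characters became %d octets\n", length($text), length($buffer);
```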

       xml10_write_content(CONTENT)
           CONTENT must be a reference to either an XML::Easy::Content object or a twine array
           (see "Twine" in XML::Easy::NodeBasics).  The XML 1.0 textual representation of that
           content is returned.
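
           A minimal sketch of serialising a hand-built twine follows.  The element comes
           from the XML::Easy::Element constructor and the data is illustrative.  Reading
           the text back with "xml10_read_content_twine" recovers the original strings,
           which shows that special characters such as "&" are escaped on output.

```perl
use strict;
use warnings;
use XML::Easy::Element;
use XML::Easy::Text qw(xml10_write_content xml10_read_content_twine);

my $item = XML::Easy::Element->new("item", { id => "1" }, ["A & B"]);

# A twine: strings at even indices, elements at odd indices.
my $text = xml10_write_content(["before ", $item, " after"]);
print $text, "\n";

# Parsing the serialised form gives back equivalent content.
my $again = xml10_read_content_twine($text);
print $again->[1]->content_twine->[0], "\n";
```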

       xml10_write_element(ELEMENT)
           ELEMENT must be a reference to an XML::Easy::Element object.  The XML 1.0 textual
           representation of that element is returned.

       xml10_write_document(ELEMENT[, ENCODING])
           ELEMENT must be a reference to an XML::Easy::Element object.  The XML 1.0 textual form
           of a document with that element as the root element is returned.  The document
           includes an XML declaration.  If ENCODING is supplied, it must be a valid character
           encoding name, and the XML declaration specifies it in an encoding declaration.  (The
           returned string consists of unencoded characters regardless of the encoding
           specified.)
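
           The effect of the optional ENCODING argument can be seen by serialising the same
           element both ways, as in this sketch.  The element and its attribute are
           illustrative; the exact layout of the XML declaration is not shown here because
           it is not specified above.

```perl
use strict;
use warnings;
use XML::Easy::Element;
use XML::Easy::Text qw(xml10_write_document xml10_read_document);

# An empty element with one attribute.
my $root = XML::Easy::Element->new("config", { version => "2" }, [""]);

my $plain = xml10_write_document($root);
my $declared = xml10_write_document($root, "UTF-8");

print $plain;
print $declared;

# Either form parses back to an equivalent root element.
my $again = xml10_read_document($declared);
print $again->attribute("version"), "\n";
```

           Both results are character strings.  Naming an encoding only changes the
           declaration; producing octets remains the caller's job.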

       xml10_write_extparsedent(CONTENT[, ENCODING])
           CONTENT must be a reference to either an XML::Easy::Content object or a twine array
           (see "Twine" in XML::Easy::NodeBasics).  The XML 1.0 textual form of an external
           parsed entity encapsulating that content is returned.  If ENCODING is supplied, it
           must be a valid character encoding name, and the returned entity includes a text
           declaration that specifies the encoding name in an encoding declaration.  (The
           returned string consists of unencoded characters regardless of the encoding
           specified.)

SEE ALSO

       XML::Easy::NodeBasics, XML::Easy::Syntax, <http://www.w3.org/TR/REC-xml/>

AUTHOR

       Andrew Main (Zefram) <zefram@fysh.org>

COPYRIGHT

       Copyright (C) 2008, 2009 PhotoBox Ltd

       Copyright (C) 2009, 2010, 2011 Andrew Main (Zefram) <zefram@fysh.org>

LICENSE

       This module is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.