Provided by: libhtml-microformats-perl_0.105-2_all bug

NAME

       HTML::Microformats - parse microformats in HTML

SYNOPSIS

        use HTML::Microformats;

        my $doc = HTML::Microformats
                    ->new_document($html, $uri)
                    ->assume_profile(qw(hCard hCalendar));
        print $doc->json(pretty => 1);

        use RDF::TrineShortcuts qw(rdf_query);
        my $results = rdf_query($sparql, $doc->model);

DESCRIPTION

       The HTML::Microformats module is a wrapper for parser and handler modules of various
       individual microformats (each of those modules has a name like
       HTML::Microformats::Format::Foo).

       The general pattern of usage is to create an HTML::Microformats object (which corresponds
       to an HTML document) using the "new_document" method; then ask for the data, as a Perl
       hashref, a JSON string, or an RDF::Trine model.

   Constructor
       "$doc = HTML::Microformats->new_document($html, $uri, %opts)"
           Constructs a document object.

           $html is the HTML or XHTML source (string) or an XML::LibXML::Document.

           $uri is the document URI, important for resolving relative URL references.

           %opts are additional parameters; currently only one option is defined: $opts{'type'}
           is set to 'text/html' or 'application/xhtml+xml', to control how $html is parsed.

   Profile Management
       HTML::Microformats uses HTML profiles (i.e. the profile attribute on the HTML <head>
       element) to detect which Microformats are used on a page. Any microformats which do not
       have a profile URI declared will not be parsed.

       Because many pages fail to properly declare which profiles they use, there are various
       profile management methods to tell HTML::Microformats to assume the presence of particular
       profile URIs, even if they're actually missing.

       "$doc->profiles"
           This method returns a list of profile URIs declared by the document.

       "$doc->has_profile(@profiles)"
           This method returns true if and only if one or more of the profile URIs in @profiles
           is declared by the document.

       "$doc->add_profile(@profiles)"
           Using "add_profile" you can add one or more profile URIs, and they are treated as if
           they were found on the document.

           For example:

            $doc->add_profile('http://microformats.org/profile/rel-tag')

           This is useful for adding profile URIs declared outside the document itself (e.g. in
           HTTP headers).

           Returns a reference to the document.

       "$doc->assume_profile(@microformats)"
           For example:

            $doc->assume_profile(qw(hCard adr geo))

           This method acts similarly to "add_profile" but allows you to use names of
           microformats rather than URIs.

           Microformat names are case sensitive, and must match HTML::Microformats::Format::Foo
           module names.

           Returns   a reference to the document.

       "$doc->assume_all_profiles"
           This method is equivalent to calling "assume_profile" for all known microformats.

           Returns   a reference to the document.

   Parsing Microformats
       Generally speaking, you can skip this. The "data", "json" and "model" methods will
       automatically do this for you.

       "$doc->parse_microformats"
           Scans through the document, finding microformat objects.

           On subsequent calls, does nothing (as everything is already parsed).

           Returns   a reference to the document.

       "$doc->clear_microformats"
           Forgets information gleaned by "parse_microformats" and thus allows
           "parse_microformats" to be run again. This is useful if you've modified added some
           profiles between runs of "parse_microformats".

           Returns   a reference to the document.

   Retrieving Data
       These methods allow you to retrieve the document's data, and do things with it.

       "$doc->objects($format);"
           $format is, for example, 'hCard', 'adr' or 'RelTag'.

           Returns a list of objects of that type. (If called in scalar context, returns an
           arrayref.)

           Each object is, for example, an HTML::Microformat::hCard object, or an
           HTML::Microformat::RelTag object, etc. See the relevent documentation for details.

       "$doc->all_objects"
           Returns a hashref of data. Each hashref key is the name of a microformat (e.g.
           'hCard', 'RelTag', etc), and the values are arrayrefs of objects.

           Each object is, for example, an HTML::Microformat::hCard object, or an
           HTML::Microformat::RelTag object, etc. See the relevent documentation for details.

       "$doc->json(%opts)"
           Returns data roughly equivalent to the "all_objects" method, but as a JSON string.

           %opts is a hash of options, suitable for passing to the JSON module's to_json
           function. The 'convert_blessed' and 'utf8' options are enabled by default, but can be
           disabled by explicitly setting them to 0, e.g.

             print $doc->json( pretty=>1, canonical=>1, utf8=>0 );

       "$doc->model"
           Returns data as an RDF::Trine::Model, suitable for serialising as RDF or running
           SPARQL queries.

       "$object->serialise_model(as => $format)"
           As "model" but returns a string.

       "$doc->add_to_model($model)"
           Adds data to an existing RDF::Trine::Model.

           Returns a reference to the document.

   Utility Functions
       "HTML::Microformats->modules"
           Returns a list of Perl modules, each of which implements a specific microformat.

       "HTML::Microformats->formats"
           As per "modules", but strips 'HTML::Microformats::Format::' off the module name, and
           sorts alphabetically.

WHY ANOTHER MICROFORMATS MODULE?

       There already exist two microformats packages on CPAN (see Text::Microformat and
       Data::Microformat), so why create another?

       Firstly, HTML::Microformats isn't being created from scratch. It's actually a
       fork/clean-up of a non-CPAN application (Swignition), and in that sense predates
       Text::Microformat (though not Data::Microformat).

       It has a number of other features that distinguish it from the existing packages:

       •   It supports more formats.

           HTML::Microformats supports hCard, hCalendar, rel-tag, geo, adr, rel-enclosure, rel-
           license, hReview, hResume, hRecipe, xFolk, XFN, hAtom, hNews and more.

       •   It supports more patterns.

           HTML::Microformats supports the include pattern, abbr pattern, table cell header
           pattern, value excerpting and other intricacies of microformat parsing better than the
           other modules on CPAN.

       •   It offers RDF support.

           One of the key features of HTML::Microformats is that it makes data available as
           RDF::Trine models. This allows your application to benefit from a rich, feature-laden
           Semantic Web toolkit. Data gleaned from microformats can be stored in a triple store;
           output in RDF/XML or Turtle; queried using the SPARQL or RDQL query languages; and
           more.

           If you're not comfortable using RDF, HTML::Microformats also makes all its data
           available as native Perl objects.

BUGS

       Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO

       HTML::Microformats::Documentation::Notes.

       Individual format modules:

       •   HTML::Microformats::Format::adr

       •   HTML::Microformats::Format::figure

       •   HTML::Microformats::Format::geo

       •   HTML::Microformats::Format::hAtom

       •   HTML::Microformats::Format::hAudio

       •   HTML::Microformats::Format::hCalendar

       •   HTML::Microformats::Format::hCard

       •   HTML::Microformats::Format::hListing

       •   HTML::Microformats::Format::hMeasure

       •   HTML::Microformats::Format::hNews

       •   HTML::Microformats::Format::hProduct

       •   HTML::Microformats::Format::hRecipe

       •   HTML::Microformats::Format::hResume

       •   HTML::Microformats::Format::hReview

       •   HTML::Microformats::Format::hReviewAggregate

       •   HTML::Microformats::Format::OpenURL_COinS

       •   HTML::Microformats::Format::RelEnclosure

       •   HTML::Microformats::Format::RelLicense

       •   HTML::Microformats::Format::RelTag

       •   HTML::Microformats::Format::species

       •   HTML::Microformats::Format::VoteLinks

       •   HTML::Microformats::Format::XFN

       •   HTML::Microformats::Format::XMDP

       •   HTML::Microformats::Format::XOXO

       Similar modules: RDF::RDFa::Parser, HTML::HTML5::Microdata::Parser,
       XML::Atom::Microformats, Text::Microformat, Data::Microformats.

       Related web sites: <http://microformats.org/>, <http://www.perlrdf.org/>.

AUTHOR

       Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENCE

       Copyright 2008-2012 Toby Inkster

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.

DISCLAIMER OF WARRANTIES

       THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
       WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR
       PURPOSE.