Provided by: libcatmandu-oai-perl_0.20-1_all 

NAME
Catmandu::Importer::OAI - Package that imports OAI-PMH feeds
SYNOPSIS
# From the command line
# Harvest records
$ catmandu convert OAI --url http://myrepo.org/oai
$ catmandu convert OAI --url http://myrepo.org/oai --metadataPrefix didl --handler raw
# Harvest repository description
$ catmandu convert OAI --url http://myrepo.org/oai --identify 1
# Harvest identifiers
$ catmandu convert OAI --url http://myrepo.org/oai --listIdentifiers 1
# Harvest sets
$ catmandu convert OAI --url http://myrepo.org/oai --listSets 1
# Harvest metadataFormats
$ catmandu convert OAI --url http://myrepo.org/oai --listMetadataFormats 1
# Harvest one record
$ catmandu convert OAI --url http://myrepo.org/oai --getRecord 1 --identifier oai:myrepo:1234
DESCRIPTION
Catmandu::Importer::OAI is a Catmandu importer to harvest metadata records from an OAI-PMH endpoint.
CONFIGURATION
url
OAI-PMH Base URL.
metadataPrefix
Metadata prefix to specify the metadata format. Set to "oai_dc" by default.
handler( sub {} | $object | 'NAME' | '+NAME' )
Handler to transform each record from XML DOM (XML::LibXML::Element) into a Perl hash.
Handlers can be provided as a function reference, an instance of a Perl package that implements
'parse', or by a package NAME. A full package name must be prepended by "+"; any other NAME is
prefixed with "Catmandu::Importer::OAI::Parser::". E.g. "foobar" will create a
"Catmandu::Importer::OAI::Parser::foobar" instance.
By default the handler Catmandu::Importer::OAI::Parser::oai_dc is used for metadataPrefix "oai_dc",
Catmandu::Importer::OAI::Parser::marcxml for "marcxml", Catmandu::Importer::OAI::Parser::mods for
"mods", and Catmandu::Importer::OAI::Parser::struct for other formats. In addition there is
Catmandu::Importer::OAI::Parser::raw to return the XML as it is.
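The naming rule and the defaults described above can be sketched in plain Perl. This is only an
illustration of the documented mapping, not the module's own code; the function names
resolve_handler_class and default_handler_for are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: map a handler NAME to a package name as described above.
sub resolve_handler_class {
    my ($name) = @_;
    # "+NAME" is taken as a full package name
    return $1 if $name =~ /^\+(.+)$/;
    # any other NAME is expanded with the parser namespace
    return "Catmandu::Importer::OAI::Parser::$name";
}

# Hypothetical helper: default parser package for a metadataPrefix.
sub default_handler_for {
    my ($prefix) = @_;
    my %known = map { $_ => 1 } qw(oai_dc marcxml mods);
    # formats without a dedicated parser fall back to "struct"
    return resolve_handler_class($known{$prefix} ? $prefix : 'struct');
}

print resolve_handler_class('foobar'),      "\n";
print resolve_handler_class('+My::Parser'), "\n";
print default_handler_for('didl'),          "\n";
```

Here "My::Parser" stands for any package of your own that implements 'parse'.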
identifier
An optional identifier. Return only results for this particular identifier.
set
An optional set for selective harvesting.
from
An optional datetime value (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ) as lower bound for datestamp-based
selective harvesting.
until
An optional datetime value (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ) as upper bound for datestamp-based
selective harvesting.
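As a rough aid, the two accepted datetime shapes for "from" and "until" can be checked before
harvesting. The sketch below is an assumption about how such a check might look, not part of this
module; is_oai_datestamp is a hypothetical name, and it tests only the shape of the value, not
whether the date exists in the calendar.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $value looks like YYYY-MM-DD or
# YYYY-MM-DDThh:mm:ssZ (shape only, no calendar validation).
sub is_oai_datestamp {
    my ($value) = @_;
    return $value =~ /\A\d{4}-\d{2}-\d{2}(?:T\d{2}:\d{2}:\d{2}Z)?\z/ ? 1 : 0;
}

for my $value ('2016-01-31', '2016-01-31T12:00:00Z', '2016-01-31 12:00') {
    printf "%-22s %s\n", $value, is_oai_datestamp($value) ? 'ok' : 'rejected';
}
```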
identify
Harvest the repository description instead of all records.
getRecord
Harvest one record instead of all records.
listIdentifiers
Harvest identifiers instead of full records.
listRecords
Harvest full records. Default operation.
listSets
Harvest sets instead of records.
listMetadataFormats
Harvest metadata formats instead of records.
resumptionToken
An optional resumptionToken to start harvesting from.
dry
Don't do any HTTP requests but return URLs that data would be queried from.
strict
Optionally validate all parameters against the OAI 2 specifications before sending them to an OAI
server. Default: undef.
xslt
Preprocess XML records with XSLT script(s) given as comma separated list or array reference. Requires
Catmandu::XML.
max_retries
When an OAI request fails, the importer will retry this number of times. Set to '0' by default.
Internally an exponential backoff algorithm is used for this. After every failed request the
importer chooses a random number between 0 (included) and 2^collision (excluded), where collision is
the number of the retry, and waits that number of seconds. The total time before the importer gives
up can therefore differ from run to run:
first retry:
wait [ 0..2^1 [ seconds
second retry:
wait [ 0..2^2 [ seconds
third retry:
wait [ 0..2^3 [ seconds
..
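The wait intervals listed above can be reproduced with a few lines of Perl. This is a minimal sketch
of exponential backoff under stated assumptions, not the importer's actual implementation:
backoff_limit and backoff_wait are hypothetical names, $max_retries is set to 3 only for the
demonstration, and the sketch draws a fractional number of seconds, which may differ from what the
module does.

```perl
use strict;
use warnings;

# Hypothetical helper: exclusive upper bound, in seconds, of the wait
# before retry number $n (1 = first retry).
sub backoff_limit {
    my ($n) = @_;
    return 2**$n;
}

# Hypothetical helper: random wait in the half-open interval [0, 2^$n).
sub backoff_wait {
    my ($n) = @_;
    return rand(backoff_limit($n));
}

my $max_retries = 3;    # assumed for the demo; the documented default is 0
for my $n (1 .. $max_retries) {
    my $wait = backoff_wait($n);
    printf "retry %d: wait %.2f s (limit %d s)\n", $n, $wait, backoff_limit($n);
    # a real harvester would sleep for $wait seconds and resend the request here
}
```

Because each wait is random, only the upper bound doubles from one retry to the next; the actual
waits vary between runs.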
sleep
Sleep a number of seconds between OAI-PMH calls to the endpoint (default 0).
realm
An optional realm value. This value is used when the importer harvests from a repository which is
secured with basic authentication through Integrated Windows Authentication (NTLM or Kerberos).
username
An optional username value. This value is used when the importer harvests from a repository which is
secured with basic authentication.
password
An optional password value. This value is used when the importer harvests from a repository which is
secured with basic authentication.
METHOD
Every Catmandu::Importer is a Catmandu::Iterable; all its methods are inherited. The
Catmandu::Importer::OAI methods are not idempotent: OAI-PMH feeds can only be read once.
In addition to methods inherited from Catmandu::Iterable, this module provides the following public
methods:
handle_record( $dom )
Process an XML DOM with xslt and handler as configured and return the result.
ENVIRONMENT
If you are connected to the internet via a proxy server you need to set the address of this proxy in
your environment:
export http_proxy="http://localhost:8080"
If you are connecting to an HTTPS server and don't want to verify the validity of the peer's
certificates, you can set PERL_LWP_SSL_VERIFY_HOSTNAME to false in your environment. This may be
required to connect to broken SSL servers:
export PERL_LWP_SSL_VERIFY_HOSTNAME=0
SEE ALSO
Catmandu, Catmandu::Importer
AUTHOR
Nicolas Steenlant, "<nicolas.steenlant at ugent.be>"
CONTRIBUTOR
Patrick Hochstenbach, "<patrick.hochstenbach at ugent.be>"
Jakob Voss, "<nichtich at cpan.org>"
Nicolas Franck, "<nicolas.franck at ugent.be>"
LICENSE AND COPYRIGHT
Copyright 2016 Ghent University Library
This program is free software; you can redistribute it and/or modify it under the terms of either: the
GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.
perl v5.36.0 2023-10-26 Catmandu::Importer::OAI(3pm)