bionic (3) Catmandu::Importer.3pm.gz

Provided by: libcatmandu-perl_1.0700-1_all


NAME
       Catmandu::Importer - Namespace for packages that can import


SYNOPSIS
           # From the command line

           # JSON is an importer and YAML an exporter
           $ catmandu convert JSON to YAML < data.json

           # OAI is an importer and JSON an exporter
           $ catmandu convert OAI --url to JSON

           # Fetch remote content
           $ catmandu convert JSON --file to YAML

           # From Perl

           use Catmandu;
           use Data::Dumper;

           my $importer = Catmandu->importer('JSON', file => 'data.json');

           $importer->each(sub {
               my $item = shift;
               print Dumper($item);
           });

           my $num = $importer->count;

           my $first_item = $importer->first;

           # Convert OAI to JSON in Perl
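           my $importer = Catmandu->importer('OAI', url => '');
           my $exporter = Catmandu->exporter('JSON');

           # Write every imported item to the exporter, then flush it
           $exporter->add_many($importer);
           $exporter->commit;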



DESCRIPTION
       A Catmandu::Importer is a Perl package that can generate structured data from sources such as JSON, YAML,
       XML or RDF files, network protocols such as Atom, OAI-PMH and SRU, and even DBI databases. Given a
       Catmandu::Importer, a programmer can read data from it using one of the many Catmandu::Iterable methods,
       such as each, first, count or to_array.
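
       As an illustration of reading through the Catmandu::Iterable methods, here is a minimal sketch. It
       assumes Catmandu is installed; the sample records, the temporary file and the json_importer helper are
       made up for the example and are not part of the library.

```perl
use strict;
use warnings;
use Catmandu;
use File::Temp qw(tempfile);

# Hypothetical sample data, written to a temporary file for the demo.
my ($fh, $path) = tempfile(SUFFIX => '.json', UNLINK => 1);
print {$fh} qq({"title":"first"}\n{"title":"second"}\n{"title":"third"}\n);
close $fh;

# Hypothetical helper: build a fresh importer so each call below is independent.
sub json_importer { Catmandu->importer('JSON', file => $path) }

my $count  = json_importer()->count;                 # number of items
my $first  = json_importer()->first;                 # first item (a hash ref)
my $titles = json_importer()->map(sub { $_[0]->{title} })->to_array;
my $two    = json_importer()->take(2)->to_array;     # only the first two items

print "count: $count\n";
print "first title: $first->{title}\n";
print "titles: @$titles\n";
print "taken: ", scalar(@$two), "\n";
```

       Methods such as map and take return a new iterator, so calls can be chained before the data is finally
       collected with to_array.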


       Every Catmandu::Importer is also a Catmandu::Fixable and thus inherits a 'fix' parameter that can be set
       in the constructor. When a 'fix' parameter is given, each item returned by the generator is automatically
       fixed using one or more Catmandu::Fix-es. For example:

           my $importer = Catmandu->importer('JSON', fix => ['upcase(title)']);
           $importer->each(sub {
               my $item = shift;    # every $item->{title} is now upcased
           });

           # or via a Fix file
           my $importer = Catmandu->importer('JSON', fix => ['/my/fixes.txt']);
           $importer->each(sub {
               my $item = shift;    # every $item->{title} is now upcased
           });


CONFIGURATION
       file
           Read input from a local file given by its path. If the path looks like a URL, the content will be
           fetched first and then passed to the importer. Alternatively, a scalar reference can be passed to
           read from a string.

       fh  Read input from an IO::Handle. If not specified, Catmandu::Util::io is used to create the input
           stream from the "file" argument or by using STDIN.

       encoding
           Binmode of the input stream "fh". Set to ":utf8" by default.

       fix An ARRAY of one or more Fix-es or Fix scripts to be applied to imported items.

       data_path
           The data at "data_path" is imported instead of the original data.

              # given this imported item:
              {abc => [{a=>1},{b=>2},{c=>3}]}
              # with data_path 'abc', this item gets imported instead:
              [{a=>1},{b=>2},{c=>3}]
              # with data_path 'abc.*', 3 items get imported:
              {a=>1}
              {b=>2}
              {c=>3}
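
           The selection rule can be sketched in plain Perl. This is only an illustration of the semantics
           described above, not Catmandu's own implementation; select_path is a hypothetical helper that
           handles just plain keys and the '*' wildcard.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve a dotted path against nested data.
# A '*' segment fans out over every element of an array.
sub select_path {
    my ($data, $path) = @_;
    my @nodes = ($data);
    for my $key (split /\./, $path) {
        my @next;
        for my $node (@nodes) {
            if ($key eq '*' && ref $node eq 'ARRAY') {
                push @next, @$node;
            }
            elsif (ref $node eq 'HASH' && exists $node->{$key}) {
                push @next, $node->{$key};
            }
        }
        @nodes = @next;
    }
    return @nodes;
}

my $item = {abc => [{a => 1}, {b => 2}, {c => 3}]};

my @whole = select_path($item, 'abc');      # one value: the array itself
my @each  = select_path($item, 'abc.*');    # three values: the hashes inside it

printf "abc   -> %d item(s)\n", scalar @whole;
printf "abc.* -> %d item(s)\n", scalar @each;
```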

       variables
           Variables given here will be interpolated into the "file" and "http_body" options. Placeholders are
           written as {name}, following URI template syntax.

               # named arguments
               my $importer = Catmandu->importer('JSON',
                   file => 'http://{server}/{path}',
                   variables => {server => '', path => 'file.json'},
               );

               # positional arguments
               my $importer = Catmandu->importer('JSON',
                   file => 'http://{server}/{path}',
                   variables => ',file.json',
               );

               # or
               my $importer = Catmandu->importer('JSON',
                   file => 'http://{server}/{path}',
                   variables => ['','file.json'],
               );

               # or via the command line
               $ catmandu convert JSON --file 'http://{server}/{path}' --variables ',file.json'
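
           To make the three forms of "variables" concrete, here is a simplified stand-in written in plain
           Perl. It is not the library's code: fill_template is a hypothetical helper, the host example.org
           is invented, and it assumes that positional values are matched to placeholders in order of
           appearance.

```perl
use strict;
use warnings;

# Hypothetical helper: fill {name} placeholders in a template.
# $vars may be a hash ref (named), or an array ref or comma-separated
# string (positional, matched to placeholders in order of appearance).
sub fill_template {
    my ($template, $vars) = @_;
    $vars = [split /,/, $vars] unless ref $vars;
    if (ref $vars eq 'ARRAY') {
        my @names = $template =~ /\{(\w+)\}/g;
        my %named;
        @named{@names} = @$vars;
        $vars = \%named;
    }
    # Unknown or missing variables expand to the empty string.
    (my $filled = $template) =~ s!\{(\w+)\}!$vars->{$1} // ''!ge;
    return $filled;
}

my $tmpl = 'http://{server}/{path}';

my $named      = fill_template($tmpl, {server => 'example.org', path => 'file.json'});
my $positional = fill_template($tmpl, 'example.org,file.json');
my $listed     = fill_template($tmpl, ['example.org', 'file.json']);

print "$_\n" for $named, $positional, $listed;
```

           All three calls produce the same URL, which is why the named and positional forms are
           interchangeable in the examples above.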


HTTP CONFIGURATION
       These options are only relevant if "file" is a URL. See LWP::UserAgent for details about these options.

       http_body
           Set the GET/POST message body.

       http_method
           Set the type of HTTP request: 'GET', 'POST', ...

       http_headers
           A reference to an HTTP::Headers object.

   Set your own HTTP client
       user_agent(LWP::UserAgent->new(...))
           Set your own HTTP client.

   Alternatively, set the parameters of the default client
       http_agent
           A string containing the name of the HTTP client.

       http_max_redirect
           Maximum number of HTTP redirects allowed.

       http_timeout
           Maximum execution time.

       http_verify_hostname
           Verify the SSL certificate.

       http_retry
           Maximum number of times to retry the HTTP request if it temporarily fails. Default is not to retry.
           See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry.

       http_timing
           Maximum number of times, and the timeouts, to retry the HTTP request if it temporarily fails. Default
           is not to retry. See LWP::UserAgent::Determined for the HTTP status codes that initiate a retry and
           the format of the timing value.


METHODS
   first, each, rest, ...
       See Catmandu::Iterable for all inherited methods.


CODING
       Create your own importer by creating a Perl package in the Catmandu::Importer namespace that consumes the
       "Catmandu::Importer" role. Basically, you need to create a method 'generator' which returns a callback
       that returns one Perl hash for each call, and undef when the input is exhausted:

           my $importer = Catmandu::Importer::Hello->new;
           my $next     = $importer->generator;

           $next->(); # record
           $next->(); # next record
           $next->(); # undef = end of stream
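
       The calling convention can be tried out without Catmandu at all. The sketch below is a plain Perl
       stand-in, not library code: make_generator is a hypothetical function playing the part of an
       importer's generator method, and the names are sample data.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an importer's generator method: returns a
# closure that yields one hash ref per call and undef once input runs out.
sub make_generator {
    my @names = @_;
    return sub {
        my $name = shift @names;
        return defined $name ? {hello => $name} : undef;
    };
}

my $next = make_generator('Alice', 'Bob');

# Pull records until the callback signals the end of the stream.
my @records;
while (defined(my $record = $next->())) {
    push @records, $record;
}

printf "got %d records\n", scalar @records;
```

       Because the callback keeps its own state in the closure, the caller only has to invoke it
       repeatedly and stop at the first undef.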

       Here is an example of a simple "Hello" importer:

           package Catmandu::Importer::Hello;

           use Catmandu::Sane;
           use Moo;

           with 'Catmandu::Importer';

           # Return a callback that yields one record per input line
           sub generator {
               my ($self) = @_;
               my $fh = $self->fh;
               my $n  = 0;
               return sub {
                   $self->log->debug("generating record " . ++$n);
                   my $name = $fh->getline;    # undef at end of input
                   return defined $name ? { "hello" => $name } : undef;
               };
           }

           1;

       This importer can be called via the command line as:

           $ catmandu convert Hello to JSON < /tmp/names.txt
           $ catmandu convert Hello to YAML < /tmp/names.txt
           $ catmandu import Hello to MongoDB --database_name test < /tmp/names.txt

       Or via Perl:

           use Catmandu;

           my $importer = Catmandu->importer('Hello', file => '/tmp/names.txt');
           $importer->each(sub {
               my $item = shift;    # a hash with a "hello" key
           });

SEE ALSO
       Catmandu::Iterable, Catmandu::Fix, Catmandu::Importer::CSV, Catmandu::Importer::JSON