Provided by: libbiblio-isis-perl_0.24-1.5_all

NAME

       Biblio::Isis - Read CDS/ISIS, WinISIS and IsisMarc databases

SYNOPSIS

         use Biblio::Isis;

         my $isis = Biblio::Isis->new(
               isisdb => './cds/cds',
         );

         for(my $mfn = 1; $mfn <= $isis->count; $mfn++) {
               print $isis->to_ascii($mfn),"\n";
         }

DESCRIPTION

       This module reads ISIS databases created by DOS CDS/ISIS, WinIsis or IsisMarc. It can
       be used as a Perl-only alternative to the OpenIsis module, which seems to have deprecated
       its old "XS" bindings for Perl.

       It can create hash values from data in an ISIS database (using "to_hash"), an ASCII dump
       (using "to_ascii") or just a hash with field names and packed values (like
       "^asomething^belse").

       A unique feature of this module is the ability to include deleted records (see
       "include_deleted").  It also skips zero-sized fields (OpenIsis has a bug in its XS
       bindings, so zero-sized fields are filled with random junk from memory).

       It also supports identifiers (only if the ISIS database was created by IsisMarc); see
       "to_hash".

       This module will always be slower than the OpenIsis module, which uses a C library.
       However, since it is written in Perl, it is platform independent (so you don't need a C
       compiler) and can be easily modified. I hope that it creates data structures which are
       easier to use than the ones created by OpenIsis, so time saved in other parts of the code
       should compensate for the slower performance of this module (the speed of reading an ISIS
       database is rarely an issue).

METHODS

   new
       Open ISIS database

        my $isis = Biblio::Isis->new(
               isisdb => './cds/cds',
               read_fdt => 1,
               include_deleted => 1,
               hash_filter => sub {
                       my ($v,$field_number) = @_;
                       $v =~ s#foo#bar#g;
                       return $v;      # hand back the filtered value
               },
               debug => 1,
               join_subfields_with => ' ; ',
        );

       Options are described below:

       isisdb
            Full or relative path to the ISIS database files, given as the common prefix of the
            ".MST" and ".XRF" files and, optionally, the ".FDT" file (if the "read_fdt" option is
            used).

            In this example it uses "./cds/cds.MST" and related files.

       read_fdt
            Boolean flag specifying whether the field definition table should be read. It is off
            by default.

       include_deleted
            Don't skip logically deleted records in ISIS.

       hash_filter
            Code reference used to filter data before it is converted to a hash. It receives two
            arguments: the whole line of the current field (in $_[0]) and the field number (in
            $_[1]).

       debug
            Dump a lot of debugging output, even at level 1. Increase the level for even more
            output.

       join_subfields_with
            Define the delimiter used to join repeatable subfields. This option is included to
            support legacy applications written against versions of this module older than 0.21.
            It is disabled by default. See "to_hash".

       ignore_empty_subfields
            Remove all empty subfields while reading from ISIS file.
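
       The "hash_filter" option above can be sketched without a database. The stand-alone loop
       below is only an illustration: it applies a filter to a hand-written record in the shape
       returned by "fetch", and it assumes that the value returned by the filter is the one that
       gets stored.

```perl
use strict;
use warnings;

# A filter in the shape described above: value in $_[0], field number in $_[1].
my $hash_filter = sub {
    my ($v, $field_number) = @_;
    $v =~ s/\s+$//;                               # trim trailing whitespace in every field
    $v =~ s/foo/bar/g if $field_number == 990;    # fix-up applied to field 990 only
    return $v;    # assumption: the returned value replaces the original
};

# Hand-written sample data, not read from any real database.
my $rec = { '210' => [ '^afoo  ' ], '990' => [ 'foo', 'HAY ' ] };

# Apply the filter to every repetition of every field.
my %filtered;
for my $tag (keys %$rec) {
    $filtered{$tag} = [ map { $hash_filter->($_, $tag) } @{ $rec->{$tag} } ];
}

print "$_: @{ $filtered{$_} }\n" for sort keys %filtered;
```

       Passing the field number lets one filter treat fields differently, as the 990-only
       substitution shows.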

   count
       Return number of records in database

         print $isis->count;

   fetch
       Read record with selected MFN

         my $rec = $isis->fetch(55);

       Returns a hash whose keys are field names and whose values are the unpacked values for
       that field, like this:

         $rec = {
           '210' => [ '^aNew York^cNew York University press^dcop. 1988' ],
           '990' => [ '2140', '88', 'HAY' ],
         };
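
       The values in such a record still carry their subfield delimiters. As a minimal sketch
       (the helper "split_subfields" is hypothetical and not part of this module), a packed value
       can be split into code/value pairs like this, assuming every subfield starts with "^"
       followed by a one-character code:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Biblio::Isis: split one packed value
# into [code, value] pairs, keeping the original order.
sub split_subfields {
    my ($packed) = @_;
    my @pairs;
    # each subfield is '^', a one-character code, then text up to the next '^'
    while ($packed =~ /\^(.)([^\^]*)/g) {
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

# Sample record copied from the example above.
my $rec = {
    '210' => [ '^aNew York^cNew York University press^dcop. 1988' ],
    '990' => [ '2140', '88', 'HAY' ],
};

for my $pair (split_subfields($rec->{'210'}[0])) {
    printf "%s = %s\n", @$pair;
}
```

       Values without any "^" (such as those in field 990) yield no pairs and can be used as
       they are.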

   mfn
       Returns current MFN position

         my $mfn = $isis->mfn;

   to_ascii
       Returns ASCII output of record with specified MFN

         print $isis->to_ascii(42);

       This outputs something like this:

         210   ^aNew York^cNew York University press^dcop. 1988
         990   2140
         990   88
         990   HAY

       If "read_fdt" was specified when calling "new", field names from the ".FDT" file are
       displayed instead of numeric tags.
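
       The relation between the "fetch" structure and this dump can be sketched as follows. The
       helper "dump_record" is hypothetical, and both the tab separator and the numeric tag
       ordering are assumptions made for the illustration, not a description of the module's
       internals:

```perl
use strict;
use warnings;

# Hypothetical helper: render a fetch()-style hash as one line per
# field repetition, "tag<TAB>value".
sub dump_record {
    my ($rec) = @_;
    my $out = '';
    for my $tag (sort { $a <=> $b } keys %$rec) {    # numeric tag order (assumed)
        $out .= "$tag\t$_\n" for @{ $rec->{$tag} };
    }
    return $out;
}

# Sample record copied from the "fetch" example.
my $rec = {
    '210' => [ '^aNew York^cNew York University press^dcop. 1988' ],
    '990' => [ '2140', '88', 'HAY' ],
};

print dump_record($rec);
```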

   to_hash
       Read record with specified MFN and convert it to hash

         my $hash = $isis->to_hash($mfn);

       It can convert characters from the ISIS database (using "hash_filter") before creating
       the structures, which enables character re-mapping or quick fix-up of data.

       This function returns a hash like this:

         $hash = {
           '210' => [
                      {
                        'c' => 'New York University press',
                        'a' => 'New York',
                        'd' => 'cop. 1988'
                      }
                    ],
           '990' => [
                      '2140',
                      '88',
                      'HAY'
                    ],
         };

       You can later use that hash to produce any output from ISIS data.

       If the database was created using IsisMarc, the hash will also have two special subfields
       used for identifiers, "i1" and "i2", like this:

         '200' => [
                    {
                      'i1' => '1',
                      'i2' => ' ',
                      'a' => 'Goa',
                      'f' => 'Valdo D\'Arienzo',
                      'e' => 'tipografie e tipografi nel XVI secolo',
                    }
                  ],

       If there are repeatable subfields in the record, this creates the following structure:

         '900' => [ {
               'a' => [ 'foo', 'bar', 'baz' ],
         }]

       Or, in the more complex example of

         902   ^aa1^aa2^aa3^bb1^aa4^bb2^cc1^aa5

       it will create

         902   => [
               { a => ["a1", "a2", "a3", "a4", "a5"], b => ["b1", "b2"], c => "c1" },
         ],
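
       The mapping from the packed 902 line to this structure can be sketched in a few lines.
       This is an illustrative re-implementation under the assumption that a subfield stays a
       plain scalar until it repeats; "subfields_to_hash" is a hypothetical name, not a method of
       this module:

```perl
use strict;
use warnings;

# Hypothetical helper: turn one packed field value into a hash of subfields,
# promoting a subfield to an array reference once it occurs a second time.
sub subfields_to_hash {
    my ($packed) = @_;
    my %val;
    while ($packed =~ /\^(.)([^\^]*)/g) {
        my ($code, $value) = ($1, $2);
        if (!exists $val{$code}) {
            $val{$code} = $value;                     # first occurrence: scalar
        } elsif (ref $val{$code} eq 'ARRAY') {
            push @{ $val{$code} }, $value;            # third and later occurrences
        } else {
            $val{$code} = [ $val{$code}, $value ];    # second occurrence: promote
        }
    }
    return \%val;
}

my $h = subfields_to_hash('^aa1^aa2^aa3^bb1^aa4^bb2^cc1^aa5');
print "a: @{ $h->{a} }\n";
print "b: @{ $h->{b} }\n";
print "c: $h->{c}\n";
```

       Code that consumes such a hash has to check "ref" on each subfield, since the same
       subfield may be a scalar in one record and an array in another.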

       This behaviour can be changed using the "join_subfields_with" option to "new", in which
       case "to_hash" will always create a single value for each subfield, joining repeated
       values with the given delimiter.

       This method also creates an additional field 000 containing the MFN.
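
       As a rough sketch of the "join_subfields_with" behaviour, the 902 example can be
       collapsed to one string per subfield. The helper name is hypothetical, and the ' ; '
       delimiter is simply the one used in the "new" example; the exact output of the module may
       differ in detail:

```perl
use strict;
use warnings;

# Hypothetical helper: like the array form, but repeated subfields are
# concatenated with a delimiter instead of being collected in an array.
sub subfields_joined {
    my ($packed, $delimiter) = @_;
    my %val;
    while ($packed =~ /\^(.)([^\^]*)/g) {
        my ($code, $value) = ($1, $2);
        $val{$code} = exists $val{$code}
            ? $val{$code} . $delimiter . $value    # append to what is already there
            : $value;                              # first occurrence
    }
    return \%val;
}

my $h = subfields_joined('^aa1^aa2^aa3^bb1^aa4^bb2^cc1^aa5', ' ; ');
print "$_ => $h->{$_}\n" for sort keys %$h;
```

       With this form every subfield is a plain string, which is convenient for older code that
       does not expect array references.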

       There is also a more elaborate way to call "to_hash", like this:

         my $hash = $isis->to_hash({
               mfn => 42,
               include_subfields => 1,
         });

       Each option controls how the hash is created:

       mfn Specify the MFN of the record

       include_subfields
           This option creates an additional key in the hash called "subfields", which holds the
           original subfield order of the record and the index of each subfield, like this:

             902   => [ {
                   a => ["a1", "a2", "a3", "a4", "a5"],
                   b => ["b1", "b2"],
                   c => "c1",
                   subfields => ["a", 0, "a", 1, "a", 2, "b", 0, "a", 3, "b", 1, "c", 0, "a", 4],
             } ],

       join_subfields_with
           Define the delimiter used to join repeatable subfields. You can specify the option
           here instead of in "new" if you want per-record control.

       hash_filter
           You can override "hash_filter" defined in "new" using this option.
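
       The "subfields" key produced by "include_subfields" is a flat list of code/index pairs.
       As a sketch of how it might be consumed (the "repack" helper is hypothetical), the pairs
       can be walked to rebuild the packed line in its original order:

```perl
use strict;
use warnings;

# Field structure copied from the include_subfields example above.
my $field = {
    a => [ "a1", "a2", "a3", "a4", "a5" ],
    b => [ "b1", "b2" ],
    c => "c1",
    subfields => [ "a", 0, "a", 1, "a", 2, "b", 0, "a", 3, "b", 1, "c", 0, "a", 4 ],
};

# Hypothetical helper: rebuild the packed value from the (code, index) pairs.
sub repack {
    my ($field) = @_;
    my @sf = @{ $field->{subfields} };    # work on a copy, splice is destructive
    my $packed = '';
    while (my ($code, $index) = splice(@sf, 0, 2)) {
        my $v = $field->{$code};
        # repeated subfields are arrays, single ones are plain scalars
        $packed .= '^' . $code . (ref $v eq 'ARRAY' ? $v->[$index] : $v);
    }
    return $packed;
}

print repack($field), "\n";    # prints ^aa1^aa2^aa3^bb1^aa4^bb2^cc1^aa5
```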

   tag_name
       Return name of selected tag

        print $isis->tag_name('200');

   read_cnt
       Read content of ".CNT" file and return hash containing it.

         use Data::Dumper;
         print Dumper($isis->read_cnt);

       This function is not used by the module (".CNT" files are not required for it to work),
       but it can be useful for examining your index (while debugging, for example).

   unpack_cnt
       Unpack one of the two 26-byte fixed-length records in the ".CNT" file.

       Here is definition of record:

        off key        description                             size
         0: IDTYPE     BTree type                              s
         2: ORDN       Nodes Order                             s
         4: ORDF       Leafs Order                             s
         6: N          Number of Memory buffers for nodes      s
         8: K          Number of buffers for first level index s
        10: LIV        Current number of Index Levels          s
        12: POSRX      Pointer to Root Record in N0x           l
        16: NMAXPOS    Next Available position in N0x          l
        20: FMAXPOS    Next available position in L0x          l
        24: ABNORMAL   Formal BTree normality indicator        s
        length: 26 bytes

       This stores the resulting hash in the $self object under the "cnt" key. It is used by
       "read_cnt".
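
       The record layout above maps directly onto a Perl "unpack" template: six 16-bit values,
       three 32-bit values and one more 16-bit value. The sketch below is not the module's code;
       it assumes little-endian byte order (the "<" modifiers, which need Perl 5.10 or later) and
       round-trips made-up values instead of reading a real ".CNT" file:

```perl
use strict;
use warnings;

# Keys in the order given by the record definition.
my @keys = qw(IDTYPE ORDN ORDF N K LIV POSRX NMAXPOS FMAXPOS ABNORMAL);

# 6 shorts (12 bytes) + 3 longs (12 bytes) + 1 short (2 bytes) = 26 bytes.
# Little-endian byte order is an assumption.
my $template = 's<6 l<3 s<';

# Hypothetical helper: unpack one 26-byte record into a hash reference.
sub unpack_cnt_record {
    my ($buf) = @_;
    die "need 26 bytes, got " . length($buf) . "\n" unless length($buf) == 26;
    my %cnt;
    @cnt{@keys} = unpack($template, $buf);
    return \%cnt;
}

# Round trip on made-up values (not taken from a real index).
my $buf = pack($template, 1, 5, 5, 15, 5, 2, 1, 10, 20, 0);
my $cnt = unpack_cnt_record($buf);
printf "%-8s %d\n", $_, $cnt->{$_} for @keys;
```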

BUGS

       Some parts of the CDS/ISIS documentation are not detailed enough to explain some
       variations in the input databases which have been tested with this module.  When I was in
       doubt, I assumed that OpenIsis's implementation was right (except for obvious bugs).

       However, every effort has been made to test this module with as many databases (and
       programs that create them) as possible.

       I would be very grateful for success or failure reports about usage of this module with
       databases from programs other than WinIsis and IsisMarc. I have tested this against the
       output of one "isis.dll"-based application, but I don't know any details about its version.

VERSIONS

       As this is a young module, new features are added in subsequent versions. It's a good
       idea to specify the version when using this module, like this:

         use Biblio::Isis 0.23;

       Below is list of changes in specific version of module (so you can target older versions
       if you really have to):

       0.24    Added "ignore_empty_subfields"

       0.23    Added "hash_filter" to "to_hash"

               Fixed bug with documented "join_subfields_with" in "new" which wasn't implemented

       0.22    Added field number when calling "hash_filter"

       0.21    Added "join_subfields_with" to "new" and "to_hash".

               Added "include_subfields" to "to_hash".

       0.20    Added "$isis->mfn", support for repeatable subfields and "$isis->to_hash({ mfn =>
               42, ... })" calling convention

AUTHOR

               Dobrica Pavlinusic
               CPAN ID: DPAVLIN
               dpavlin@rot13.org
               http://www.rot13.org/~dpavlin/

       This module is based heavily on code from the "LIBISIS.PHP" library to read ISIS files,
       V0.1.1, written in PHP, (c) 2000 Franck Martin <franck@sopac.org> and released under LGPL.

       This program is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.

       The full text of the license can be found in the LICENSE file included with this module.

SEE ALSO

       Biblio::Isis::Manual for CDS/ISIS manual appendices F, G and H, which describe the file
       format

       OpenIsis web site <http://www.openisis.org>

       perl4lib site <http://perl4lib.perl.org>