Provided by: libmarc-lint-perl_1.50-1_all

NAME

       MARC::Lint - Perl extension for checking validity of MARC records

SYNOPSIS

           use MARC::File::USMARC;
           use MARC::Lint;

           my $lint = MARC::Lint->new;
           my $filename = shift;

           my $file = MARC::File::USMARC->in( $filename );
           while ( my $marc = $file->next() ) {
               $lint->check_record( $marc );

               # Print the title tag
               print $marc->title, "\n";

               # Print the errors that were found
               print join( "\n", $lint->warnings ), "\n";
           } # while

       Given the following MARC record:

           LDR 00000nam  22002538a 4500
           040    _aMdSSJTT
                  _cMdSSJTT
           040    _aMdSSJTT
                  _beng
                  _cMdSSJTT
           100 14 _aWall, Larry.
           110 1  _aO'Reilly & Associates.
           245 90 _aProgramming Perl /
                  _aBig Book of Perl /
                  _cLarry Wall, Tom Christiansen & Jon Orwant.
           250    _a3rd ed.
           250    _a3rd ed.
           260    _aCambridge, Mass. :
                  _bO'Reilly,
                  _r2000.
           590 4  _aPersonally signed by Larry.
           856 43 _uhttp://www.perl.com/

       the following errors are generated:

           1XX: Only one 1XX tag is allowed, but I found 2 of them.
           100: Indicator 2 must be blank but it's "4"
           245: Indicator 1 must be 0 or 1 but it's "9"
           245: Subfield _a is not repeatable.
           040: Field is not repeatable.
           260: Subfield _r is not allowed.
           856: Indicator 2 must be blank, 0, 1, 2 or 8 but it's "3"

DESCRIPTION

       Module for checking validity of MARC records.  99% of users will want to do something
       like what is shown in the synopsis.  The other intrepid 1% will override the "MARC::Lint"
       module's methods and provide their own special field-level checking.

       What this means is that if you have certain requirements, such as making sure that all 952
       tags have a certain call number in them, you can write a function that checks for that,
       and still get all the benefits of the MARC::Lint framework.
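
       A minimal sketch of such a subclass follows. The package name "My::Lint", the call
       number rule, and the warning text are all hypothetical. The check is called directly
       here, because whether "check_record()" reaches a check for a locally defined tag such as
       952 may depend on the rules table in your version.

```perl
use strict;
use warnings;

use MARC::Field;
use MARC::Lint;

# Hypothetical subclass name; it inherits every standard MARC::Lint check.
package My::Lint {
    use parent -norequire, 'MARC::Lint';

    # Assumed local rule: every 952 must carry a call number in subfield a.
    sub check_952 {
        my ( $self, $field ) = @_;
        my $call_number = $field->subfield('a');
        if ( !defined $call_number || $call_number eq '' ) {
            $self->warn('952: Subfield _a (call number) is missing.');
        }
        return;
    }
}

package main;

my $lint  = My::Lint->new;
my $field = MARC::Field->new( '952', ' ', ' ', b => 'Main Library' );

# Run the custom check on one field and print whatever it collected.
$lint->check_952($field);
my @warnings = $lint->warnings;
print "$_\n" for @warnings;
```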

EXPORT

       None.  Everything is done through objects.

METHODS

   new()
       Takes no parameters.  The "MARC::Lint" object is little more than a list of warnings and a
       bunch of rules.

   warnings()
       Returns a list of warnings found by "check_record()" and its brethren.

   clear_warnings()
       Clear the list of warnings for this linter object.  It's automatically called when you
       call "check_record()".

   warn( $str [, $str...] )
       Create a warning message, built from strings passed, like a "print" statement.

       Typically, you'll leave this to "check_record()", but industrious programmers may want to
       do their own checking as well.

   check_record( $marc )
       Does all sorts of lint-like checks on the MARC record $marc, both on the record as a
       whole, and on the individual fields & subfields.

   check_xxx( $field )
       Various functions to check the different fields.  If no "check_xxx" function exists for a
       tag, no field-specific check is run for that tag.

   check_020()
       Looks at 020$a and reports errors if the check digit is wrong.  Looks at 020$z and
       validates the number if hyphens are present.

       Uses Business::ISBN to do validation. Thirteen digit checking is currently done with the
       internal sub _isbn13_check_digit(), based on code from Business::ISBN.

       TO DO (check_020):

        Fix 13-digit ISBN checking.

   _isbn13_check_digit($ean)
       Internal sub to determine if 13-digit ISBN has a valid checksum. The code is taken from
       Business::ISBN::as_ean. It is expected to be temporary until Business::ISBN is updated to
       check 13-digit ISBNs itself.
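
       For illustration, the standard EAN-13 check digit rule can be sketched in plain Perl as
       below. This is not the module's internal code; the sub names "isbn13_check_digit" and
       "isbn13_is_valid" are hypothetical stand-ins.

```perl
use strict;
use warnings;

# Returns the check digit (0-9) computed from the first 12 digits of an EAN.
# Weights alternate 1, 3, 1, 3, ... from the left (standard EAN-13 rule).
sub isbn13_check_digit {
    my ($ean) = @_;
    ( my $digits = $ean ) =~ s/[^0-9]//g;    # drop hyphens and spaces
    return unless length($digits) >= 12;     # not enough digits to check

    my @d   = split //, substr( $digits, 0, 12 );
    my $sum = 0;
    for my $i ( 0 .. 11 ) {
        $sum += $d[$i] * ( $i % 2 ? 3 : 1 );
    }
    return ( 10 - $sum % 10 ) % 10;
}

# True when the 13th digit matches the computed check digit.
sub isbn13_is_valid {
    my ($ean) = @_;
    ( my $digits = $ean ) =~ s/[^0-9]//g;
    return 0 unless length($digits) == 13;
    return isbn13_check_digit($digits) == substr( $digits, 12, 1 ) ? 1 : 0;
}

print isbn13_is_valid('978-0-596-00027-1') ? "valid\n" : "invalid\n";
```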

   check_041( $field )
       Warns if the length of a subfield is not evenly divisible by 3, unless the second
       indicator is 7. (A future implementation would ensure that each subfield is exactly 3
       characters unless ind2 is 7, since subfields are now repeatable. This is not implemented
       here due to the large number of records needing to be corrected.) Validates against the
       MARC Code List for Languages (<http://www.loc.gov/marc/>) using the MARC::Lint::CodeData
       data pack to MARC::Lint (%LanguageCodes, %ObsoleteLanguageCodes).
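
       The length test and the split into 3-character codes might look roughly like this. It is
       a sketch only: the sub name "check_language_string", the five-entry code table, and the
       warning wording are assumptions, and the real tables live in MARC::Lint::CodeData.

```perl
use strict;
use warnings;

# Tiny illustrative table; the real list lives in MARC::Lint::CodeData.
my %language_codes = map { $_ => 1 } qw( eng fre ger spa ita );

# Returns a list of warning strings for one 041 subfield value.
sub check_language_string {
    my ( $code, $data, $ind2 ) = @_;

    # ind2 = 7 means the code source is named in $2; skipped in this sketch.
    return () if $ind2 eq '7';

    if ( length($data) % 3 != 0 ) {
        return ("041: Subfield _$code must be evenly divisible by 3 ($data).");
    }

    my @warnings;
    for my $lang ( unpack '(a3)*', $data ) {    # split every 3 characters
        push @warnings, "041: Subfield _$code, $lang, is not valid."
            unless $language_codes{$lang};
    }
    return @warnings;
}

print "$_\n" for check_language_string( 'a', 'engfrexxx', ' ' );
print "$_\n" for check_language_string( 'a', 'engfr',     ' ' );
```

       Using "unpack" with a repeated group is one compact way to do the 3-character split.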

   check_043( $field )
       Warns if any subfield $a is not exactly 7 characters. Validates each code against the MARC
       code list for Geographic Areas (<http://www.loc.gov/marc/>) using the MARC::Lint::CodeData
       data pack to MARC::Lint (%GeogAreaCodes, %ObsoleteGeogAreaCodes).
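
       As a rough sketch, the two tests (length, then table lookup) could be written as follows.
       The sub name "check_geog_codes", the two-entry table, and the warning wording are
       assumptions made for illustration.

```perl
use strict;
use warnings;

# Two sample entries only; the full table is in MARC::Lint::CodeData.
my %geog_area_codes = map { $_ => 1 } qw( n-us--- e-uk--- );

# Returns a list of warning strings for the given 043$a values.
sub check_geog_codes {
    my @codes = @_;
    my @warnings;
    for my $code (@codes) {
        if ( length($code) != 7 ) {
            push @warnings,
                "043: Subfield _a must be exactly 7 characters, $code";
        }
        elsif ( !$geog_area_codes{$code} ) {
            push @warnings, "043: Subfield _a, $code, is not valid.";
        }
    }
    return @warnings;
}

print "$_\n" for check_geog_codes( 'n-us---', 'n-us', 'x-zz---' );
```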

   check_245( $field )
        -Makes sure $a exists (and is first subfield).
        -Warns if last character of field is not a period
        --Follows LCRI 1.0C, Nov. 2003 rather than MARC21 rule
        -Verifies that $c is preceded by / (space-/)
        -Verifies that initials in $c are not spaced
        -Verifies that $b is preceded by :;= (space-colon, space-semicolon, space-equals)
        -Verifies that $h is not preceded by space unless it is dash-space
        -Verifies that data of $h is enclosed in square brackets
        -Verifies that $n is preceded by . (period)
         --As part of that, looks for no-space period, or dash-space-period (for replaced ellipses)
        -Verifies that $p is preceded by , (no-space-comma) when following $n and . (period) when following other subfields.
        -Performs rudimentary article check of 245 2nd indicator vs. 1st word of 245$a (for manual verification).

        Article checking is done by internal _check_article method, which should work for 130, 240, 245, 440, 630, 730, and 830.
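
        A few of the punctuation rules listed above can be sketched in plain Perl. This is an
        illustration under stated assumptions, not the module's implementation: the sub name
        "check_245_punctuation", the [code, data] input format, and the warning wording are
        all hypothetical.

```perl
use strict;
use warnings;

# Takes the 245 subfields as [code, data] pairs, in record order, and
# returns warning strings for a few of the punctuation rules.
sub check_245_punctuation {
    my @subfields = @_;
    my @warnings;

    push @warnings, '245: First subfield must be _a.'
        unless @subfields && $subfields[0][0] eq 'a';

    push @warnings, '245: Must end with . (period).'
        unless @subfields && $subfields[-1][1] =~ /\.$/;

    for my $i ( 1 .. $#subfields ) {
        my $code     = $subfields[$i][0];
        my $previous = $subfields[ $i - 1 ][1];

        # $c must follow space-slash at the end of the previous subfield.
        if ( $code eq 'c' && $previous !~ m{\s/$} ) {
            push @warnings, '245: Subfield _c must be preceded by /';
        }

        # $b must follow space-colon, space-semicolon, or space-equals.
        if ( $code eq 'b' && $previous !~ /\s[:;=]$/ ) {
            push @warnings, '245: Subfield _b must be preceded by : ; or =';
        }
    }
    return @warnings;
}

my @found = check_245_punctuation(
    [ a => 'Programming Perl' ],
    [ c => 'Larry Wall, Tom Christiansen & Jon Orwant' ],
);
print "$_\n" for @found;
```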

   _check_article
       Check of articles is based on code from Ian Hamilton. This version is more limited in that
       it focuses on English, Spanish, French, Italian and German articles. Certain possible
       articles have been removed if they are valid English non-articles. This version also
       disregards 008_language/041 codes and just uses the list of articles to provide
       warnings/suggestions.

       Source for articles: <http://www.loc.gov/marc/bibliographic/bdapp-e.html>

       Should work with fields 130, 240, 245, 440, 630, 730, and 830. Reports error if another
       field is passed in.
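
       The idea of comparing a non-filing indicator with the first word of the title can be
       sketched as below. The article list is a small illustrative subset, and the sub names
       "expected_nonfiling" and "check_article" are hypothetical. Leading punctuation and the
       per-tag choice of indicator are ignored in this sketch.

```perl
use strict;
use warnings;

# Small illustrative subset; the full list is drawn from the LC appendix.
my @articles = qw( a an the el la le les der das il );

# Returns the non-filing character count implied by the first word of the
# title: article length plus the following space, or 0 if no article leads.
sub expected_nonfiling {
    my ($title) = @_;
    for my $article (@articles) {
        return length($article) + 1 if $title =~ /^\Q$article\E\s/i;
    }
    return 0;
}

# Returns a warning string when the indicator disagrees, otherwise nothing.
sub check_article {
    my ( $tag, $ind, $title ) = @_;
    my $expected = expected_nonfiling($title);
    return if $ind eq $expected;
    return "$tag: Non-filing indicator is $ind but the first word "
        . "suggests $expected; check manually.";
}

print "$_\n" for check_article( '245', '0', 'The Perl book' );
```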

SEE ALSO

       Check the docs for MARC::Record.  All software links are there.

TODO

       •   Subfield 6

           Subfield 6 should always be the 1st subfield according to MARC 21 specifications.
           Perhaps a generic check should be added that warns if subfield 6 is not the 1st
           subfield.

       •   Subfield 8

           This subfield could be the 1st or 2nd subfield, so the code that checks for the 1st
           few subfields (check_245, check_250) should take that into account.

       •   Subfield 9

           This subfield is not officially allowed in MARC, since it is locally defined. A way is
           needed to turn off messages/warnings about this subfield (or otherwise deal with
           records using/allowing locally defined subfield 9).

       •   008 length and presence check

           Currently, 008 validation is not implemented in MARC::Lint, but is left to
           MARC::Errorchecks. It might be useful if MARC::Lint's basic validation checks included
           a verification that the 008 exists and is exactly 40 characters long. Additional
           008-related checking and byte validation would remain in MARC::Errorchecks.

       •   ISBN and ISSN checking

           020 and 022 fields are validated with the "Business::ISBN" and "Business::ISSN"
           modules, respectively. Business::ISBN versions between 2 and 2.02_01 are incompatible
           with MARC::Lint.

       •   check_041 cleanup

           Splitting subfield code strings every 3 chars could probably be written more
           efficiently.

       •   check_245 cleanup

           The article checking in particular.

       •   Method for turning off checks

           Provide a way for users to skip checks more easily when using check_record, or a
           specific check_xxx method (e.g. skip article checking).

LICENSE

       This code may be distributed under the same terms as Perl itself.

       Please note that these modules are not products of or supported by the employers of the
       various contributors to the code.