Provided by: libnet-z3950-simple2zoom-perl_1.04-1.1_all

NAME

       Net::Z3950::Simple2ZOOM::Config - configuration file for the Simple2ZOOM gateway

SYNOPSIS

        <client>
          <authentication>http://some.url/{user}/pwd={pass}</authentication>
          <database name="srubooks">
            <zurl>http://z3950.loc.gov:7090/voyager</zurl>
            <option name="sru">get</option>
            <charset>marc-8</charset>
            <search>
              <querytype>cql</querytype>
              <map use="4"><index>title</index></map>
              <map use="1003"><index>creator</index></map>
            </search>
          </database>
        </client>

DESCRIPTION

       The universal Swiss Army Gateway "simple2zoom" is configured by a single file, named on
       the command-line, and expressed in XML.  This file specifies which back-end databases are
       supported, how the back-ends are contacted, what character-sets they provide records in,
       and how to map Z39.50 searches to CQL.

       The structure of the file is pretty simple.

   Top Level
       <client>
           The top-level element is <client>.  It contains a single optional <authentication>
           element, any number of <database> elements and a single optional <search> element.
           The second of these specifies how to interpret requests to search in the configured
           databases; the last provides query mapping specifications for dynamically specified
           databases.

       <authentication>
           This element contains a URL template, specifying the address of an HTTP authentication
           server.  The template must include the special strings "{user}" and "{pass}", which
           are substituted with the username and password supplied in the Init request, if any.
           The resulting URL is fetched and the result examined: any successful response (HTTP
           status 200) indicates that the username/password combination is acceptable, and that
           the session can continue; any other response (e.g. 401 Authorization Required) results
           in the Init request being refused with BIB-1 diagnostic 1014 (Init/AC: Authentication
           System error).

           If the <authentication> element is omitted from the configuration, no authentication
           credentials are required, and any that are provided are ignored.

           (A trivial example of an authentication server script is included in the Simple2ZOOM
           distribution, as "etc/sru-auth".)
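
           As an illustration, consider a hypothetical user "fred" with password "secret" (both
           invented for this example) and the template from the SYNOPSIS:

               <authentication>http://some.url/{user}/pwd={pass}</authentication>

           Assuming plain textual substitution, the gateway would then request the following URL
           and accept the session only if the response status is 200:

               http://some.url/fred/pwd=secret

           How characters that are not URL-safe are handled in usernames and passwords is not
           described here, so the example sticks to plain alphanumeric values.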

       <database>
           The <database> element carries a "name" attribute specifying the Z39.50 database name
           by which it is known to clients.  It contains several complex elements, and is
           discussed in more detail below.

       <search>
           Each <search> element, whether contained within a specific <database> (see below) or
           at the top level, consists of a single mandatory <querytype> element followed by any
           number of <map>s.  The content of <querytype> indicates the type of query that should
           be sent to the back-end server, with Simple2ZOOM responsible for translating incoming
           queries as required into that format.  At present, the only supported value is "cql".

       <map>
           Each <map> element carries a "use" attribute, which is the numeric value of the BIB-1
           use attribute to be supported, and optionally contains a single <index> element which
           in turn contains the name of the corresponding CQL index.  Type-1 searches against the
           specified BIB-1 access point are mapped to CQL searches against the specified index.

           If the <index> is omitted within a <map>, then the generated CQL query term has no
           index specified.  This can be useful for BIB-1 attributes such as 1016 (any) and 1035
           (anywhere).
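
           For instance, a <search> block along the following lines maps title searches to a named
           index and leaves "any" searches unqualified.  This is only a sketch: the index name
           "title" is an assumption about what the back-end supports.

               <search>
                 <querytype>cql</querytype>
                 <map use="4"><index>title</index></map>
                 <map use="1016"/>
               </search>

           With this in place, a Type-1 query such as "@attr 1=4 dinosaur" would be expected to
           become a CQL query like "title=dinosaur", while "@attr 1=1016 dinosaur" would become
           the bare term "dinosaur".  The exact CQL text that Simple2ZOOM generates may differ in
           detail.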

   Databases
       The <database> element which describes each database contains the following elements in
       the specified order.

       In general, <database> entries are of two kinds: those connecting through to a Z39.50
       database will have no <search> element, since no query mapping is necessary to translate
       an incoming Type-1 query; but those connecting to an SRU or SRW database will have a
       <search> element with <querytype> set to "cql" and containing information on how to map
       from specified BIB-1 use attributes to CQL indexes.
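
       A sketch of the two kinds side by side, using the example addresses from this page (the
       database name "gils" is an arbitrary label chosen for illustration):

           <database name="gils">
             <zurl>tcp:z3950.indexdata.com/gils</zurl>
           </database>
           <database name="srubooks">
             <zurl>http://z3950.loc.gov:7090/voyager</zurl>
             <option name="sru">get</option>
             <search>
               <querytype>cql</querytype>
               <map use="4"><index>title</index></map>
             </search>
           </database>

       The first entry passes Type-1 queries straight through to a Z39.50 back-end; the second
       needs the <search> block because its SRU back-end expects CQL.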

       <zurl>
           Contains the target address of the back-end database (e.g.
           "tcp:z3950.indexdata.com/gils" or "http://z3950.loc.gov:7090/voyager").

       <resultsetid>
           Optional.  If provided, it must take one of the following values, and if it is omitted
           then the value "fallback" is assumed:

           "id"
               When queries are received that include references to existing result-sets, these
               are translated into result-set references using the "cql.resultSetId" index.  It
               is an error if the server does not support this facility.

           "search"
               References to existing result-sets are rewritten as resubmissions of the query.
               This works on all servers, but does not reliably give precisely correct results if
               the database is updated between searches.

           "fallback"
               Result-set references are used when supported, but resubmissions of prior queries
               are used when this facility is unavailable.
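
           For example, to force resubmission of queries rather than relying on the default
           "fallback" behaviour, a database entry might be written as follows.  The name and
           address are taken from the SYNOPSIS purely for illustration; nothing here is a claim
           about what that particular server supports.

               <database name="srubooks">
                 <zurl>http://z3950.loc.gov:7090/voyager</zurl>
                 <resultsetid>search</resultsetid>
                 <option name="sru">get</option>
               </database>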

       <nonamedresultsets>
           This is optional.  If provided, it is empty and indicates that the back-end database
           does not support named result sets.

       <option>
           There may be any number of these.  Each <option> element carries a "name" attribute
           and contains a corresponding value.  These are ZOOM options which are applied to the
           connection when it is first created, and can be used to control, for example, the
           desired "elementSetName" or "schema" of the records provided by the back-end.  A
           particularly important option is "sru", which may be set to "get", "post" or "soap" to
           request vanilla SRU, SRU over POST and SRW respectively.
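
           A sketch of typical usage for an SRU back-end follows.  The option names come from the
           text above, but the value "marcxml" is an assumption that depends on which schemas the
           back-end actually offers.

               <option name="sru">get</option>
               <option name="schema">marcxml</option>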

       <charset>
           Optional.  Contains the name of the character-set in which the back-end target
           supplies records (e.g. "marc-8").

       <search>
           Optional.  Provides specifications for how to search the database, exactly like the
           top-level <search> element described above.

       <schema>
           Optional and repeatable: each element indicates special handling for when records are
           requested in a particular schema.  See below.

       <sutrs-record>, <usmarc-record>, <grs1-record>
           Optional.  Provides specifications for how to construct records in the relevant
           syntaxes when they are requested by clients.

           The format is the same in all cases: the specification contains a list of <field>
           elements, each of which has an "xpath" attribute and textual content.  Records are
           built by accessing the data addressed by the specified XPath expressions, and encoding
           each as an element addressed as specified by the element content.  The interpretation
           of the content is different for different record-syntaxes:

           SUTRS
               The content is ignored.

           USMARC
               The content indicates a MARC field by a string consisting of the following parts,
               in order: a three-digit field number; optionally a slash followed by the first
               indicator; optionally another slash followed by the second indicator; optionally a
               dollar sign followed by a subfield tag.  In other words, MARC field specifications
               must match the regular expression "/^\d\d\d(\/\w)?(\/\w)?(\$\w)?$/".  It is impossible
               to specify the second indicator without the first, but a subfield may be specified
               along with zero, one or two indicators.

               As usual, a few examples are worth any amount of explanation:

                       001
                       260$c
                       500$a
                       100/1$a
                       245/1/0$a

           GRS-1
               The content indicates an address within a GRS-1 record in the form of one or more
               consecutive (type,value) pairs, each enclosed in parentheses.  For example,
               "(1,14)" would indicate an element of type 1 (tagSet-M) with value 14
               (localControlNumber).  A longer path such as "(3,admin)(2,6)" indicates an
               abstract field (tagSet-G element 6) within an "admin" sub-record.
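
           Putting these together, record specifications might look like the following sketch.
           The XPath expressions are hypothetical and depend entirely on the structure of the XML
           records that the back-end returns; the field addresses are the examples given above.

               <usmarc-record>
                 <field xpath="/record/id">001</field>
                 <field xpath="/record/author">100/1$a</field>
                 <field xpath="/record/title">245/1/0$a</field>
               </usmarc-record>
               <grs1-record>
                 <field xpath="/record/id">(1,14)</field>
                 <field xpath="/record/abstract">(3,admin)(2,6)</field>
               </grs1-record>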

   Schemas
       Each <schema> element is empty, but carries the following attributes, which are used to
       provide records to Z39.50 clients in MARC formats.

       oid Mandatory.  This is the OID of a Z39.50 record-syntax which is to be handled by schema
           mapping.  Requests in this database for this record-syntax are handled as specified.
           Example value: 1.2.840.10003.5.10

       sru Mandatory.  This is the URI of an SRU/W schema which is requested from the SRU or SRW
           back-end in order to fulfill the request.  Example value:
           "info:srw/schema/1/marcxml-v1.1"

       format
           Optional.  Indicates which of the MARC variants is in use, so that the record can be
           formatted correctly.  Defaults to "MARC21" if omitted.  Example values: "MARC21",
           "USMARC", "UNIMARC"

       encoding
           Optional.  Indicates which character-set to use for the formatted record.  Defaults to
           "UTF-8" if omitted.  Example values: "UTF-8", "MARC-8"

       NOTE that in its current form this schema-mapping only works for the specific though
       common combination of Z39.50 front-end, SRU back-end and MARC record syntax.
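
       Using the example values given above, a <schema> element might be written as follows.
       This is a sketch rather than a tested configuration.

           <schema oid="1.2.840.10003.5.10"
                   sru="info:srw/schema/1/marcxml-v1.1"
                   format="MARC21"
                   encoding="UTF-8"/>

       Here "format" and "encoding" are spelled out for clarity even though they match the
       documented defaults.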

CONFIGURATION FILE SCHEMA

       The Simple2ZOOM distribution includes, in the "etc" directory, an XML schema which can be
       used to validate configuration files.  This schema is provided in four formats:

       simple2zoom.rnc
           Relax-NG compact format: a simple, elegant, terse and wholly comprehensible XML
           constraint language that you don't even need to learn in order to understand.  This is
           the master version: the others are automatically generated from it.

       simple2zoom.rng
           Relax-NG XML format: the world seems to have this zany fetish that everything must be
           specified in XML, so Relax-NG has an XML syntax that corresponds trivially with the
           much nicer compact syntax.  The principal value of this is that "xmllint" understands
           it.

       simple2zoom.dtd
           An old-fashioned DTD (document type definition).

       simple2zoom.xsd
           If you must.

       Use whichever you like.  For example,

        xmllint --relaxng simple2zoom.rng --noout test.xml
        xmllint --dtdvalid simple2zoom.dtd --noout test.xml
        xmllint --schema simple2zoom.xsd --noout test.xml

SEE ALSO

       The "simple2zoom" program.

       The "Net::Z3950::Simple2ZOOM" module.

AUTHOR

       Mike Taylor <mike@indexdata.com>

COPYRIGHT AND LICENCE

       Copyright (C) 2007 by Index Data.

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of
       Perl 5 you may have available.