Provided by: libtm-perl_1.56-7_all

NAME

       TM::Tau - Topic Maps, Tau Expressions

SYNOPSIS

         use TM::Tau;
         # read a map from an XTM file
         $tm = new TM::Tau ('test.xtm');        # or
         $tm = new TM::Tau ('file:test.xtm');   # or
         $tm = new TM::Tau ('file:test.xtm >'); # or
         $tm = new TM::Tau ('file:test.xtm > null:');

         # read it now and write it back to the file when object goes out of scope
         $tm = new TM::Tau ('test.xtm > test.xtm');

         # create empty map at start and then let it automatically flush onto file
         $tm = new TM::Tau ('null: > test.xtm'); # or
         $tm = new TM::Tau ('> test.xtm');

         # read-in at the start (i.e. constructor time) and then flush it back
         $tm = new TM::Tau ('> test.xtm >');

         # load and merge maps at constructor time
         $tm = new TM::Tau ('file:test.xtm + http://..../test.atm');

         # load map and filter it with a constraint at constructor time
         $tm = new TM::Tau ('mymap.atm * myontology.ont');

         # convert between different formats
         $tm = new TM::Tau ('test.xtm > test.atm');

DESCRIPTION

       When you need to make maps persistent, you can either use the prefabricated packages
       TM::Materialized::* or build your own persistent forms using any of the available
       synchronizable traits.  In either case your application will have to invoke methods
       like "sync_in" and "sync_out" to copy content from the resource into memory and back.

       While this gives you great flexibility, in some cases your needs may be much simpler:

       consumer model:
           A map should be sourced into memory when the map object is created.

           A typical use case is a web server application which accesses the map on disk with
           every request and which returns parts of the map to an HTTP client.

       producer model:
           A map is created first in memory and is flushed onto disk at destruction time.

           One example here is a script which extracts content from a relational database and
           puts it into a map in memory. At the end all map content is copied onto disk.

       maintainer model:
           A map is sourced from the disk at map object creation time, you update it, and it is
           flushed back to the same disk location at object destruction.

           Your application may be started with new content to be put into an existing map.
           So first the map will be loaded, then the new content added, and after that the map
           will be written back to where it came from.

       translator model:
           A map is sourced from the disk, is translated into some other representation and is
           written back to disk to another location or format.

           As an example, you might want to convert between XTM and CTM format.

       filter model:
           A map is sourced from some backend, is transformed and/or filtered before being used.

           Your application could be one which only needs a particular portion of the map. So
           before processing the map is filtered down to the necessary parts.

       integration model:
           One or more maps are sourced from backends and are merged before processing.

           If you want to provide a consolidated view over several different data resources, you
           could first bring them all into topic map form, and then merge them before handing
           the result to the application.

       What is common to all these cases is that there is a breath-in phase when the map object
       is constructed, and a breath-out phase when it is destroyed. In between these phases the
       map object is just a normal instance of TM.
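
       The two phases can be pictured with a plain Perl class whose constructor loads content
       and whose destructor writes it back. This is only a sketch of the idea, not how TM::Tau
       is implemented: the class "Sketch::Map", its "add" method, the topic names and the
       %backend hash (standing in for a file or URL) are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a backend resource (a file, a URL, ...).
my %backend = ('test.atm' => [ 'topic-a' ]);

package Sketch::Map;

sub new {
    my ($class, $uri) = @_;
    # breath-in: copy the resource content into memory
    return bless { uri => $uri, topics => [ @{ $backend{$uri} || [] } ] }, $class;
}

sub add {
    my ($self, $topic) = @_;
    push @{ $self->{topics} }, $topic;        # ordinary in-memory update
    return $self;
}

sub DESTROY {
    my ($self) = @_;
    # breath-out: copy the in-memory content back to the resource
    $backend{ $self->{uri} } = [ @{ $self->{topics} } ];
}

package main;

{
    my $map = Sketch::Map->new('test.atm');   # breath-in happens here
    $map->add('topic-b');
}                                             # breath-out when $map is destroyed

print join(',', @{ $backend{'test.atm'} }), "\n";
```

       The bare block around $map is what triggers the breath-out: Perl calls "DESTROY" as
       soon as the last reference to the object disappears.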

TAU EXPRESSIONS

   Overview
       To control what happens in these two phases, this package provides a simple expression
       language, called Tau. With it you can control

       •   where maps are supposed to come from, or go to,

           Here the language provides a URI mechanism for addressing, such as

              file:tm.atm

           or

              http://topicmaps/some/map.xtm

       •   when (or how) they should be merged,

           To merge two (manifested or virtual) topic maps together the "+" operator can be used

              file:tm.atm + http://topicmaps/some/map.xtm

       •   when (or how) they should be transformed,

           To transform product data to only something a customer is supposed to see, the "*" can
           be used:

              product_data.atm * file:customer_view.tmql

       •   when (or whether at all) they should be loaded or saved

       NOTE: Later versions of this package will heavily overload the operators to also operate
       on other objects.

   Syntax
       The Tau expression language supports two binary operators, "+" and "*". The "+" operator
       intuitively puts things together, the "*" applies the right-hand operand to the left-hand
       operand and behaves as a transformer or a filter. The exact semantics depend on the
       operands. In any case, the "*" binds more tightly than the "+", and that precedence
       order can be overridden with parentheses.

       The parser understands the following syntax for Tau expressions:

          tau_expr    -> mul_expr

          mul_expr    -> source { ('>' | '*') filter }

          source      -> '(' add_expr ')' | primitive

          add_expr    -> mul_expr { '+' mul_expr }

          filter      -> '(' filter ')' | primitive

          primitive   -> uri [ module_spec ]

          module_spec -> '{' name '}'

       Terms in quotes are terminals, terms inside {} can appear any number of times (also zero),
       terms inside [] are optional. All other terms are non-terminals.

       NOTE: Filters are planned to be composite, hence the optional bracketing in the grammar.
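
       To see how the grammar yields the stated precedence, the rules can be transcribed into a
       small recursive-descent parser. This is an illustrative sketch, not the parser shipped
       with TM::Tau: the function names and the tree layout are hypothetical, and it assumes
       that tokens are either one of the characters "(){}+*>" or a run of other non-blank
       characters.

```perl
use strict;
use warnings;

# Assumption: operators/brackets are single characters that never occur in URIs.
sub tokenize { my ($text) = @_; return [ $text =~ /([(){}+*>]|[^\s(){}+*>]+)/g ]; }

sub expect {
    my ($tokens, $want) = @_;
    my $got = shift @$tokens;
    die "expected '$want'\n" unless defined $got && $got eq $want;
}

# primitive -> uri [ '{' name '}' ]
sub parse_primitive {
    my ($tokens) = @_;
    my $uri = shift @$tokens;
    die "expected a URI\n" if !defined $uri || $uri =~ /^[(){}+*>]$/;
    my $node = { uri => $uri };
    if (@$tokens && $tokens->[0] eq '{') {
        shift @$tokens;
        my $name = shift @$tokens;
        die "expected a package name\n" if !defined $name || $name =~ /^[(){}+*>]$/;
        $node->{module} = $name;
        expect($tokens, '}');
    }
    return $node;
}

# filter -> '(' filter ')' | primitive
sub parse_filter {
    my ($tokens) = @_;
    return parse_primitive($tokens) unless @$tokens && $tokens->[0] eq '(';
    shift @$tokens;
    my $node = parse_filter($tokens);
    expect($tokens, ')');
    return $node;
}

# source -> '(' add_expr ')' | primitive
sub parse_source {
    my ($tokens) = @_;
    return parse_primitive($tokens) unless @$tokens && $tokens->[0] eq '(';
    shift @$tokens;
    my $node = parse_add($tokens);
    expect($tokens, ')');
    return $node;
}

# mul_expr -> source { ('>' | '*') filter }
sub parse_mul {
    my ($tokens) = @_;
    my $node = parse_source($tokens);
    while (@$tokens && $tokens->[0] =~ /^[>*]$/) {
        my $op = shift @$tokens;
        $node = [ $op, $node, parse_filter($tokens) ];
    }
    return $node;
}

# add_expr -> mul_expr { '+' mul_expr }
sub parse_add {
    my ($tokens) = @_;
    my $node = parse_mul($tokens);
    while (@$tokens && $tokens->[0] eq '+') {
        shift @$tokens;
        $node = [ '+', $node, parse_mul($tokens) ];
    }
    return $node;
}

# tau_expr -> mul_expr
sub parse_tau {
    my ($text) = @_;
    my $tokens = tokenize($text);
    my $tree   = parse_mul($tokens);
    die "unexpected token '$tokens->[0]'\n" if @$tokens;
    return $tree;
}

my $tree = parse_tau('(a.atm + b.atm * f.tmql) > -');
print "$tree->[0] $tree->[1][0] $tree->[1][2][0]\n";
```

       Because "add_expr" is built from "mul_expr" operands, the "*" in the example ends up
       below the "+" in the tree, which is exactly the "binds more tightly" behaviour.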

       The (pre)parser supports the following shortcuts (I hate unnecessary typing):

       •   "whatever" is interpreted as "(whatever) > -"

       •   "whatever >" is interpreted as "(whatever) > -"

       •   "> whatever" is interpreted as  "- > (whatever)"

       •   "< whatever >" is interpreted as "whatever > whatever", sync_in => 0

       •   "> whatever <" is interpreted as "whatever > whatever", sync_out => 0

       •   "> whatever >" is interpreted as "whatever > whatever"

       •   "< whatever <" is interpreted as "whatever > whatever", sync_in => 0, sync_out => 0

       •   The URI "-" as source is interpreted as STDIN (via the TM::Serializable::AsTMa trait),
           unless you override that.

       •   The URI "-" as filter is interpreted as STDOUT (via the TM::Serializable::Dumper
           trait), unless you override that.
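
       The bracket shortcuts listed above can be mimicked with a few regular expressions. The
       following is a sketch of those rewriting rules only; the function name
       "expand_shortcuts" is hypothetical, the real pre-parser may work differently, and it is
       assumed here that "whatever" itself contains neither "<" nor ">".

```perl
use strict;
use warnings;

# Rewrite a shorthand Tau expression into its long form and derive the
# sync flags, following the shortcut list literally.
sub expand_shortcuts {
    my ($expr) = @_;
    my %opts = (sync_in => 1, sync_out => 1);
    $expr =~ s/^\s+|\s+$//g;                       # trim surrounding blanks

    if ($expr =~ /^([<>])\s*([^<>]+?)\s*([<>])$/) {
        # "< w >", "> w <", "> w >", "< w <"  =>  "w > w" plus flags
        my ($left, $what, $right) = ($1, $2, $3);
        $opts{sync_in}  = 0 if $left  eq '<';
        $opts{sync_out} = 0 if $right eq '<';
        $expr = "$what > $what";
    }
    elsif ($expr =~ /^>\s*([^<>]+)$/) {            # "> w"  =>  "- > (w)"
        $expr = "- > ($1)";
    }
    elsif ($expr =~ /^([^<>]+?)\s*>$/) {           # "w >"  =>  "(w) > -"
        $expr = "($1) > -";
    }
    elsif ($expr !~ /[<>]/) {                      # "w"    =>  "(w) > -"
        $expr = "($expr) > -";
    }
    return ($expr, \%opts);
}

for my $short ('test.xtm', '> test.xtm', '< test.xtm >', 'a.atm > b.xtm') {
    my ($long, $opts) = expand_shortcuts($short);
    print "$short  =>  $long  (sync_in=$opts->{sync_in}, sync_out=$opts->{sync_out})\n";
}
```

       An expression that already has a source and a target, such as the last one in the loop,
       passes through unchanged.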

   Examples
         # memory-only map
         null: > null:

         # read at startup, sync out when map goes out of scope
         file:test.atm > file:test.atm

         # copy AsTMa= to XTM
         file:test.atm > file:test.xtm

         # using a dedicated driver to load a map, store it onto a file
         dns:my.dns.server { My::DNS::Driver } > file:dns_snapshot.atm
         # this will only work if the My::DNS::Driver supports materializing
         # the whole map

         # read a map and compute the statistics
         file:test.atm * http://psi.tm.bond.edu.au/queries/1.0/statistics

   Map Source URLs
       URIs are used to address maps. An XTM map, for example, stored in the file system might be
       addressed as

         file:mydir/somemap.xtm

       for a relative URL (relative to an application's current working directory), or via an
       absolute URI such as

         http://myserver/somemap.atm

       The package supports all those access methods (file:, http:, ...) which LWP supports.

   Drivers
       Obviously a different deserializer package has to be used for an XTM file than for an
       AsTMa or LTM file. Some topic map content may be in a TM backend database, some content
       may only exist virtually, being emulated by a dedicated package.  While you may be mostly
       fine with system defaults, in some cases you may want to have precise control on how files
       and other external sources are to be interpreted. By their nature, drivers for sources
       must be subclasses of TM.

       A similar consideration applies to filters. Here, too, the specified URI determines which
       filter actually has to be applied. It can also define where the content is eventually
       stored. Drivers for filters must be either subclasses of TM::Tau::Filter, or
       alternatively must be a trait providing a method "sync_out".

   Binding by Schemes (Implicit)
       When a Tau expression is parsed, the parser tries to identify which driver to use for
       which part of the composite map denoted by the expression. For this purpose a pattern
       matching approach is used to map regular expression patterns to driver package names.
       To inspect the current mappings, dump the two hashes:

          use Data::Dumper;
          print Dumper \%TM::Tau::sources;
          print Dumper \%TM::Tau::filters;

       As the two hashes show, sources (resident data) and filters (and transformers) live in
       separate namespaces.

       Each entry in either hash has a regular expression as key and the name of the driver to
       be used as value. The keys are matched against the parsed URI and the first match wins.
       Since the keys of a hash are not ordered, it is undefined which entry wins when more
       than one pattern matches, so keep the patterns mutually exclusive.

       At any time you can override values there:

          $TM::Tau::sources{'null:'}          = 'TM';
          $TM::Tau::sources{'tm:server\.com'} = 'My::Private::TopicMap::Driver';

       or delete existing ones. The only constraint is that the driver package must already be
       "require"d into your Perl program.

       During parsing of a Tau expression, two cases are distinguished:

       •   If the URI specifies a source, then this URI will be matched against the regexps in
           the "TM::Tau::sources" hash. The value of the matching entry is used as the class
           name to instantiate an object, with the URI passed as both "uri" and "baseuri":

           $this_class_name->new (uri => $this_uri, baseuri => $this_uri)

           This class should be a subclass of TM.

       •   If the URI specifies a filter, then you have two options: Either the entry is the
           name of a subclass of TM::Tau::Filter, in which case an object is created as above.
           Or the entry is a list reference containing names of traits, in which case a generic
           TM::Tau::Filter node is generated first and each of the traits is applied like this:

           Class::Trait->apply ( $node => $trait => {
                                          exclude => [ 'mtime',
                                                       'sync_out',
                                                       'source_in' ]
                                          } );

       If there is no match, this results in an exception.
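
       The lookup described in this section amounts to matching the URI against each key and
       returning the first value that fits. A minimal sketch with an ordinary hash follows;
       %sources here is a hypothetical stand-in for %TM::Tau::sources (its real contents may
       differ), the "dns:" entry is invented for illustration, and "driver_for" is not a
       function of the package.

```perl
use strict;
use warnings;

# Hypothetical stand-in for %TM::Tau::sources: regexp => driver package name.
my %sources = (
    'null:'          => 'TM',
    'tm:server\.com' => 'My::Private::TopicMap::Driver',
    'dns:'           => 'My::DNS::Driver',
);

# Return the driver registered for a URI, or raise an exception.
sub driver_for {
    my ($uri, $table) = @_;
    for my $pattern (keys %$table) {          # hash order is unspecified
        return $table->{$pattern} if $uri =~ /$pattern/;
    }
    die "no driver registered for '$uri'\n";
}

print driver_for('dns:my.dns.server', \%sources), "\n";

# An unknown scheme ends in an exception, which the caller may trap.
my $driver = eval { driver_for('gopher://example.org/map', \%sources) };
print "lookup failed: $@" unless defined $driver;
```

       Since "keys" returns the patterns in no particular order, the sketch only behaves
       predictably as long as no URI matches more than one pattern.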

   Binding by Package Pragmas (Explicit)
       Another way to define which package should be used for a particular map is to specify this
       directly in the Tau expression:

          http://.../map.xtm { My::BrokenXTM }

       In this case the resource is loaded and is processed using "My::BrokenXTM" as package to
       parse it (see TM::Materialized::Stream on how to write such a driver).

INTERFACE

   Constructor
       The constructor accepts a string following the Tau expression "Syntax".  If that string is
       missing, "null:" will be assumed. An appropriate exception will be raised if the syntax is
       violated or one of the mentioned drivers is not preloaded.

       Examples:

          # map only existing in memory
          my $map = new TM::Tau;

          # map will be loaded as result of this tau expression
          my $map = new TM::Tau ('file:music.atm * file:beatles.tmql');

       Apart from the Tau expression the constructor optionally interprets a hash with the
       following keys:

       "sync_in" (default: 1)
           If non-zero, in-synchronisation at constructor time will happen, otherwise it is
           suppressed. In that case you can trigger in-synchronisation explicitly with the method
           "sync_in".

       "sync_out" (default: 1)
           If non-zero, out-synchronisation at destruction time will happen, otherwise it is
           suppressed.

       Example:

          my $map = new TM::Tau ('test.xtm',
                                 sync_in => 0); # don't want to let it happen now
          ....                                  # time passes
          $map->sync_in;                        # but now is a good time

SEE ALSO

       TM, TM::Tau::Filter

AUTHOR

       Copyright 200[0-68], Robert Barta <drrho@cpan.org>, All rights reserved.

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.  http://www.perl.com/perl/misc/Artistic.html