Provided by: stilts_3.4.7-4_all

NAME

       stilts-tmatch1 - Performs a crossmatch internal to a single table

SYNOPSIS

       stilts tmatch1 [matcher=<matcher-name>] [params=<match-params>] [tuning=<tuning-params>]
                      [values=<expr-list>] [action=identify|keep0|keep1|wide2|wideN]
                      [progress=none|log|time|profile]
                      [runner=parallel|parallel<n>|parallel-all|sequential|classic|partest]
                      [ifmt=<in-format>] [istream=true|false]
                      [in=<table>] [icmd=<cmds>] [ocmd=<cmds>]
                      [omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|tosql|gui]
                      [out=<out-table>] [ofmt=<out-format>]

DESCRIPTION

       tmatch1  performs efficient and flexible crossmatching between the rows of a single table.
       It can match rows on the basis of their relative position in  the  sky,  or  alternatively
       using many other criteria such as separation in some isotropic or anisotropic Cartesian
       space, identity of a key value, or some combination of these;  the  full  range  of  match
       criteria is discussed in SUN/256.

       The  basic  task performed by the intra-table matcher is to identify groups of rows within
       the table which match  each  other.  See  SUN/256  for  an  explanation  of  exactly  what
       constitutes a match group. The result of identifying these groups is expressed as an output
       table in one of a variety of ways,  specified  by  the  action  parameter.  These  options
       include  marking  group membership in added columns and eliminating some or all rows which
       form part of a match group.

OPTIONS

       matcher=<matcher-name>
              Defines the nature of the matching that will be performed. Depending  on  the  name
              supplied, this may be positional matching using celestial or Cartesian coordinates,
              exact matching on the value of a  string  column,  or  other  things.  A  list  and
              explanation  of  the  available  matching algorithms is given in SUN/256. The value
              supplied for this parameter determines the meanings of the values required  by  the
              params, values and tuning parameters.

       params=<match-params>
              Determines  the  parameters of this match. This is typically one or more tolerances
              such as error radii. It may contain zero  or  more  values;  the  values  that  are
              required depend on the match type selected by the matcher parameter. If it contains
              multiple values, they must be separated by spaces; values which contain a space can
              be 'quoted' or "quoted".
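
              A minimal sketch of a multi-valued params string, assuming the 2d_anisotropic
              matcher listed in SUN/256, which takes one tolerance per axis. The file names
              and the X and Y column names are hypothetical.

```shell
# Hypothetical file and column names; 2d_anisotropic is assumed from SUN/256.
in_table="positions.fits"
out_table="groups.fits"

# Two tolerances in a single params value, separated by a space:
# 0.5 along the first axis and 2.0 along the second.
match_params="0.5 2.0"
match_values="X Y"

# Run only where STILTS and the example input are actually present.
if command -v stilts >/dev/null 2>&1 && [ -f "$in_table" ]; then
    stilts tmatch1 matcher=2d_anisotropic params="$match_params" \
           values="$match_values" action=identify \
           in="$in_table" out="$out_table"
else
    echo "skipped: stilts or $in_table not available"
fi
```

              The shell quotes keep the two tolerances together as one params value.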

       tuning=<tuning-params>
              Tuning values for the matching process, if appropriate. It may contain zero or more
              values; the values that are permitted depend on the  match  type  selected  by  the
              matcher  parameter.  If  it  contains  multiple  values,  they must be separated by
              spaces; values which contain a space can be 'quoted' or "quoted". If this  optional
              parameter is not supplied, sensible defaults will be chosen.

       values=<expr-list>
              Defines the values from the input table which are used to determine whether a match
              has occurred. These will typically be coordinate values such  as  RA  and  Dec  and
              perhaps some per-row error values as well, though exactly which values are required,
              and their number and type, depend on the kind of match selected by matcher. Multiple
              values should be separated by whitespace; if  whitespace  occurs  within  a  single
              value it must be 'quoted' or "quoted". Elements of the expression list are commonly
              just column names, but may be algebraic expressions calculated from  zero  or  more
              columns as explained in SUN/256.
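
              As an illustration of an expression in the values list, the sketch below assumes
              the sky matcher of SUN/256, taken here to expect right ascension and declination
              in degrees with a maximum separation in arcseconds as params. The table and its
              RA_HOURS and DEC columns are hypothetical.

```shell
# Hypothetical table whose right ascension is stored in hours.
in_table="survey.fits"
out_table="survey_groups.fits"

# The first element is an algebraic expression (hours to degrees),
# the second a plain column name. No whitespace occurs inside either
# element, so no inner quoting is needed.
match_values="RA_HOURS*15 DEC"

# Run only where STILTS and the example input are actually present.
if command -v stilts >/dev/null 2>&1 && [ -f "$in_table" ]; then
    stilts tmatch1 matcher=sky params=1 values="$match_values" \
           action=identify in="$in_table" out="$out_table"
else
    echo "skipped: stilts or $in_table not available"
fi
```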

       action=identify|keep0|keep1|wide2|wideN
              Determines  the  form of the table which will be output as a result of the internal
              match.

                * identify: The output table is the same  as  the  input  table  except  that  it
                  contains  two  additional  columns,  GroupID and GroupSize, following the input
                  columns. Each group of  rows  which  matched  is  assigned  a  unique  integer,
                  recorded  in  the GroupID column, and the size of each group is recorded in the
                  GroupSize column. Rows which don't match any others (singles) have null  values
                  in both these columns.

                * keep0:  The  result  is a new table containing only "single" rows, that is ones
                  which don't match any other rows in the table. Any other rows are thrown out.

                * keep1: The result is a new table in which only one row (the first in the  input
                  table  order) from each group of matching ones is retained. A subsequent intra-
                  table match with the same criteria would therefore show no matches.

                * wideN: The result is a new "wide" table consisting of matched rows in the input
                  table  stacked  next  to each other. Only groups of exactly N rows in the input
                  table are used to form the output table; each row of the output table  consists
                  of the columns of the first group member, followed by the columns of the second
                  group member and so on. The output table therefore has N times as many  columns
                  as the input table. The column names in the new table have _1, _2, ... appended
                  to them to avoid duplication.
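
              The actions above can be compared by running the same match once per action. This
              is only a sketch: the sky matcher, the 2 arcsecond tolerance and the file and
              column names are assumptions made for the example.

```shell
# Hypothetical input table with RA and DEC columns in degrees.
in_table="catalogue.fits"

for action in identify keep0 keep1 wide2; do
    # One output file per action, named after the action.
    out_table="catalogue_${action}.fits"
    if command -v stilts >/dev/null 2>&1 && [ -f "$in_table" ]; then
        stilts tmatch1 matcher=sky params=2 values="RA DEC" \
               action="$action" in="$in_table" out="$out_table"
    else
        echo "would write $out_table using action=$action"
    fi
done
```

              With wide2, only groups of exactly two rows contribute, and the output has twice as
              many columns as the input.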

       progress=none|log|time|profile
              Determines whether information on progress of the match should  be  output  to  the
              standard  error  stream  as  it  progresses.  For  lengthy matches this is a useful
              reassurance and can give guidance about how much longer it will take. It  can  also
              be useful as a performance diagnostic.

              The options are:

                * none: no progress is shown

                * log: progress information is shown

                * time: progress information and some time profiling information are shown

                * profile: progress information and limited time/memory profiling information are
                  shown

       runner=parallel|parallel<n>|parallel-all|sequential|classic|partest
              Selects the threading implementation. The options are currently:

                * parallel: uses multithreaded implementation  for  large  tables,  with  default
                  parallelism, which is the smaller of 6 and the number of available processors

                * parallel<n>:   uses   multithreaded   implementation  for  large  tables,  with
                  parallelism given by the supplied value <n>

                * parallel-all: uses  multithreaded  implementation  for  large  tables,  with  a
                  parallelism given by the number of available processors

                * sequential: uses multithreaded implementation but with only a single thread

                * classic: uses legacy sequential implementation

                * partest: uses multithreaded implementation even when tables are small

              The parallel* options should normally run faster than sequential or classic (which
              are provided mainly for testing purposes), at least for  large  matches  and  where
              multiple processing cores are available.

              The  default  value  "parallel"  is  currently  limited to a parallelism of 6 since
              larger values yield diminishing returns given  that  some  parts  of  the  matching
              algorithms  run  sequentially  (Amdahl's  Law),  and  using  too  many  threads can
              sometimes end up doing more work or impacting  on  other  operations  on  the  same
              machine.  But you can experiment with other concurrencies, e.g. "parallel16" to run
              on 16 cores (if available) or "parallel-all" to run on all available cores.

              The value of this parameter should make no difference to the matching  results.  If
              you notice any discrepancies please report them.
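
              The default parallelism rule described above (the smaller of 6 and the number of
              available processors) can be sketched as follows. The helper name
              default_parallelism is made up for this example, and getconf is only one way of
              asking for the processor count.

```shell
# Parallelism used by runner=parallel, as described in the text:
# the smaller of 6 and the number of available processors.
default_parallelism() {
    if [ "$1" -lt 6 ]; then
        echo "$1"
    else
        echo 6
    fi
}

# Ask the system for the processor count; fall back to 1 if the
# answer is missing or not a plain number.
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
case $ncpu in
    ''|*[!0-9]*) ncpu=1 ;;
esac

echo "runner=parallel     would use $(default_parallelism "$ncpu") thread(s)"
echo "runner=parallel-all would use $ncpu thread(s)"
```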

       ifmt=<in-format>
              Specifies  the  format  of  the input table as specified by parameter in. The known
              formats are listed in SUN/256. This flag can be used if you know what  format  your
              table is in. If it has the special value (auto) (the default), then an attempt will
              be made to detect the format of the table automatically. This cannot always be done
              correctly  however,  in  which  case the program will exit with an error explaining
              which formats were  attempted.  This  parameter  is  ignored  for  scheme-specified
              tables.

       istream=true|false
              If  set  true,  the  input  table  specified  by the in parameter will be read as a
              stream. It is necessary to give the ifmt parameter in this case. Depending  on  the
              required operations and processing mode, this may cause the read to fail (sometimes
              it is necessary to read the table more than once). It is not normally necessary  to
              set this flag; in most cases the data will be streamed automatically if that is the
              best thing to do. However it can sometimes  result  in  less  resource  usage  when
              processing  large  files  in  certain  formats (such as VOTable). This parameter is
              ignored for scheme-specified tables.

       in=<table>
              The location of the input table. This may take one of the following forms:

                * A filename.

                * A URL.

                * The special value "-", meaning standard input. In this case  the  input  format
                  must  be  given  explicitly using the ifmt parameter. Note that not all formats
                  can be streamed in this way.

                * A scheme specification of the form :<scheme-name>:<scheme-args>.

                * A system command line with either a "<"  character  at  the  start,  or  a  "|"
                  character at the end ("<syscmd" or "syscmd|"). This executes the given pipeline
                  and reads from its standard output. This will probably only work  on  unix-like
                  systems.

              In  any  case,  compressed data in one of the supported compression formats (gzip,
              Unix compress or bzip2) will be decompressed transparently.
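
              Two of these forms are sketched below: standard input and a system command. The
              exact matcher and the csv format name are assumed from SUN/256, and the file
              names and the ID column are hypothetical.

```shell
in_table="catalogue.csv"     # hypothetical CSV file with an ID column
out_table="unique.csv"
pipe_spec="<cat $in_table"   # "<syscmd" form: read the command's standard output

# Run only where STILTS and the example input are actually present.
if command -v stilts >/dev/null 2>&1 && [ -f "$in_table" ]; then
    # in=- reads standard input, so the format is stated with ifmt.
    stilts tmatch1 in=- ifmt=csv matcher=exact values="ID" \
           action=keep1 out="$out_table" < "$in_table"

    # The same table supplied through a system command instead.
    stilts tmatch1 in="$pipe_spec" ifmt=csv matcher=exact values="ID" \
           action=keep1 out="$out_table"
else
    echo "skipped: stilts or $in_table not available"
fi
```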

       icmd=<cmds>
              Specifies processing to be performed on the input table as specified  by  parameter
              in, before any other processing has taken place. The value of this parameter is one
              or more of the filter commands described in SUN/256. If more  than  one  is  given,
              they  must  be  separated  by  semicolon  characters  (";").  This parameter can be
              repeated multiple times on the same command line to build up a list  of  processing
              steps.  The  sequence of commands given in this way defines the processing pipeline
              which is performed on the table.

              Commands may alternatively be supplied in an external file, by using the indirection
              character  '@'. Thus a value of "@filename" causes the file filename to be read for
              a list of filter commands to execute. The commands in the file may be separated  by
              newline characters and/or semicolons, and lines which are blank or which start with
              a '#' character are ignored.
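
              A minimal sketch of the '@' indirection follows. The select, keepcols and sort
              filters are assumed from SUN/256; the file names and the RA, DEC and MAG columns
              are hypothetical.

```shell
# Write a small filter file. Blank lines and lines starting with '#'
# are ignored; commands may share a line when separated by ';'.
filter_file="prefilter.txt"
cat > "$filter_file" <<'EOF'
# Keep only the brighter sources

select "MAG < 20"
keepcols "RA DEC MAG"; sort MAG
EOF

# Count the lines that carry commands (neither blank nor comments).
active_lines=$(grep -v -e '^#' -e '^[[:space:]]*$' "$filter_file" | wc -l | tr -d ' ')
echo "$active_lines command line(s) in $filter_file"

# Run only where STILTS and the example input are actually present.
in_table="catalogue.fits"
if command -v stilts >/dev/null 2>&1 && [ -f "$in_table" ]; then
    stilts tmatch1 icmd=@"$filter_file" matcher=sky params=2 \
           values="RA DEC" action=keep1 in="$in_table" out=bright_unique.fits
else
    echo "skipped: stilts or $in_table not available"
fi
```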

       ocmd=<cmds>
              Specifies processing  to  be  performed  on  the  output  table,  after  all  other
              processing  has  taken  place.  The  value  of this parameter is one or more of the
              filter commands described in SUN/256. If more than  one  is  given,  they  must  be
              separated  by  semicolon  characters (";"). This parameter can be repeated multiple
              times on the same command line to build up a list of processing steps. The sequence
              of commands given in this way defines the processing pipeline which is performed on
              the table.

              Commands may alternatively be supplied in an external file, by using the indirection
              character  '@'. Thus a value of "@filename" causes the file filename to be read for
              a list of filter commands to execute. The commands in the file may be separated  by
              newline characters and/or semicolons, and lines which are blank or which start with
              a '#' character are ignored.

       omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|tosql|gui
              The mode in which the result table will be output. The default mode is  out,  which
              means  that  the  result  will  be  written as a new table to disk or elsewhere, as
              determined by the out and ofmt parameters. However, there are other  possibilities,
              which correspond to uses to which a table can be put other than outputting it, such
              as displaying metadata, calculating statistics, or populating a  table  in  an  SQL
              database.  For  some  values of this parameter, additional parameters (<mode-args>)
              are required to determine the exact behaviour.

              Possible values are

                * out

                * meta

                * stats

                * count

                * checksum

                * cgi

                * discard

                * topcat

                * samp

                * tosql

                * gui

              Use the help=omode flag or see SUN/256 for more information.

       out=<out-table>
              The location of the output table. This is usually a filename to write to. If it  is
              equal  to  the  special value "-" (the default) the output table will be written to
              standard output.

              This parameter must only be given if omode has its default value of "out".

       ofmt=<out-format>
              Specifies the format in which the output table will be written (one of the ones  in
              SUN/256 - matching is case-insensitive and you can use just the first few letters).
              If it has the special value "(auto)" (the default), then the output  filename  will
              be examined to try to guess what sort of file is required, usually by looking at the
              extension. If it's not obvious from the filename what output format is intended, an
              error will result.

              This parameter must only be given if omode has its default value of "out".

SEE ALSO

       stilts(1)

       If  the  package  stilts-doc  is installed, the full documentation SUN/256 is available in
       HTML format:
       file:///usr/share/doc/stilts/sun256/index.html

VERSION

       STILTS version 3.4.7-debian

       This is the Debian version of Stilts, which lacks support for some  file  formats  and
       network protocols. For differences see
       file:///usr/share/doc/stilts/README.Debian

AUTHOR

       Mark Taylor (Bristol University)

                                             Mar 2017                           STILTS-TMATCH1(1)