Provided by: fitsh_0.9.2-1_amd64

NAME

       grmatch - pairing lines by identifier or by coordinate cross-matching

SYNOPSIS

       grmatch [options] -r <reference> -i <input> [-o <output>]

DESCRIPTION

       The program `grmatch` matches lines read from two input files, namely a reference file and
       an input file. All implemented algorithms are symmetric, in the sense that the result
       should be the same if the two files are swapped. The order of the files matters only when
       a geometrical transformation is also returned (see point matching below); in this case,
       swapping the files yields the inverse of the original transformation. The lines (rows)
       can be matched using various criteria:

       1. Lines can be matched by identifier, where the identifier can be any concatenation of
          arbitrary, space-separated columns found in the files. Generally, the identifier is a
          single column (e.g. an astronomical catalog identifier). The behaviour of the program
          can be tuned for the case when more than one row has the same identifier.

       2. Lines can be matched using a 2-dimensional point matching algorithm. In this method,
          the program expects two columns from each of the reference and input files, which are
          treated as X and Y coordinates. Given both point lists, the program tries to find the
          geometrical transformation which transforms the points from the frame of the
          reference list to the frame of the input list and, simultaneously, tries to find as
          many pairs as possible. The parameters of the geometrical transformation and of the
          whole algorithm can be fine-tuned.

       3. Lines can be matched using an arbitrary (N-) dimensional coordinate matching
          algorithm. This method expects N columns from each of the reference and input files,
          which are treated as X_1, ..., X_N Cartesian coordinates, and assumes that both point
          sets are in the same reference frame. The point 'A' from the reference list and the
          point 'P' from the input list form a pair if the closest point to 'A' in the input
          list is 'P' and vice versa.

OPTIONS

   General options:
       -h, --help
              Gives a general summary of the command line options.

       --long-help, --help-long
              Gives a detailed list of command line options.

       --wiki-help, --help-wiki, --mediawiki-help, --help-mediawiki
              Gives a detailed list of command line options in Mediawiki format.

       --version, --version-short, --short-version
              Gives some version information about the program.

       -C, --comment
              Comment the output (both the transformation file and the match file).

   Options for input/output specifications:
       -r <referencefile>, --input-reference <referencefile>
              Mandatory, name of the reference file.

       <inputfile>, -i <inputfile>, --input <inputfile>
              Name of the input file. If this switch is omitted, the input is read from stdin
              (specifying some input is mandatory).

       -o <output>, --output <output>, --output-matched <output>
              Name of the output file containing the matched lines. Each matched line is a
              pasted line: the first part is from the reference file and the second part is from
              the input file, joined by a TAB character. This switch is optional; if it is not
              specified, no such output is generated.

       --output-excluded-reference <out>, --output-excluded-input <out>
              Names of the files which contain the valid but excluded lines from the reference
              and from the input. These outputs are disjoint from the previous output and
              together they contain all valid lines.

       --output-id <out>
              Name of the file which contains only the identifiers of the matched lines. If the
              primary matching method was not identifier matching, one should also specify the
              column indices of the identifiers by --col-ref-id and --col-inp-id.

       --output-transformation <output-transformation-file>
              Name of the output file containing the geometrical transformation, in
              human-readable format, if the matching method was point matching (otherwise
              this option has no effect). The commented version of this file includes some
              statistics about the matching (the total number of lines used and matched,
              the required CPU time, the final triangulation level, the fit residuals and
              similar quantities).

       In all of the above input/output file specifications, replacing the file name by "-" (a
       single minus sign) forces reading from stdin or writing to stdout. Note that any part of
       a line after "#" (hash mark) is treated as a comment and is therefore ignored.

   General options for point matching:
       --match-points
              This  switch  forces  the usage of the point  matching  method.  By  default,  this
              method is  assumed  to  be  used,  therefore  this switch can be omitted.

       --col-ref <x>,<y>, --col-inp <x>,<y>
              The column indices containing the X and Y coordinates, for the reference and for
              the input file, respectively. The index of the first column is always 1, the
              index of the second is 2 and so on. Lines in which these columns do not contain
              valid real numbers are omitted.
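
       The column conventions above can be illustrated with a short Python sketch. It is
       not the program's actual parser: the function name read_xy is hypothetical, and
       whitespace-separated columns are assumed. Everything after "#" is dropped, column
       indices are 1-based, and a line without valid real numbers in the requested columns
       yields None.

```python
def read_xy(line, cols=(1, 2)):
    """Return the values in the 1-based columns `cols`, or None if the line is invalid."""
    # Everything after "#" is a comment; the rest is split on whitespace.
    fields = line.split("#", 1)[0].split()
    try:
        return tuple(float(fields[c - 1]) for c in cols)
    except (IndexError, ValueError):
        # Missing column or not a real number: the whole line is omitted.
        return None


if __name__ == "__main__":
    for text in ("10.5 20.25 star1 # a comment", "abc 2.0", "# only a comment"):
        print(repr(text), "->", read_xy(text))
```

       With cols=(2, 3) the same function would read the coordinates from the second and
       third columns, which corresponds to specifying --col-ref 2,3 or --col-inp 2,3.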

       -a <order>, --order <order>
              This switch specifies the polynomial order of the resulting geometrical
              transformation. It can be an arbitrary positive integer. Note that if the order is
              A, at least (A+1)*(A+2)/2 valid points are needed from both the reference and the
              input file to fit the transformation.
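
       The point count above is the number of coefficients of a 2-dimensional polynomial of
       order A. A minimal Python sketch of this arithmetic (the function name min_points is
       only illustrative):

```python
def min_points(order):
    """Minimum number of valid points needed to fit a 2-D polynomial of this order."""
    return (order + 1) * (order + 2) // 2


if __name__ == "__main__":
    for order in (1, 2, 3, 4):
        print(order, min_points(order))
```

       This gives 3, 6, 10 and 15 points for orders 1 to 4, so a first-order (linear)
       transformation needs at least three valid points in each file.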

       --max-distance <maxdist>
              The maximal accepted distance between the matched points, in the coordinate frame
              of the input coordinate list (and not in the coordinate frame of the reference
              coordinate list). Possible pairs (which are valid pairs according to the symmetric
              coordinate matching algorithms) are excluded if their Euclidean distance is larger
              than maxdist. Note that this option has no default value; if it is omitted, all
              possible pairs found by the symmetric matching are returned, which in practice
              can lead to unexpected behaviour. One should always specify a reasonable maximal
              distance, which can be estimated only from knowledge of the physics behind the
              input files.

       See more options concerning point matching in the section "Fine-tuning of point
       matching" below. That section also describes the tuning of the triangulation used by
       the point matching algorithm. For a more detailed description of the point matching
       algorithms based on pattern and triangle matching, see [1], [2] or [3].

   General options for coordinate matching:
       --match-coord, --match-coords
              This  switch forces the usage of the coordinate matching method. Note that  because
              of  the  common  options  with  the point  matching method, one should specify this
              switch to force the usage of the coordinate matching method (the default method  is
              point  matching, see above).

       --col-ref <x>[,<y>,[<z>...]], --col-inp <x>[,<y>,[<z>...]]
              The column indices containing the spatial coordinates, for the reference and for
              the input file, respectively. The index of the first column is always 1, the
              index of the second is 2 and so on. Lines in which these columns do not contain
              valid real numbers are omitted. Note that the dimension of the coordinate
              matching space is specified indirectly, by the number of column indices listed
              here. Because of this, the number of column indices should be the same for the
              reference and the input; otherwise, when the dimensions are mismatched, the
              program exits unsuccessfully.

       --max-distance <maxdist>
              The maximal accepted distance between the matched points. Possible pairs (which
              are valid pairs according to the symmetric coordinate matching algorithms) are
              excluded if their Euclidean distance is larger than maxdist. Note that this option
              has no default value; if it is omitted, all possible pairs found by the symmetric
              matching are returned (see also point matching, above).
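
       The symmetric pairing rule and the distance cut can be sketched in Python as follows.
       This is a brute-force illustration under stated assumptions, not the program's
       implementation: the names nearest and match_coords are hypothetical, and the search
       is quadratic in the number of points.

```python
import math


def nearest(point, others):
    """Index of the element of `others` closest to `point` (Euclidean distance)."""
    return min(range(len(others)), key=lambda j: math.dist(point, others[j]))


def match_coords(ref, inp, max_distance=None):
    """Return (ref_index, inp_index) pairs that are mutual nearest neighbours."""
    pairs = []
    if not ref or not inp:
        return pairs
    for i, a in enumerate(ref):
        j = nearest(a, inp)
        if nearest(inp[j], ref) != i:
            continue  # 'P' is closest to 'A', but 'A' is not closest to 'P'
        if max_distance is not None and math.dist(a, inp[j]) > max_distance:
            continue  # a symmetric pair, but farther apart than allowed
        pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    ref = [(0.0, 0.0), (10.0, 0.0), (50.0, 50.0)]
    inp = [(0.1, 0.0), (10.0, 0.2), (11.0, 0.0), (90.0, 90.0)]
    print(match_coords(ref, inp))
    print(match_coords(ref, inp, max_distance=1.0))
```

       In this example the third reference point and the fourth input point are mutual
       nearest neighbours although they are far apart, so they are paired unless a maximal
       distance is given. This is the kind of unexpected pair that --max-distance is meant
       to exclude.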

   General options for identifier matching:
       --match-id, --match-identifiers
              This switch forces the usage of the identifier matching  method.

       --col-ref-id <i>[,<j>,[<k>...]], --col-inp-id <i>[,<j>,[<k>...]]
              Column  index  or  indices  containing the identifiers, from the reference and from
              the input file, respectively.

       --no-ambiguity, --first-ambiguity, --any-ambiguity, --full-ambiguity
              These options tune the behaviour of the matching when there is more than one
              occurrence of a given identifier in the reference and/or input file. If
              --no-ambiguity is specified, these identifiers are discarded; this is the default
              method. If --first-ambiguity is specified, only the first occurrence is treated as
              a matched line, independently of the number of occurrences. If the switch
              --any-ambiguity is specified, the lines are paired sequentially for as long as
              unpaired lines remain in both the reference and the input. For example, if there
              are 4 occurrences in the reference and 6 in the input file of a given identifier,
              4 matched pairs are returned. Finally, if --full-ambiguity is specified, all
              possible combinations of the lines are treated as matched lines. For example, if
              there are 4 occurrences in the reference and 6 in the input file of a given
              identifier, all 4*6=24 combinations are returned as matched pairs.
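
       The four ambiguity modes can be summarised by the following Python sketch. It only
       mirrors the description above; the function match_ids and its mode names "no",
       "first", "any" and "full" are hypothetical and do not reflect the program's
       internals.

```python
from collections import defaultdict
from itertools import product


def match_ids(ref_ids, inp_ids, mode="no"):
    """Return (ref_index, inp_index) pairs for identifiers present in both lists."""
    if mode not in ("no", "first", "any", "full"):
        raise ValueError("unknown ambiguity mode: %r" % (mode,))
    ref_rows, inp_rows = defaultdict(list), defaultdict(list)
    for i, key in enumerate(ref_ids):
        ref_rows[key].append(i)
    for j, key in enumerate(inp_ids):
        inp_rows[key].append(j)
    pairs = []
    for key, r in ref_rows.items():
        s = inp_rows.get(key, [])
        if not s:
            continue  # identifier missing from the input
        if len(r) == 1 and len(s) == 1:
            pairs.append((r[0], s[0]))   # unambiguous: same result in every mode
        elif mode == "first":
            pairs.append((r[0], s[0]))   # only the first occurrences are paired
        elif mode == "any":
            pairs.extend(zip(r, s))      # sequential pairing until one side runs out
        elif mode == "full":
            pairs.extend(product(r, s))  # every combination
        # mode == "no": ambiguous identifiers are discarded
    return pairs


if __name__ == "__main__":
    ref_ids = ["a"] * 4
    inp_ids = ["a"] * 6
    for mode in ("no", "first", "any", "full"):
        print(mode, len(match_ids(ref_ids, inp_ids, mode)))
```

       For an identifier with 4 occurrences in the reference and 6 in the input, the sketch
       returns 0, 1, 4 and 24 pairs for the four modes, in line with the examples above.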

   Fine-tuning of point matching:
       --triangulation <parameters>
              This switch  is  followed   by   comma-separated   directives,  which  specify  the
              parameters of the triangulation-based point matching algorithm:

       delaunay, level=<level>, full, auto, unitarity=<U>
              These directives specify the triangulation level used for point matching.
              "delaunay" forces the usage of only the Delaunay triangles. This is the fastest
              method; however, it works only if the points in the reference and input lists
              are almost completely overlapping and describe almost the same point sets
              (with a ratio of common points above 60-70%). The "level" directive specifies the
              level of the expansion of the Delaunay triangulation (see [1] for more details).
              In practice, the lower the ratio of common points and/or the ratio of the
              overlap, the higher the level that should be used. Specifying "level=1" or
              "level=2" gives a robust but still fast method for general usage. The directive
              "full" forces full triangulation. This can be extremely slow and can require a
              very large amount of memory if there are more than 40-50 points (the required
              time and memory are proportional to the 6th(!) and 3rd power of the number of
              points, respectively). The directive "auto" increases the level of the
              triangulation expansion automatically until a proper match is found. A match is
              considered a good match if the unitarity of the transformation is less than the
              unitarity U specified by the "unitarity=U" directive (see also "Some notes on
              unitarity" below).

       mixed, conformable, reverse
              These directives define the chirality of the triangle spaces to be used. In
              practice, this means the following. If it is not known whether the input and
              reference lists are inverted with respect to each other, one should use the
              "mixed" triangle space. If the input and reference lists are known not to be
              inverted, the "conformable" triangle space can be used. If the input and
              reference lists are known to be inverted, the "reverse" space can be used. Note
              that although the "mixed" triangle space can always yield a good match, it is
              wise to fix the chirality by specifying "conformable" or "reverse" when it is
              known whether the point sets are inverted with respect to each other. If the
              chirality is fixed, the program yields more matched pairs, the appropriate
              triangulation level can be smaller and, in "auto" mode, the program returns the
              match definitely faster.

       maxnumber=<max>, maxref=<mr>, maxinp=<mi>
              These directives specify the maximal number of points which are used for
              triangulation (for any type of triangulation). Specifying "maxnumber" is
              equivalent to defining "maxref" and "maxinp" with the same value. Then, the
              first <mr> points from the reference and the first <mi> points from the input
              list are used to generate the triangle sets. The "first" points are selected
              using the optional information found in one of the columns, see the following
              switches.

       (Note that there should be only one --triangulation switch, all desired directives  should
       be  written in the same argument, separated by commas.)

       --col-ref-ordering [-]<w>, --col-inp-ordering [-]<w>
              These switches each specify one column index, from the reference and from the
              input file respectively, which is used to order the list and select the first
              "maxref" and "maxinp" points (see above) for the generation of the two triangle
              meshes. Both columns should contain valid real numbers, otherwise the whole(!)
              line is excluded (not only from sorting but from the whole matching procedure).
              If there is no negative sign before the column index, the data are sorted in
              descending(!) order, therefore the lines with the highest(!) values are selected
              for triangulation. If there is a negative sign before the index, the data are
              sorted in ascending order by these values, therefore the lines with the
              smallest(!) values are selected for triangulation. For example, if we want to
              match star lists, we might want to use only the brightest ones to generate the
              triangle sets. If the brightnesses of the stars are specified by their fluxes,
              we should not use the negative sign (the list should be sorted in descending
              order to select the first few lines as the brightest stars), and if the
              brightness is given as a magnitude, we have to use the negative sign.
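
       The effect of the sign of the ordering column can be sketched in Python. This is an
       illustration only: the function select_first is hypothetical, and each row is
       assumed to be a list of already split column strings.

```python
def select_first(rows, column, count):
    """Select `count` rows ordered by a signed, 1-based column index.

    Positive index: descending order, so the largest values come first.
    Negative index: ascending order, so the smallest values come first.
    Rows without a valid real number in that column are dropped entirely.
    """
    index = abs(column) - 1
    valid = []
    for row in rows:
        try:
            valid.append((float(row[index]), row))
        except (IndexError, ValueError):
            continue
    valid.sort(key=lambda item: item[0], reverse=(column > 0))
    return [row for _, row in valid[:count]]


if __name__ == "__main__":
    # Hypothetical star list: identifier, flux, magnitude.
    stars = [
        ["s1", "500.0", "13.25"],
        ["s2", "9000.0", "10.11"],
        ["s3", "120.0", "14.80"],
        ["s4", "n/a", "n/a"],
    ]
    print([row[0] for row in select_first(stars, 2, 2)])   # ordered by flux
    print([row[0] for row in select_first(stars, -3, 2)])  # ordered by magnitude
```

       Both calls select the same two brightest stars: a flux column is used without the
       negative sign, a magnitude column with it.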

       --fit iterations=<N>,firstrejection=<F>,sigma=<S>
              Like --triangulation, this switch is followed by some directives. These
              directives specify the number <N> of iterations ("iterations=<N>") for point
              matching. The "firstrejection" directive specifies the serial number <F> of the
              first iteration in which points farther than the <S> "sigma" level are excluded
              from the next iteration. Note that in practice this type of iteration is not
              really important (due to, for instance, the limitation of the outliers by the
              --max-distance switch); however, some suspicious users can be convinced by such
              arguments.

       --weight reference|input,column=<wi>,[magnitude],[power=<p>]
              These directives specify the weights which are used during the fit of the
              geometrical transformation. In practice, this is useful in the following
              situation. When matching star lists, the fainter stars are believed to have
              higher astrometric errors, therefore they should have a smaller influence on the
              fit. The weights can be taken from the reference (specify "reference") or from
              the input (specify "input"), from the column specified by the weight index <wi>.
              The weights can be derived from stellar magnitudes; if so, specify "magnitude" to
              convert the read values from magnitude to flux. The actual weight is then the
              "power"th power of the flux. The default value of "power" is 1; however, for the
              maximum-likelihood estimation of an assumed Gaussian distribution, the weights
              should be the second power of the fluxes.
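
       The weighting scheme can be sketched in Python. The magnitude-to-flux conversion
       used here is the standard astronomical relation flux = 10^(-0.4*m) with an arbitrary
       zero point; that the program uses exactly this relation is an assumption, and the
       function name weight is only illustrative.

```python
def weight(value, magnitude=False, power=1.0):
    """Convert a value read from the weight column to a fit weight.

    With magnitude=True the value is first converted to a relative flux
    (assumed relation: flux = 10 ** (-0.4 * m), arbitrary zero point).
    The weight is then the flux raised to `power`.
    """
    flux = 10.0 ** (-0.4 * value) if magnitude else value
    return flux ** power


if __name__ == "__main__":
    bright, faint = 10.0, 15.0  # magnitudes; the larger value is the fainter star
    for power in (1.0, 2.0):
        ratio = weight(bright, True, power) / weight(faint, True, power)
        print("power =", power, "weight ratio bright/faint =", ratio)
```

       Under this assumption a star 5 magnitudes fainter receives a weight about 100 times
       smaller with power=1, and about 10000 times smaller with power=2.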

       Some notes on unitarity. The unitarity of a geometrical transformation measures how much
       it differs from the closest transformation which is affine and a combination of
       dilation, rotation and shift. For such a transformation the unitarity is 0, and if the
       second-order terms in a transformation distort such a unitary transformation, the
       unitarity will have the same magnitude as this second-order effect. For example,
       mapping a part of a sphere with a size of d degrees has a unitarity of 1-cos(d).
       Therefore, for astrometric purposes, a reasonable value of the critical unitarity in
       "auto" triangulation mode can be estimated as 2 or 3 times 1-cos(d/2), where d is the
       size of the field in which astrometry should be performed.
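
       As a worked example of this estimate, the following Python sketch evaluates
       factor * (1-cos(d/2)) for a few field sizes, with d taken in degrees as in the text
       above. The function name critical_unitarity and the default factor of 2 are
       illustrative choices.

```python
import math


def critical_unitarity(field_size_deg, factor=2.0):
    """Estimate factor * (1 - cos(d/2)) for a field of d degrees."""
    half = math.radians(field_size_deg) / 2.0
    return factor * (1.0 - math.cos(half))


if __name__ == "__main__":
    for d in (1.0, 5.0, 10.0):
        low = critical_unitarity(d, factor=2.0)
        high = critical_unitarity(d, factor=3.0)
        print("d = %4.1f deg: unitarity between %.2e and %.2e" % (d, low, high))
```

       For a field of 10 degrees the estimate with a factor of 2 is roughly 0.0076; a value
       in this range could then be passed in the "unitarity=U" directive of --triangulation.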

REPORTING BUGS

       Report bugs to <apal@szofi.net>, see also http://fitsh.net/.

COPYRIGHT

       Copyright © 1996, 2002, 2004-2008, 2010-2015; Pal, Andras <apal@szofi.net>