Provided by: fitsh_0.9.2-1_amd64

NAME

       lfit - general purpose evaluation and regression analysis tool

SYNOPSIS

       lfit [method of analysis] [options] <input> [-o, --output <output>]

DESCRIPTION

       The program `lfit` is a standalone command line tool designed for both interactive and batch-processed
       data analysis and regression. In principle, the program may run in two modes. First, `lfit` supports
       numerous regression analysis methods that can be used to search for the "best fit" parameters of model
       functions in order to model the input data (which are read from one or more input files in tabulated
       form). Second, `lfit` is capable of reading input data and performing various arithmetic operations as
       specified by the user. Basically, this second mode is used to evaluate the model functions with the
       parameters presumably derived by the actual regression methods (and in order to complete this
       evaluation, only slight changes are needed in the command line invocation arguments).
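
       As a sketch (the file name "data.dat", the column/variable names and the model functions below are
       placeholders, not part of the program), the two modes could be invoked as follows:

```shell
# Regression mode: fit the linear model a*x+b to the column y.
lfit -c x,y -v a,b -f "a*x+b" -y y data.dat -o result.dat

# Evaluation mode: -y is omitted, so the expressions given after -f are
# simply evaluated for each input line.
lfit -c x -f "2*x+1,x*x" data.dat -o eval.dat
```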

OPTIONS

   General options:
       -h, --help
               Gives a general summary of the command line options.

       --long-help, --help-long
              Gives a detailed list of command line options.

       --wiki-help, --help-wiki, --mediawiki-help, --help-mediawiki
              Gives a detailed list of command line options in Mediawiki format.

       --version, --version-short, --short-version
              Gives some version information about the program.

       --functions, --list-functions, --function-list
              Lists the available arithmetic operations and built-in functions supported by the program.

       --examples
              Prints some very basic examples for the program invocation.

   Common options for regression analysis:
       -v, --variable, --variables <list-of-variables>
               Comma-separated list of regression variables. In the case of non-linear regression analysis, all
               of these fit variables should be given initial values (specified as <name>=<value>); otherwise,
               the initial values are set to zero. Note that for some of the regression/analysis methods,
               additional parameters should be assigned to these fit/regression variables. See the section
               "Regression analysis methods" for additional details.

       -c, --column, --columns <independent>[:<column index>],...
               Comma-separated list of independent variable names as read from the subsequent columns of the
               primary input data file. If the independent variables are not in sequential order in the input
               file, the optional column indices should be defined for each variable by separating the column
               index with a colon after the name of the variable. In the case of multiple input files and data
               blocks, the user should assign the individual independent variables and the respective column
               names and definitions for each file (see later, Sec. "Multiple data blocks").

       -f, --function <model function>
              Model  function  of the analysis in a symbolic form. This expression for the model function should
              contain built-in arithmetic operators, built-in functions, user-defined macros (see -x,  --define)
              or  functions  provided  by the dynamically loaded external modules (see -d, --dynamic). The model
              function can depend on both the fit/regression variables (see -v, --variables) and the independent
              variables  read  from  the input file (see -c, --columns). In the case of multiple input files and
              data blocks, the user should assign the respective  model  functions  for  each  data  block  (see
               later). Note that some of the analysis methods expect the model function to be either
               differentiable or linear in the fit/regression variables. See "Regression analysis methods"
               later on for more details.

       -y, --dependent <dependent expression>
              The  dependent  variable  of  the regression analysis, in a form of an arithmetic expression. This
              expression for the dependent variable can depend only on the variables read from  the  input  file
              (see  -c,  --columns). In the case of multiple input files and data blocks, the user should assign
              the respective dependent expressions for each data block (see later).

       -o, --output <output file>
              Name of the output file into which the fit results (the values for the  fit/regression  variables)
              are written.

   Common options for function evaluation:
       -f, --function <function to evaluate>[...]
              List  of  functions  to  be  evaluated. More expressions can be specified by either separating the
              subsequent expressions by a comma or by specifying more -f,  --function  options  in  the  command
              line.

        Note that the two basic modes of `lfit` are distinguished only by the presence or absence of the -y,
        --dependent command line argument. In other words, there is no explicit command line argument that
        specifies the mode of `lfit`. If the -y, --dependent command line argument is omitted, `lfit` runs in
        function evaluation mode; otherwise, the program runs in regression analysis mode.

       -o, --output <output file>
              Name of the output file in which the results of the function evaluation are written.

   Regression analysis methods:
       -L, --clls, --linear
              The default mode of `lfit`, the classical linear least squares (CLLS) method. The model  functions
              specified  after  -f, --function are expected to be both differentiable and linear with respect to
              the fit/regression variables. Otherwise, `lfit`  detects  the  non-differentiable  and  non-linear
              property  of  the  model  function(s)  and  refuses  the  analysis.  In  this case, other types of
               regression analysis methods can be applied depending on one's needs, for instance the
               Levenberg-Marquardt algorithm (NLLM, see -N, --nllm) or the downhill simplex minimization
               (DHSX, see -D, --dhsx).

       -N, --nllm, --nonlinear
              This option implies a regression involving the nonlinear Levenberg-Marquardt  (NLLM)  minimization
              algorithm.  The model function(s) specified after -f, --function are expected to be differentiable
              with respect to the fit/regression variables. Otherwise,  `lfit`  detects  the  non-differentiable
               property and refuses the analysis. The Levenberg-Marquardt algorithm has some fine-tuning
               parameters; see the section "Fine-tuning of regression analysis methods" for more details on
               how these additional regression parameters can be set. Note that all of the fit/regression
               variables should have a proper initial value, defined in the command line argument -v,
               --variable (see also there).
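
               As a sketch (placeholder file and variable names; assuming exp() is among the built-in
               functions listed by --functions), a nonlinear fit could look like:

```shell
# NLLM fit: every regression variable carries an explicit initial value.
lfit -N -c x,y -v a=1.0,b=0.5 -f "a*exp(b*x)" -y y data.dat -o result.dat
```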

       -U, --lmnd
              Levenberg-Marquardt  minimization  with  numerical  partial  derivatives  (LMND). Same as the NLLM
               method, except that the partial derivatives of the model function(s) are calculated
              numerically.  Therefore,  the model function(s) may contain functions of which partial derivatives
              are not known in an analytic form. The  differences  used  in  the  computations  of  the  partial
              derivatives should be declared by the user, see also the command line option -q, --differences.

       -D, --dhsx, --downhill
              This  option  implies  a  regression  involving the nonlinear downhill simplex (DHSX) minimization
               algorithm. The user should specify the proper initial values and their uncertainties as
              <name>=<initial>:<uncertainty>,  unless  the  "fisher"  option  is  passed to the -P, --parameters
              command line argument (see later in the section "Fine-tuning of regression analysis methods").  In
              the first case, the initial size of the simplex is based on the uncertainties provided by the user
              while in the second case, the initial simplex is derived from the eigenvalues and eigenvectors  of
              the  Fisher  covariance matrix. Note that the model functions must be differentiable in the latter
              case.
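
               For instance (a sketch with placeholder names, assuming a built-in exp() function):

```shell
# DHSX: the initial size of the simplex is set by the given uncertainties.
lfit -D -c x,y -v a=1.0:0.1,b=0.5:0.05 -f "a*exp(b*x)" -y y data.dat -o result.dat

# Alternatively, derive the initial simplex from the Fisher covariance
# (the model function must then be differentiable):
lfit -D -P fisher -c x,y -v a=1.0,b=0.5 -f "a*exp(b*x)" -y y data.dat -o result.dat
```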

       -M, --mcmc
               This option implies the method of Markov Chain Monte-Carlo (MCMC). The model function(s) need
               not be differentiable. However, each of the fit/regression variables must have an initial
               assumption for its uncertainty, which must be specified via the command line argument -v,
               --variable. The user should specify the proper initial values and uncertainties of these as
               <name>=<initial>:<uncertainty>. In the actual implementation of `lfit`, each variable has
              an  uncorrelated Gaussian a priori distribution with the specified uncertainty. The MCMC algorithm
              has some fine-tune parameters, see the section "Fine-tuning of regression  analysis  methods"  for
              more details.

       -K, --mchi, --chi2
              With this option one can perform a "brute force" Chi^2 minimization by evaluating the value of the
              merit function of Chi^2 on a grid of the fit/regression variables. In this case the grid size  and
              resolution  must  be  specified in a specific form after the -v, --variable command line argument.
              Namely each of the fit/regression variables intended to be varied on a grid must have a format  of
              <name>=[<min>:<step>:<max>]  while  the other ones specified as <name>=<value> are kept fixed. The
              output of this analysis will be  a  series  of  lines  with  N+1  columns,  where  the  values  of
              fit/regression  variables  are  followed  by the value of the merit function. Note that all of the
              declared fit/regression variables are written to the output, including the ones  which  are  fixed
              (therefore the output is somewhat redundant).
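
               For instance (a sketch with placeholder names):

```shell
# Evaluate Chi^2 on a grid in "a" while keeping "b" fixed; each output
# line contains the values of a and b followed by the merit function.
lfit -K -c x,y -v "a=[0:0.01:2],b=0.5" -f "a*x+b" -y y data.dat -o chi2.dat
```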

       -E, --emce
              This  option  implies  the  method  of  "refitting  to synthetic data sets", or "error Monte-Carlo
               estimation" (EMCE). This method requires a primary minimization algorithm (which can be
              any  of  the  CLLS, NLLM or DHSX methods). First, the program searches the best fit values for the
              fit/regression variables involving the assigned primary minimization algorithm and  reports  these
              best fit variables. Then, additional synthetic data sets are generated around this set of best fit
              variables and the minimization is repeated involving the same primary method. The  synthetic  data
              sets are generated independently for each input data block, taking into account the fit residuals.
              The noise added to the best fit data is generated from the power spectrum of the residuals.

       -X, --xmmc
              This option implies an improved/extended version of the Markov Chain Monte-Carlo analysis  (XMMC).
              The  major  differences  between  the  classic  MCMC  and  XMMC  methods are the following. 1/ The
              transition distribution is derived from the Fisher covariance matrix. 2/ The program  performs  an
              initial  minimization  of  the merit function involving the method of downhill simplex. 3/ Various
              sanity checks are performed in order to verify the convergence of the Markov chains (including the
              comparison  of  the  actual  and  theoretical  transition  probabilities,  the  computation of the
              autocorrelation lengths  of  each  fit/regression  variable  series  and  the  comparison  of  the
              statistical and Fisher covariance).

       -A, --fima
              Fisher  information  matrix  analysis  (FIMA).  With  this  analysis  method  one can estimate the
              uncertainties and correlations of the fit/regression variables  involving  the  method  of  Fisher
              matrix analysis. This method does not minimize the merit functions by adjusting the fit/regression
              variables, instead, the initial values (specified after the -v, --variables option)  are  expected
              to be the "best fit" ones.

   Fine-tuning of regression analysis methods:
       -e, --error <error expression>
              Expression  for  the  uncertainties.  Note that zero or negative uncertainty is equivalent to zero
              weight, i.e. input lines with zero or negative errors are discarded from the fit.

       -w, --weight <weight expression>
              Expression for the weights. The weight is simply the reciprocal of the  uncertainty.  The  default
              error/uncertainty  (and  therefore the weight) is unity. Note that most of the analysis/regression
              methods are rather sensitive to the uncertainties since the merit function also depends on these.

       -P, --parameters <regression parameters>
               This option is followed by a set of optional fine-tune parameters, which differs for each
              primary regression analysis method:

       default, defaults
              Use the default fine-tune parameters for the given regression method.

       clls, linear
              Use  the  classic  linear  least  squares method as the primary minimization algorithm of the EMCE
              method. Like in the case of the CLLS regression analysis (see -L, --clls), the  model  function(s)
              must be both differentiable and linear with respect to the fit/regression variables.

       nllm, nonlinear
              Use  the  non-linear  Levenberg-Marquardt  minimization  algorithm  as  the  primary  minimization
              algorithm of the EMCE method. Like in the case of the NLLM regression analysis (see  -N,  --nllm),
              the model function(s) must be differentiable with respect to the fit/regression variables.

       lmnd   Use  the  non-linear  Levenberg-Marquardt  minimization  algorithm  as  the  primary  minimization
              algorithm of the EMCE method. Like in the case of -U, --lmnd  regression  method,  the  parametric
              derivatives  of  the  model  function(s) are calculated by a numerical approximation (see also -U,
              --lmnd and -q, --differences for additional details).

       dhsx, downhill
              Use the downhill simplex (DHSX) minimization as the primary minimization  algorithm  of  the  EMCE
              method.  Unless  the additional 'fisher' option is specified directly, like in the default case of
              the DHSX regression method, the user  should  specify  the  uncertainties  of  the  fit/regression
              variables that are used as an initial size of the simplex.

       mc, montecarlo
              Use a primitive Monte-Carlo diffusion minimization technique as the primary minimization algorithm
              of the EMCE method. The user should specify the  uncertainties  of  the  fit/regression  variables
              which  are  then used to generate the Monte-Carlo transitions. This primary minimization technique
              is rather nasty (very slow), so its usage is not recommended.

       fisher In the case of the DHSX regression method or in the case of  the  EMCE  method  when  the  primary
              minimization  is  the  downhill simplex algorithm, the initial size of the simplex is derived from
              the Fisher covariance approximation evaluated at the point represented by the  initial  values  of
              the fit/regression variables. Since the derivation of the Fisher covariance requires the knowledge
              of the partial derivatives of the model function(s) with respect to the fit/regression  variables,
               the(se) model function(s) must be differentiable. On the other hand, the user does not have to
               specify the initial uncertainties after the -v, --variables option since these uncertainties
               are derived automatically from the Fisher covariance.

        skip   In the case of the EMCE and XMMC methods, the initial minimization is skipped.

       lambda=<value>
              Initial value for the "lambda" parameter of the Levenberg-Marquardt algorithm.

       multiply=<value>
              Value of the "lambda multiplicator" parameter of the Levenberg-Marquardt algorithm.

       iterations=<max.iterations>
              Number of iterations during the Levenberg-Marquardt algorithm.

       accepted
              Count the accepted transitions in the MCMC and XMMC methods (default).

       nonaccepted
              Count the total (accepted plus non-accepted) transitions in the MCMC and XMMC methods.

       gibbs  Use the Gibbs sampler in the MCMC method.

       adaptive
              Use  the  adaptive  XMMC  algorithm (i.e. the Fisher covariance is re-computed after each accepted
              transition).

       window=<window size>
              Window  size  for  calculating  the  autocorrelation  lengths  for  the   Markov   chains   (these
              autocorrelation  lengths  are  reported only in the case of XMMC method). The default value is 20,
               which is fine in most cases since the typical autocorrelation lengths are between 1 and 2 for
              nice convergent chains.

       -q, --difference <variablename>=<difference>[,...]
              The analysis method of LMND (Levenberg-Marquardt minimization using numerical derivatives, see -U,
              --lmnd) requires the differences that are used during the computations of the partial  derivatives
              of the model function(s). With this option, one can specify these differences.
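
               For instance (a sketch with placeholder names, assuming a built-in exp() function):

```shell
# LMND: numerical partial derivatives, using the differences given after -q.
lfit -U -c x,y -v a=1.0,b=0.5 -q a=0.001,b=0.001 \
     -f "a*exp(b*x)" -y y data.dat -o result.dat
```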

       -k, --separate <variablename>[,...]
              In  the  case  of  non-linear  regression  methods (for instance, DHSX or XMMC) the fit/regression
              variables in which the model functions are linear can be separated from  the  nonlinear  part  and
              therefore  make  the  minimization process more robust and reliable. Since the set of variables in
              which the model functions are linear  is  ambiguous,  the  user  should  explicitly  specify  this
              supposedly   linear   subset   of   regression   variables.  (For  instance,  the  model  function
              "a*b*x+a*cos(x)+b*sin(x)+c*x^2" is linear in both "(a,c)" and "(b,c)" parameter vectors but it  is
              non-linear  in "(a,b,c)".) The program checks whether the specified subset of regression variables
              is a linear subset and reports a warning  if  not.  Note  that  the  subset  of  separated  linear
              variables  (defined  here)  and  the  subset  of  the  fit/regression variables affected by linear
              constraints (see also section "Constraints") must be disjoint.
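
               Using the model function quoted above (a sketch with placeholder names, assuming built-in
               sin() and cos() functions):

```shell
# The model is linear in (b,c) for any fixed "a", so these two variables
# can be separated from the nonlinear (DHSX) minimization:
lfit -D -c x,y -v "a=1.0:0.1,b=0.5:0.05,c=0.1:0.01" -k b,c \
     -f "a*b*x+a*cos(x)+b*sin(x)+c*x^2" -y y data.dat -o result.dat
```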

       --perturbations <noise level>, --perturbations <key>=<noise level>[,...]
               Additional white noise to be added to each EMCE synthetic data set. Each data block (referred
               to here by the appropriate data block keys, see also section "Multiple data blocks") may have
              different white noise levels. If there is only one data  block,  this  command  line  argument  is
              followed only by a single number specifying the white noise level.

   Additional parameters for Monte-Carlo analysis:
       -s, --seed <random seed>
              Seed  for  the  random  number  generator.  By default this seed is 0, thus all of the Monte-Carlo
              regression analyses (EMCE, MCMC, XMMC and the optional generator for  the  FIMA  method)  generate
              reproducible parameter distributions. A positive value after this option yields alternative random
              seeds while all negative values result in an automatic random seed (derived from various available
               sources, such as /dev/[u]random, system time, hardware MAC address and so on), therefore
              distributions generated involving this kind of automatic random seed are not reproducible.

       -i, --[mcmc,emce,xmmc,fima]-iterations <iterations>
              The actual number of Monte-Carlo iterations for the MCMC, EMCE, XMMC  methods.  Additionally,  the
               FIMA method is capable of generating a mock Gaussian distribution of the parameters with the
               same
              covariance as derived by the Fisher analysis. The number of points in this  mock  distribution  is
              also specified by this command line option.

   Clipping outlier data points:
       -r, --sigma, --rejection-level <level>
              Rejection level in the units of standard deviations.

       -n, --iterations <number of iterations>
              Maximum  number  of iterations in the outlier clipping cycles. The actual number of outlier points
              can be traced by increasing the verbosity of the program (see -V, --verbose).

       --[no-]weighted-sigma
               During the derivation of the standard deviation, the contribution of the data points
              can  be weighted by the respective weights/error bars (see also -w, --weight or -e, --error in the
              section "Fine-tuning of regression analysis methods"). If no weights/error bars are associated  to
               the data points (i.e. both the -w, --weight and -e, --error options are omitted), this option will have
              no practical effect.

       Note that in the actual version of `lfit`, only the CLLS, NLLM and LMND regression  methods  support  the
       above discussed way of outlier clipping.

   Multiple data blocks:
       -i<key> <input file name>
              Input file name for the data block named as <key>.

       -c<key> <independent>[:<column index>],...
              Column definitions (see also -c, --columns) for the given data block named as <key>.

       -f<key> <model function>
              Expression for the model function assigned to the data block named as <key>.

       -y<key> <dependent expression>
              Expression of the dependent variable for the data block named as <key>.

       -e<key> <errors>
              Expression of the uncertainties for the data block named as <key>.

       -w<key> <weights>
              Expression  of  the  weights  for the data block named as <key>. Note that like in the case of -e,
              --errors and -w, --weights, only one of the -e<key>, -w<key> arguments should be specified.
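
       For instance, a simultaneous fit on two data blocks could be sketched as follows (the keys "1" and
       "2", the file names and the models are placeholders):

```shell
# Two data blocks sharing the fit variables a and b; both models are
# linear in (a,b), so the default CLLS method is applicable.
lfit -v a,b \
     -i1 first.dat  -c1 x,y -f1 "a*x+b"   -y1 y \
     -i2 second.dat -c2 t,z -f2 "a*t^2+b" -y2 z \
     -o result.dat
```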

   Constraints:
       -t, --constraint, --constraints <expression>{=<>}<expression>[,...]
              List of fit  and  domain  constraints  between  the  regression  variables.  Each  fit  constraint
              expression must be linear in the fit/regression variables. The program checks the linearity of the
               fit constraints and reports an error if any of the constraints are non-linear. A domain
               constraint can be any expression involving an arbitrary binary arithmetic relation (such as
               strictly greater than: '>', strictly less than: '<', greater than or equal to: '>=' and less
               than or equal to: '<=').
              Constraints can be specified either by a comma-separated list after a single command line argument
              of -t, --constraints or by multiple of these command line arguments.

       -v, --variable <name>:=<value>
              Another form of specifying constraints. The variable specifications after -v, --variable can  also
              be used to define constraints by writing ":=" instead of "=" between the variable name and initial
              value. Thus, -v <name>:=<value> is equivalent to -v <name>=<value> -t <name>=<value>.
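
               For instance (a sketch with placeholder names):

```shell
# Force the fitted coefficients to sum to unity (a linear fit constraint):
lfit -c x,y -v a,b,c -f "a*x^2+b*x+c" -y y -t a+b+c=1 data.dat -o result.dat

# Equivalent ways of keeping "c" fixed at 0.25 via a constraint:
#   -v c=0.25 -t c=0.25
#   -v c:=0.25
```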

   User-defined functions:
       -x, --define, --macro <name>(<parameters>)=<definition expression>
              With this option, the user can define additional functions (also called macros) on the top of  the
               built-in functions and operators, dynamically loaded functions and previously defined macros.
              Note that each such user-defined function must be stand-alone, i.e. external  variables  (such  as
              fit/regression  variables  and independent variables) cannot be part of the definition expression,
              only the parameters of these functions.
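
               For instance (a sketch with placeholder names, assuming a built-in exp() function):

```shell
# Define a Gaussian-shaped macro; note that it depends only on its own
# parameters, as required for user-defined functions.
lfit -x "gauss(x0,s,x)=exp(-(x-x0)^2/(2*s^2))" \
     -N -c x,y -v A=1.0,x0=0.0,s=1.0 -f "A*gauss(x0,s,x)" -y y \
     data.dat -o result.dat
```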

   Dynamically loaded extensions and functions:
       -d, --dynamic <library>:<array>[,...]
              Load the dynamically linked  library  (shared  object)  named  <library>  and  import  the  global
              `lfit`-compatible  set of functions defined in the arrays specified after the name of the library.
               The arrays must be declared with the type 'lfitfunction', as defined in the file
              "lfit.h". Each record in this array contains information about a certain imported function, namely
              the actual name of this function, flags specifying whether the function is  differentiable  and/or
              linear  in its regression parameters, the number of regression variables and independent variables
               and the actual C subroutine that implements the evaluation of the function (and the optional
               computation of the partial derivatives). The module 'linear.c' (compiled into 'linear.so')
               provides a simple
              example that implements the "line(a,b,x)=a*x+b" function. This example function has two regression
              variables  ("a"  and  "b") and one independent variable ("x") and the function itself is linear in
              the regression variables.

   More on outputs:
       -z, --columns-output <column indices>
              Column indices where the results are written in evaluation mode. If this option  is  omitted,  the
               results of the function evaluation are written sequentially. Otherwise, the input file is written
              to the output and the appropriate columns (specified here) are replaced by the respective  results
              of  the  function  evaluation.  Thus,  although the default column order is sequential, there is a
              significant difference between omitting this option and specifying "-z 1,2,...,N".  In  the  first
              case,  the  output file contains only the results of the function evaluations, while in the latter
              case, the first N columns of the original file are replaced with the results.
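
               For instance (a sketch with placeholder names):

```shell
# Replace the third column of the input with the evaluated expression,
# keeping all other columns of "data.dat" intact:
lfit -c x,y -f "y-x" -z 3 data.dat -o replaced.dat
```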

       --errors, --error-line, --error-columns
              Print the uncertainties of the fit/regression variables.

       -F, --format <variable name>=<format>[,...]
               Format of the output in printf-style for each fit/regression variable (see printf(3)). The
               default format is %12.6g (6 significant figures).

       -F, --format <format>[,...]
               Format of the output in evaluation mode. The default format is %12.6g (6 significant figures).

       -C, --correlation-format <format>
              Format of the correlation matrix elements. The default format is %6.3f (3 significant figures).

       -g, --derived-variable[s] <variable name>=<expression>[,...]
               Some of the regression and analysis methods are capable of computing the uncertainties and
              correlations for derived regression variables. These additional (and  therefore  not  independent)
              variables  can  be  defined with this command line option. In the definition expression one should
              use only the fit/regression variables (as defined by the -v, --variables command  line  argument).
              The  output  format  of  these  variables  can  also be specified by the -F, --format command line
              argument.

       -u, --output-fitted <filename>
               Name of an output file into which those lines of the input are written that were involved in the
              final  regression.  This option is useful in the case of outlier clipping in order to see what was
              the actual subset of input data that was used in the fit (see also the -n,  --iterations  and  -r,
              --sigma options).

       -j, --output-rejected <filename>
               Name of an output file into which those lines of the input are written that were rejected from the
              final regression. This option is useful in the case of outlier clipping in order to see  what  was
              the  actual subset of input data where the dependent variable represented outlier points (see also
              the -n, --iterations and -r, --sigma options).

       -a, --output-all <filename>
              File containing the lines of the  input  file  that  were  involved  in  the  complete  regression
              analysis. This file is simply the original file, only the commented and empty lines are omitted.

       -p, --output-expression <filename>
              In  this  file the model function is written in which the fit/regression variables are replaced by
              their best-fit values.

       -l, --output-variables <filename>
              List of the names and values of the fit/regression variables in the same format as used after  the
              -v,  --variables  command  line  argument.  The  content  of  this file can therefore be passed to
              subsequent invocations of `lfit`.

       --delta
              Write the individual differences between the independent variables  and  the  evaluated  best  fit
              model  function values for each line in the output files specified by the -u, --output-fitted, -j,
              --output-rejected and -a, --output-all command line options.

       --delta-comment
              Same as --delta, but the differences are written as a comment (i.e. separated by a '##'  from  the
              original input lines).

       --residual
              Write  the  final  fit  residual to the output file (after the list of the best-fit values for the
              fit/regression variables).

REPORTING BUGS

       Report bugs to <apal@szofi.net>, see also http://fitsh.net/.

COPYRIGHT

       Copyright © 1996, 2002, 2004-2008, 2009; Pal, Andras <apal@szofi.net>