Provided by: gmt-common_5.2.1+dfsg-3build1_all

NAME

       gmtregress - Linear regression of 1-D data sets

SYNOPSIS

       gmtregress [ table ] [ -Amin/max/inc ] [ -Clevel ] [ -Ex|y|o|r ] [ -Fflags ] [ -N1|2|r|w ] [
       -S[r] ] [ -Tmin/max/inc | -Tn ] [ -W[w][x][y][r] ] [ -V[level] ] [ -a<flags> ] [ -b<binary> ]
       [ -g<gaps> ] [ -h<headers> ] [ -i<flags> ] [ -o<flags> ]

       Note: No space is allowed between the option flag and the associated arguments.

DESCRIPTION

       gmtregress  reads  one  or  more  data  tables  [or  stdin] and determines the best linear
       regression model y = a + b * x for each segment using the chosen parameters.  The user may
       specify which data and model components should be reported.  By default, the model will be
       evaluated at the input points, but alternatively you can specify an equidistant range over
       which  to  evaluate  the model, or turn off evaluation completely.  Instead of determining
       the best fit we can perform a scan of all possible regression lines (for a range of  slope
       angles) and examine how the chosen misfit measure varies with slope.  This is particularly
       useful when analyzing data with many outliers.  Note: If you actually need  to  work  with
       log10 of x or y you can accomplish that transformation during read by using the -i option.

REQUIRED ARGUMENTS

       None

OPTIONAL ARGUMENTS

       table  One  or  more  ASCII (or binary, see -bi[ncols][type]) data table file(s) holding a
              number of data columns. If no tables are given then we read  from  standard  input.
              The first two columns are expected to contain the required x and y data.  Depending
              on your -W and -E settings we may expect  an  additional  1-3  columns  with  error
              estimates of one or both of the data coordinates, and even their correlation.

       -Amin/max/inc
              Instead  of  determining  a  best-fit  regression  we  explore  the  full  range of
              regressions.  Examine all possible regression lines with slope angles  between  min
              and  max,  using  steps  of  inc  degrees  [-90/+90/1].  For each slope the optimum
              intercept is determined based on your regression type (-E)  and  misfit  norm  (-N)
              settings.   For each segment we report the four columns angle, E, slope, intercept,
              for the range of specified angles. The best model parameters within this range  are
              written into the segment header and reported in verbose mode (-V).
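
              The scan can be pictured with a minimal Python sketch. It assumes the default
              -Ey misfit and -N2 norm, for which the optimum intercept at a fixed slope is the
              mean of y - slope * x. The function name scan_angles and its arguments are
              illustrative and not part of GMT, and the vertical lines at -90 and +90 degrees
              are left out because their slope is infinite.

```python
import math

def scan_angles(x, y, amin=-89, amax=89, inc=1):
    """Return (angle, E, slope, intercept) rows for each trial angle in degrees."""
    rows = []
    n = len(x)
    steps = int(round((amax - amin) / inc))
    for k in range(steps + 1):
        angle = amin + k * inc
        slope = math.tan(math.radians(angle))
        # For vertical misfit and an L-2 norm the best intercept is the mean offset.
        intercept = sum(yi - slope * xi for xi, yi in zip(x, y)) / n
        # E is the mean of the squared vertical residuals.
        misfit = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / n
        rows.append((angle, misfit, slope, intercept))
    return rows

x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 4.0]          # made-up points on y = 1 + x
rows = scan_angles(x, y)
best = min(rows, key=lambda row: row[1])   # row with the smallest misfit E
```

              Each row mirrors the four output columns angle, E, slope, intercept. Other -E
              and -N choices would need a different intercept search than the mean used here.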

       -Clevel
              Set  the  confidence level (in %) to use for the optional calculation of confidence
              bands on the regression [95].  This is only used if -F includes the  output  column
              c.

       -Ex|y|o|r
              Type  of  linear  regression,  i.e., select the type of misfit we should calculate.
              Choose from x (regress x on y; i.e., the misfit is measured horizontally from  data
              point  to  regression  line),  y  (regress  y  on  x;  i.e., the misfit is measured
              vertically [Default]), o (orthogonal regression; i.e., the misfit is measured  from
              data  point  orthogonally  to  nearest point on the line), or r (Reduced Major Axis
              regression; i.e., the misfit  is  the  product  of  both  vertical  and  horizontal
              misfits) [y].
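
              As a rough illustration of the four choices, the Python sketch below measures
              one point against a line y = intercept + slope * x. The function name and the
              returned keys are hypothetical, and the r entry simply follows the wording
              above (product of the horizontal and vertical misfits); gmtregress may scale
              or sign these quantities differently.

```python
import math

def misfits(x, y, slope, intercept):
    """Misfit of one point (x, y) from the line y = intercept + slope * x."""
    ey = y - (intercept + slope * x)          # vertical distance (-Ey)
    ex = x - (y - intercept) / slope          # horizontal distance (-Ex); needs slope != 0
    eo = ey / math.sqrt(1.0 + slope * slope)  # orthogonal distance (-Eo)
    er = abs(ex * ey)                         # product of the two (-Er), per the text above
    return {"x": ex, "y": ey, "o": eo, "r": er}

# Made-up point that sits 2 units above the line y = 1 + x.
m = misfits(2.0, 5.0, slope=1.0, intercept=1.0)
```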

       -Fflags
              Append  a combination of the columns you wish returned; the output order will match
              the order specified.  Choose  from  x  (observed  x),  y  (observed  y),  m  (model
              prediction), r (residual = data minus model), c (symmetrical confidence interval on
              the regression; see -C for specifying the  level),  z  (standardized  residuals  or
              so-called z-scores) and w (outlier weights 0 or 1; for -Nw these are the Reweighted
              Least Squares weights) [xymrczw].  As an alternative to evaluating the model,  just
              give  -Fp  and  we  instead write a single record with the model parameters npoints
              xmean ymean angle misfit slope intercept sigma_slope sigma_intercept.
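
              When post-processing the -Fp record in a script, it can help to attach the
              field names listed above to the values. The following Python sketch assumes
              plain ASCII output with one whitespace-separated record and no segment header;
              the helper name parse_fp_record and the sample values are made up.

```python
FIELDS = ["npoints", "xmean", "ymean", "angle", "misfit",
          "slope", "intercept", "sigma_slope", "sigma_intercept"]

def parse_fp_record(line):
    """Map one whitespace-separated -Fp record onto the field names above."""
    values = [float(tok) for tok in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d columns, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))

# Made-up record for illustration only; not real gmtregress output.
record = parse_fp_record("4 1.5 2.5 45 0 1 1 0 0")
slope = record["slope"]        # column 5 when counting from 0, as selected by -o5
```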

       -N1|2|r|w
              Selects the norm to use for the misfit calculation.  Choose among 1  (L-1  measure;
              the  mean  of  the  absolute  residuals), 2 (Least-squares; the mean of the squared
              residuals), r (LMS; The  least  median  of  the  squared  residuals),  or  w  (RLS;
              Reweighted  Least  Squares:  the  mean  of  the  squared  residuals  after outliers
              identified via LMS have been removed) [Default is 2].  Traditional regression  uses
              L-2  while  L-1  and in particular LMS are more robust in how they handle outliers.
              As alluded to, RLS implies an initial LMS regression which is then used to identify
              outliers  in  the  data,  assign  these a zero weight, and then redo the regression
              using an L-2 norm.
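
              The three basic misfit measures can be written down in a few lines of Python.
              This is only a sketch of the definitions given above, applied to a list of
              residuals; the function names and the sample residuals are made up.

```python
import statistics

def misfit_l1(residuals):
    """-N1: mean of the absolute residuals."""
    return sum(abs(r) for r in residuals) / len(residuals)

def misfit_l2(residuals):
    """-N2: mean of the squared residuals."""
    return sum(r * r for r in residuals) / len(residuals)

def misfit_lms(residuals):
    """-Nr: median of the squared residuals."""
    return statistics.median(r * r for r in residuals)

residuals = [0.1, -0.2, 0.1, 0.0, 10.0]   # made-up values; the last is an outlier
e1 = misfit_l1(residuals)
e2 = misfit_l2(residuals)
er = misfit_lms(residuals)
```

              With these values the single outlier dominates the L-2 measure while leaving
              the median of the squared residuals almost untouched, which is why LMS copes
              better with outliers.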

       -S[r]  Restricts which records will be output.  By default all data records will be output
              in  the  format  specified  by  -F.   Use  -S  to exclude data points identified as
              outliers by the regression.  Alternatively, use -Sr to reverse this and only output
              the outlier records.

       -Tmin/max/inc | -Tn
              Evaluate  the  best-fit  regression  model at the equidistant points implied by the
              arguments.  If -Tn is given instead we will  reset  min  and  max  to  the  extreme
              x-values  for  each  segment  and  determine inc so that there are exactly n output
              values for each segment.  To skip the model  evaluation  entirely,  simply  provide
              -T0.
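
              For the -Tn form, the spacing follows from the extreme x-values and the
              requested count. A small Python sketch, assuming both end points are included
              and n is at least 2 (the function name is illustrative):

```python
def equidistant(xmin, xmax, n):
    """Return n evenly spaced values from xmin to xmax inclusive (n >= 2)."""
    inc = (xmax - xmin) / (n - 1)
    return [xmin + k * inc for k in range(n)]

xs = equidistant(0.0, 10.0, 5)
```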

       -W[w][x][y][r]
              Specifies  weighted  regression  and  which  weights will be provided.  Append x if
              giving  1-sigma  uncertainties  in  the  x-observations,  y   if   giving   1-sigma
              uncertainties  in  y, and r if giving correlations between x and y observations, in
              the order these columns appear in the input (after the two required and leading  x,
              y  columns).   Giving  both  x  and  y  (and  optionally  r)  implies an orthogonal
              regression, otherwise giving x  requires  -Ex  and  y  requires  -Ey.   We  convert
              uncertainties  in  x  and  y  to  regression  weights via the relationship weight =
              1/sigma.  Use -Ww if we should interpret the input columns to have precomputed
              weights  instead.   Note:  residuals  with  respect  to the regression line will be
              scaled by the given weights.  Most norms will then square  this  weighted  residual
              (-N1 is the only exception).
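
              To make the weighting concrete, the Python sketch below fits y = a + b * x for
              the -Wy case with an L-2 norm: each residual is scaled by weight = 1/sigma and
              the scaled residual is then squared. It uses the textbook weighted
              least-squares formulas, not code taken from gmtregress, and the names and data
              are made up.

```python
def weighted_fit(x, y, sigma_y):
    """Fit y = a + b * x by minimizing the sum of (w * residual)**2 with w = 1/sigma."""
    w2 = [(1.0 / s) ** 2 for s in sigma_y]     # squared weights enter the L-2 sums
    s0 = sum(w2)
    sx = sum(w * xi for w, xi in zip(w2, x))
    sy = sum(w * yi for w, yi in zip(w2, y))
    sxx = sum(w * xi * xi for w, xi in zip(w2, x))
    sxy = sum(w * xi * yi for w, xi, yi in zip(w2, x, y))
    b = (s0 * sxy - sx * sy) / (s0 * sxx - sx * sx)
    a = (sy - b * sx) / s0
    return a, b

x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 9.0]            # made-up data; the last point is off the line
sigma = [0.1, 0.1, 0.1, 100.0]      # large 1-sigma uncertainty, hence a tiny weight
a, b = weighted_fit(x, y, sigma)
```

              Because the last point carries a very small weight, the fitted line stays
              close to the one defined by the first three points.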

       -V[level]
              Select verbosity level [c].

       -acol=name[...]
              Set aspatial column associations col=name.

       -bi[ncols][type]
              Select native binary input.

       -bo[ncols][type]
              Select native binary output. [Default is same as input].

       -g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u]
              Determine data gaps and line breaks.

       -h[i|o][n][+c][+d][+rremark][+ttitle]
              Skip or produce header record(s).

       -icols[l][sscale][ooffset][,...]
              Select input columns (0 is first column).

       -ocols[,...]
              Select output columns (0 is first column).

       -^ or just -
              Print a short message about the syntax of the command, then exits (NOTE: on Windows
              use just -).

       -+ or just +
              Print  an  extensive  usage  (help)  message,  including  the  explanation  of  any
              module-specific option (but not the GMT common options), then exits.

       -? or no arguments
              Print  a  complete usage (help) message, including the explanation of options, then
              exits.

       --version
              Print GMT version and exit.

       --show-datadir
              Print full path to GMT share directory and exit.

ASCII FORMAT PRECISION

       The ASCII output formats of numerical data are controlled by parameters in  your  gmt.conf
       file.  Longitude  and  latitude  are  formatted according to FORMAT_GEO_OUT, whereas other
       values are formatted according to FORMAT_FLOAT_OUT. Be aware that the format in effect can
       lead to loss of precision in the output, which can lead to various problems downstream. If
       you find the output is not written with enough precision,  consider  switching  to  binary
       output (-bo if available) or specify more decimals using the FORMAT_FLOAT_OUT setting.
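
       The effect of the format on precision can be previewed outside GMT. The Python sketch
       below assumes that FORMAT_FLOAT_OUT holds a C printf-style format and applies two such
       formats to a made-up number; the specific formats are chosen only for illustration.

```python
value = 1234.56789                 # made-up number

short = "%.4g" % value             # four significant digits
longer = "%.12g" % value           # twelve significant digits

lost = float(short) != value       # True when the short format dropped information
kept = float(longer) == value      # True when the long format preserved the value
```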

EXAMPLES

       To  do  a standard least-squares regression on the x-y data in points.txt and return x, y,
       and model prediction with 99% confidence intervals, try

              gmt regress points.txt -Fxymc -C99 > points_regressed.txt

       To just get the slope for the above regression, try

              slope=`gmt regress points.txt -Fp -o5`

       To do a reweighted least-squares regression on the data rough.txt and return x,  y,  model
       prediction and the RLS weights, try

              gmt regress rough.txt -Nw -Fxymw > points_regressed.txt

       To  do  an  orthogonal  least-squares  regression on the data crazy.txt but first take the
       logarithm of both x and y, then return x, y, model prediction and the normalized residuals
       (z-scores), try

              gmt regress crazy.txt -Eo -Fxymz -i0-1l > points_regressed.txt

       To examine how the orthogonal LMS misfits vary with angle between 0 and 90 in steps of 0.2
       degrees for the data in points.txt, try

              gmt regress points.txt -A0/90/0.2 -Eo -Nr > points_analysis.txt

REFERENCES

       Draper, N. R., and H. Smith, 1998, Applied regression analysis, 3rd  ed.,  736  pp.,  John
       Wiley and Sons, New York.

       Rousseeuw, P. J., and A. M. Leroy, 1987, Robust regression and outlier detection, 329 pp.,
       John Wiley and Sons, New York.

       York, D., N. M. Evensen, M. L. Martinez, and J. De Basabe Delgado, 2004, Unified equations
       for  the  slope,  intercept,  and standard errors of the best straight line, Am. J. Phys.,
       72(3), 367-375.

SEE ALSO

       gmt, trend1d, trend2d

COPYRIGHT

       2015, P. Wessel, W. H. F. Smith, R. Scharroo, J. Luis, and F. Wobbe