Ubuntu Manpage: v.surf.rst - Spatial approximation and topographic analysis from given point or isoline

NAME

       v.surf.rst   -  Spatial approximation and topographic analysis from given point or isoline
       data in vector format to floating  point  raster  format  using  regularized  spline  with
       tension.

KEYWORDS

       vector, interpolation

SYNOPSIS

       v.surf.rst
       v.surf.rst help
       v.surf.rst   [-zctd]   input=name    [layer=integer]     [where=sql_query]     [elev=name]
       [slope=name]   [aspect=name]   [pcurv=name]   [tcurv=name]   [mcurv=name]    [devi=string]
       [cvdev=name]     [treefile=name]     [overfile=name]    [maskmap=name]    [zcolumn=string]
       [tension=float]   [smooth=float]   [scolumn=string]    [segmax=integer]    [npmin=integer]
       [dmin=float]       [dmax=float]       [zmult=float]      [theta=float]      [scalex=float]
       [--overwrite]  [--verbose]  [--quiet]

   Flags:
       -z
           Use z-coordinates (3D vector only)

       -c
           Perform cross-validation procedure without raster approximation

       -t
           Use scale dependent tension

       -d
           Output partial derivatives instead of topographic parameters

       --overwrite
           Allow output files to overwrite existing files

       --verbose
           Verbose module output

       --quiet
           Quiet module output

   Parameters:
       input=name
           Name of input vector map

       layer=integer
           Layer number
           If set to 0, z coordinates are used. (3D vector only)
           Default: 1

       where=sql_query
           WHERE conditions of SQL statement without 'where' keyword
           Example: income = 10000

       elev=name
           Output surface raster map (elevation)

       slope=name
           Output slope raster map

       aspect=name
           Output aspect raster map

       pcurv=name
           Output profile curvature raster map

       tcurv=name
           Output tangential curvature raster map

       mcurv=name
           Output mean curvature raster map

       devi=string
           Output deviations vector point file

       cvdev=name
           Output cross-validation errors vector point file

       treefile=name
           Output vector map showing quadtree segmentation

       overfile=name
           Output vector map showing overlapping windows

       maskmap=name
           Name of the raster map used as mask

       zcolumn=string
           Name of the attribute column with values to be used for approximation (if layer>0)

       tension=float
           Tension parameter
           Default: 40.

       smooth=float
           Smoothing parameter

       scolumn=string
           Name of the attribute column with smoothing parameters

       segmax=integer
           Maximum number of points in a segment
           Default: 40

       npmin=integer
           Minimum number of points for approximation in a segment (>segmax)
           Default: 300

       dmin=float
           Minimum distance between points (to remove almost identical points)
           Default: 0.500000

       dmax=float
           Maximum distance between points on isoline (to insert additional points)
           Default: 2.500000

       zmult=float
           Conversion factor for values used for approximation
           Default: 1.0

       theta=float
           Anisotropy angle (in degrees counterclockwise from East)

       scalex=float
           Anisotropy scaling factor

DESCRIPTION

v.surf.rst
This program performs spatial approximation based on z-values (-z flag or layer=0
parameter) or attributes (zcolumn parameter) of point or isoline data given in a vector
map named input to grid cells in the output raster map elev representing a surface. As an
option, simultaneously with approximation, topographic parameters slope, aspect, profile
curvature (measured in the direction of the steepest slope), tangential curvature
(measured in the direction of a tangent to contour line) or mean curvature are computed
and saved as raster maps specified by the options slope, aspect, pcurv, tcurv, mcurv
respectively. If -d flag is set, the program outputs partial derivatives fx,fy,fxx,
fyy,fxy instead of slope, aspect, profile, tangential and mean curvatures respectively.

User can define a raster map named maskmap, which will be used as a mask. The
approximation is skipped for cells which have zero or NULL value in mask. NULL values will
be assigned to these cells in all output raster maps. Data points are checked for
identical points and points that are closer to each other than the given dmin are removed.
If sparsely digitized contours or isolines are used as input, additional points are
computed between each 2 points on a line if the distance between them is greater than
specified dmax. Parameter zmult allows user to rescale the values used for approximation
(useful e.g. for transformation of elevations given in feet to meters, so that the proper
values of slopes and curvatures can be computed).

Regularized spline with tension is used for the approximation. The tension parameter tunes
the character of the resulting surface from thin plate to membrane. Smoothing parameter
smooth controls the deviation between the given points and the resulting surface and it
can be very effective in smoothing noisy data while preserving the geometrical properties
of the surface. With the smoothing parameter set to zero (smooth=0) the resulting surface
passes exactly through the data points (spatial interpolation is performed). When
smoothing parameter is used, it is also possible to output a vector point file devi
containing deviations of the resulting surface from the given data.

If the number of given points is greater than segmax, segmented processing is used . The
region is split into quadtree-based rectangular segments, each having less than segmax
points and approximation is performed on each segment of the region. To ensure smooth
connection of segments the approximation function for each segment is computed using the
points in the given segment and the points in its neighborhood which are in the
rectangular window surrounding the given segment. The number of points taken for
approximation is controlled by npmin, the value of which must be larger than segmax. User
can choose to output vector maps treefile and overfile which represent the quad tree used
for segmentation and overlapping neighborhoods from which additional points for
approximation on each segment were taken.

Predictive error of surface approximation for given parameters can be computed using the
-c flag. A crossvalidation procedure is then performed using the data given in the vector
map input and the estimated predictive errors are stored in the vector point file cvdev.
When using this flag, no raster output files are computed. Anisotropic surfaces can be
interpolated by setting anisotropy angle theta and scaling factor scalex. The program
writes values of selected input and internally computed parameters to the history file of
raster map elev.

NOTES

v.surf.rst uses regularized spline with tension for approximation from vector data. The
module does not require input data with topology, therefore both level1 (no topology) and
level2 (with topology) vector point data are supported. Additional points are used for
approximation between each 2 points on a line if the distance between them is greater than
specified dmax. If dmax is small (less than cell size) the number of added data points can
be vary large and slow down approximation significantly. The implementation has a
segmentation procedure based on quadtrees which enhances the efficiency for large data
sets. Special color tables are created by the program for output raster maps.

Topographic parameters are computed directly from the approximation function so that the
important relationships between these parameters are preserved. The equations for
computation of these parameters and their interpretation is described in Mitasova and
Hofierka, 1993 or Neteler and Mitasova, 2004). Slopes and aspect are computed in degrees
(0-90 and 1-360 respectively). The aspect raster map has value 0 assigned to flat areas
(with slope less than 0.1%) and to singular points with undefined aspect. Aspect points
downslope and is 90 to the North, 180 to the West, 270 to the South and 360 to the East,
the values increase counterclockwise. Curvatures are positive for convex and negative for
concave areas. Singular points with undefined curvatures have assigned zero values.

Tension and smoothing allow user to tune the surface character. For most landscape scale
applications the default values should provide adequate results. The program gives
warning when significant overshoots appear in the resulting surface and higher tension or
smoothing should be used. To select parameters that will produce a surface with desired
properties, it is useful to know that the method is scale dependent and the tension works
as a rescaling parameter (high tension "increases the distances between the points" and
reduces the range of impact of each point, low tension "decreases the distance" and the
points influence each other over longer range). Surface with tension set too high behaves
like a membrane (rubber sheet stretched over the data points) with peak or pit ("crater")
in each given point and everywhere else the surface goes rapidly to trend. If digitized
contours are used as input data, high tension can cause artificial waves along contours.
Lower tension and higher smoothing is suggested for such a case.
Surface with tension set too low behaves like a stiff steel plate and overshoots can
appear in areas with rapid change of gradient and segmentation can be visible. Increase in
tension should solve the problems.
There are two options how tension can be applied in relation to dnorm (dnorm rescales the
coordinates depending on the average data density so that the size of segments with
segmax=40 points is around 1 - this ensures the numerical stability of the computation):

1. Default: the given tension is applied to normalized data (x/dnorm..), that means that
the distances are multiplied (rescaled) by tension/dnorm. If density of points is changed,
e.g., by using higher dmin, the dnorm changes and tension needs to be changed too to get
the same result. Because the tension is applied to normalized data its suitable value is
usually within the 10-100 range and does not depend on the actual scale (distances) of the
original data (which can be km for regional applications or cm for field experiments).
2. Flag -t : The given tension is applied to un-normalized data (rescaled tension =
tension*dnorm/1000 is applied to normalized data (x/dnorm) and therefore dnorm cancels
out) so here tension truly works as a rescaling parameter. For regional applications with
distances between points in km. the suitable tension can be 500 or higher, for detailed
field scale analysis it can be 0.1. To help select how much the data need to be rescaled
the program writes dnorm and rescaled tension fi=tension*dnorm/1000 at the beginning of
the program run. This rescaled tension should be around 20-30. If it is lower or higher,
the given tension parameter should be changed accordingly.

The default is a recommended choice, however for the applications where the user needs to
change density of data and preserve the approximation character the -t flag can be
helpful.

Anisotropic data (e.g. geologic phenomena) can be interpolated using theta and scalex
defining orientation and ratio of the perpendicular axes put on the longest/shortest side
of the feature, respectively. Theta is measured in degrees from East, counterclockwise.
Scalex is a ratio of axes sizes. Setting scalex in the range 0-1, results in a pattern
prolonged in the direction defined by theta. Scalex value 0.5 means that modeled feature
is approximately 2 times longer in the direction of theta than in the perpendicular
direction. Scalex value 2 means that axes ratio is reverse resulting in a pattern
perpendicular to the previous example. Please note that anisotropy option has not been
extensively tested and may include bugs (for example , topographic parameters may not be
computed correctly) - if there are problems, please report to GRASS bugtracker (accessible
from http://grass.osgeo.org/).

For data with values changing over several magnitudes (sometimes the concentration or
density data) it is suggested to interpolate the log of the values rather than the
original ones.

The program checks the numerical stability of the algorithm by computing the values in
given points, and prints the root mean square deviation (rms) found into the history file
of raster map elev. For computation with smoothing set to 0. rms should be 0. Significant
increase in tension is suggested if the rms is unexpectedly high for this case. With
smoothing parameter greater than zero the surface will not pass exactly through the data
points and the higher the parameter the closer the surface will be to the trend. The rms
then represents a measure of smoothing effect on data. More detailed analysis of smoothing
effects can be performed using the output deviations option.

SQL support
Using the where parameter, the interpolation can be limited to use only a subset of the
input vectors.

Spearfish example (we simulate randomly distributed elevation measures):
g.region rast=elevation.10m -p
# random elevation extraction
r.random elevation.10m vector_output=elevrand n=200
v.info -c elevrand
v.db.select elevrand
# interpolation based on all points
v.surf.rst elevrand zcol=value elev=elev_full
r.colors elev_full rast=elevation.10m
d.rast elev_full
d.vect elevrand
# interpolation based on subset of points (only those over 1300m/asl)
v.surf.rst elevrand zcol=value elev=elev_partial where="value > 1300"
r.colors elev_partial rast=elevation.10m
d.rast elev_partial
d.vect elevrand where="value > 1300"

Cross validation procedure
The "optimal" approximation parameters for given data can be found using a cross-
validation (CV) procedure (-c flag). The CV procedure is based on removing one input data
point at a time, performing the approximation for the location of the removed point using
the remaining data points and calculating the difference between the actual and
approximated value for the removed data point. The procedure is repeated until every data
point has been, in turn, removed. This form of CV is also known as the "leave-one-out" or
"jack-knife" method (Hofierka et al., 2002; Hofierka, 2005). The differences (residuals)
are then stored in the cvdev output vector map. Please note that during the CV procedure
no other output files can be set, the approximation is performed only for locations
defined by input data. To find "optimal parameters", the CV procedure must be iteratively
performed for all reasonable combinations of the approximation parameters with small
incremental steps (e.g. tension, smoothing) in order to find a combination with minimal
statistical error (also called predictive error) defined by root mean squared error
(RMSE), mean absolute error (MAE) or other error characteristics. A script with loops for
tested RST parameters can do the job, necessary statistics can be calculated using e.g.
v.univar. It should be noted that crossvalidation is a time-consuming procedure, usually
reasonable for up to several thousands of points. For larger data sets, CV should be
applied to a representative subset of the data. The cross-validation procedure works well
only for well-sampled phenomena and when minimizing the predictive error is the goal. The
parameters found by minimizing the predictive (CV) error may not not be the best for for
poorly sampled phenomena (result could be strongly smoothed with lost details and
fluctuations) or when significant noise is present that needs to be smoothed out.

The program writes the values of parameters used in computation into the comment part of
history file elev as well as the following values which help to evaluate the results and
choose the suitable parameters: minimum and maximum z values in the data file (zmin_data,
zmax_data) and in the interpolated raster map (zmin_int, zmax_int), rescaling parameter
used for normalization (dnorm), which influences the tension.

If visible connection of segments appears, the program should be rerun with higher npmin
to get more points from the neighborhood of given segment and/or with higher tension.

When the number of points in a vector map is not too large (less than 800), the user can
skip segmentation by setting segmax to the number of data points or segmax=700.

The program gives warning when user wants to interpolate outside the rectangle given by
minimum and maximum coordinates in the vector map, zoom into the area where the given data
are is suggested in this case.

When a mask is used, the program takes all points in the given region for approximation,
including those in the area which is masked out, to ensure proper approximation along the
border of the mask. It therefore does not mask out the data points, if this is desirable,
it must be done outside v.surf.rst.

For examples of applications see GRASS4 implementation and GRASS5 and GRASS6
implementation.

The user must run g.region before the program to set the region and resolution for
approximation.

AUTHORS

       Original version of program (in FORTRAN) and GRASS enhancements:
       Lubos Mitas, NCSA, University of Illinois at Urbana Champaign, Illinois, USA  (1990-2000);
       Department of Physics, North Carolina State University, Raleigh
       Helena  Mitasova,  USA  CERL,  Department  of Geography, University of Illinois at Urbana-
       Champaign, USA (1990-2001); MEAS, North Carolina State University, Raleigh

       Modified program (translated to C, adapted for GRASS, new segmentation procedure):
       Irina Kosinovsky, US Army CERL, Dave Gerdes, US Army CERL

       Modifications for new sites format and timestamping:
       Darrel McCauley, Purdue University, Bill Brown, US Army CERL

       Update for GRASS5.7, GRASS6 and addition of crossvalidation: Jaroslav Hofierka, University
       of Presov; Radim Blazek, ITC-irst

REFERENCES

       Mitasova,  H.,  Mitas,  L.  and  Harmon, R.S., 2005, Simultaneous spline approximation and
       topographic analysis for lidar elevation data in open source GIS, IEEE GRSL  2  (4),  375-
       379.

       Hofierka,  J.,  2005,  Interpolation  of  Radioactivity Data Using Regularized Spline with
       Tension. Applied GIS, Vol. 1, No. 2, pp. 16-01 to 16-13. DOI: 10.2104/ag050016

       Hofierka J., Parajka J.,  Mitasova H.,  Mitas  L.,  2002,  Multivariate  Interpolation  of
       Precipitation  Using  Regularized  Spline  with  Tension.   Transactions  in GIS 6(2), pp.
       135-150.

       H. Mitasova, L. Mitas, B.M. Brown, D.P. Gerdes, I. Kosinovsky,  1995,  Modeling  spatially
       and  temporally  distributed phenomena: New methods and tools for GRASS GIS. International
       Journal of GIS, 9 (4), special  issue  on  Integrating  GIS  and  Environmental  modeling,
       433-446.

       Mitasova,  H.  and  Mitas,  L., 1993: Interpolation by Regularized Spline with Tension: I.
       Theory and Implementation, Mathematical Geology ,25, 641-655.

       Mitasova, H. and Hofierka, J., 1993: Interpolation by Regularized Spline with Tension: II.
       Application  to  Terrain  Modeling and Surface Geometry Analysis, Mathematical Geology 25,
       657-667.

       Mitas, L., and Mitasova H., 1988,   General  variational  approach  to  the  approximation
       problem, Computers and Mathematics with Applications, v.16, p. 983-992.

       Neteler,  M.  and  Mitasova, H., 2008, Open Source GIS: A GRASS GIS Approach, 3rd Edition,
       Springer, New York, 406 pages.

       Talmi, A. and Gilat, G., 1977 : Method  for  Smooth  Approximation  of  Data,  Journal  of
       Computational Physics, 23, p.93-123.

       Wahba,  G.,  1990,  :  Spline  Models for Observational Data, CNMS-NSF Regional Conference
       series in applied mathematics, 59, SIAM, Philadelphia, Pennsylvania.

       Last changed: $Date: 2011-11-08 03:29:50 -0800 (Tue, 08 Nov 2011) $

       Full index

       © 2003-2013 GRASS Development Team