Provided by: pktools_2.6.6-1_amd64 

NAME
pkextract - extract pixel values from a raster image based on a (vector or raster) sample
SYNOPSIS
pkextract -i input [-s sample | -rand number | -grid size] -o output [options] [advanced options]
DESCRIPTION
pkextract extracts pixel values from an input raster dataset, based on the locations you provide via a
sample file. Alternatively, a random sample or systematic grid of points can also be extracted. The
sample can be a vector file with points or polygons. In the case of polygons, you can either extract the
values for all raster pixels that are covered by the polygons, or extract a single value for each polygon
such as the centroid, mean, median, etc. As output, a new copy of the vector file is created with an
extra attribute for the extracted pixel value. For each raster band in the input image, a separate
attribute is created. For instance, if the raster dataset contains three bands, three attributes are
created (b0, b1 and b2).
Instead of a vector dataset, the sample can also be a raster dataset with categorical values. The
typical use case is a land cover map that overlaps the input raster dataset. The utility then extracts
pixels from the input raster for the respective land cover classes. To select a random subset of the
sample raster dataset you can set the threshold option -t with a percentage value.
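The raster sample case can be pictured with a short Python sketch. It is an illustration only, not pkextract's implementation: the function extract_by_class, the attribute names "label" and "b", and the assumption that both rasters share the same pixel grid are made up for this example (the real utility works on georeferenced, overlapping datasets).

```python
def extract_by_class(sample, image, classes):
    """Collect input pixel values wherever the sample raster has a wanted class.

    sample  -- 2D list of class labels (e.g. a land cover map)
    image   -- 2D list of pixel values of one band, same shape as sample
    classes -- class labels to extract (stands in for option -c)
    """
    extracted = []
    for row, (sample_row, image_row) in enumerate(zip(sample, image)):
        for col, (label, value) in enumerate(zip(sample_row, image_row)):
            if label in classes:
                # One output point per matching pixel, with its label and value.
                extracted.append({"row": row, "col": col, "label": label, "b": value})
    return extracted


if __name__ == "__main__":
    landcover = [[1, 0], [2, 3]]
    band = [[10, 20], [30, 40]]
    print(extract_by_class(landcover, band, {1, 2}))
```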
A typical usage of pkextract is to prepare a training sample for one of the classifiers implemented in
pktools.
Overview of the possible extraction rules:
Extraction rule   Output features
point             Extract all pixel values covered by the polygon (option -polygon not set),
                  or extract a single pixel on the polygon surface (option -polygon set).
centroid          Extract the pixel value at the centroid of the polygon.
mean              Extract the average of all pixel values within the polygon.
stdev             Extract the standard deviation of all pixel values within the polygon.
median            Extract the median of all pixel values within the polygon.
min               Extract the minimum value of all pixels within the polygon.
max               Extract the maximum value of all pixels within the polygon.
sum               Extract the sum of the values of all pixels within the polygon.
mode              Extract the mode of the classes within the polygon (classes must be set
                  with the option -c).
proportion        Extract the proportion of class(es) within the polygon (classes must be set
                  with the option -c).
count             Extract the count of class(es) within the polygon (classes must be set
                  with the option -c).
percentile        Extract the percentile as defined by option perc (e.g., the 95th percentile
                  of the values covered by the polygon).
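To make the statistical rules concrete, the following Python sketch computes what each of them would report for the pixel values of one band inside a single polygon. It is an illustration under stated assumptions, not pkextract's code: the function name summarize is hypothetical, the population standard deviation and a nearest-rank percentile are assumed, and the proportion is taken relative to all pixels in the polygon. The rules point and centroid depend on geometry and are left out.

```python
import math
import statistics
from collections import Counter


def summarize(values, rule, classes=None, perc=95):
    """Reduce the pixel values of one band inside one polygon to one result.

    `classes` stands in for option -c and is required for mode, proportion
    and count. All names here are hypothetical, chosen for illustration.
    """
    if rule == "mean":
        return statistics.mean(values)
    if rule == "stdev":
        # Assumption: population standard deviation.
        return statistics.pstdev(values)
    if rule == "median":
        return statistics.median(values)
    if rule == "min":
        return min(values)
    if rule == "max":
        return max(values)
    if rule == "sum":
        return sum(values)
    if rule == "percentile":
        # Assumption: nearest-rank percentile on the sorted values.
        ordered = sorted(values)
        rank = max(1, math.ceil(perc / 100 * len(ordered)))
        return ordered[rank - 1]
    if rule in ("mode", "proportion", "count"):
        if not classes:
            raise ValueError("rule '%s' requires classes" % rule)
        # Only pixels belonging to the listed classes take part.
        counts = Counter(v for v in values if v in classes)
        if rule == "mode":
            # Most frequent of the listed classes; None if none occurs.
            return counts.most_common(1)[0][0] if counts else None
        if rule == "count":
            return {c: counts[c] for c in classes}
        # Assumption: proportion is relative to all pixels in the polygon.
        return {c: counts[c] / len(values) for c in classes}
    raise ValueError("unknown rule '%s'" % rule)
```

Pixels with a class value that is not listed (for instance 0 for "no data") are ignored by mode, which matches the idea of excluding them from the vote.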
OPTIONS
-i filename, --input filename
Raster input dataset containing band information
-s sample, --sample sample
OGR vector dataset with features to be extracted from input data. Output will contain features
with input band information included. Sample image can also be GDAL raster dataset.
-rand number, --random number
Create simple random sample of points. Provide number of points to generate
-grid size, --grid size
Create systematic grid of points. Provide cell grid size (in projected units, e.g., m)
-o filename, --output filename
Output sample dataset
-ln layer, --ln layer
Layer name(s) in sample (leave empty to select all)
-c class, --class class
Class(es) to extract from input sample image. Leave empty to extract all valid data pixels from
sample dataset. Make sure to set classes if rule is set to mode, proportion or count.
-t threshold, --threshold threshold
Probability threshold for selecting samples (randomly). Provide probability in percentage (>0) or
absolute (<0). Use a single threshold for vector sample datasets. If using raster land cover
maps as a sample dataset, you can provide a threshold value for each class (e.g. -t 80 -t 60).
Use value 100 to select all pixels for selected class(es)
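The two threshold conventions can be sketched in Python as follows. This is a hypothetical illustration of the semantics described above, not pkextract's code: the function select_samples and its arguments are invented for this example, and the absolute case is assumed to keep the first pixels encountered.

```python
import random


def select_samples(pixels, threshold, rng=None):
    """Select candidate pixels the way option -t is described.

    threshold > 0: percentage, each pixel is kept with that probability.
    threshold < 0: absolute, at most abs(threshold) pixels are kept.
    """
    pixels = list(pixels)
    rng = rng or random.Random()
    if threshold > 0:
        probability = min(threshold, 100) / 100.0
        # rng.random() lies in [0, 1), so a threshold of 100 keeps everything.
        return [px for px in pixels if rng.random() < probability]
    if threshold < 0:
        # Assumption: the first pixels encountered are the ones kept.
        return pixels[: int(-threshold)]
    raise ValueError("threshold must be non-zero")


if __name__ == "__main__":
    candidates = range(1000)
    print(len(select_samples(candidates, 10)))    # roughly 100, varies per run
    print(len(select_samples(candidates, -50)))   # at most 50
```

For a raster sample with one threshold per class, the same selection would be applied to the pixels of each class with its own threshold value.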
-f format, --f format
Output sample dataset format
-ft fieldType, --ftype fieldType
Field type (only Real or Integer)
-lt labelType, --ltype labelType
Label type: Int16 or String
-polygon, --polygon
Create OGRPolygon as geometry instead of OGRPoint.
-b band, --band band
Band index(es) to extract. Leave empty to use all bands
-sband band, --startband band
Start band sequence number
-eband band, --endband band
End band sequence number
-r rule, --rule rule
Rule for reporting image information per feature (only for vector samples): point (value at each
point or at centroid if polygon), centroid, mean, stdev, median, proportion, count, min, max,
mode, sum, percentile.
-v level, --verbose level
Verbose mode if > 0
Advanced options
-bndnodata band, --bndnodata band
Band(s) in input image to check if pixel is valid (used for srcnodata)
-srcnodata value, --srcnodata value
Invalid value(s) for input image
-tp threshold, --thresholdPolygon threshold
(absolute) threshold for selecting samples in each polygon
-test testSample, --test testSample
Test sample dataset (use this option in combination with threshold<100 to create a training
(output) and test set)
-bn attribute, --bname attribute
For single band input data, this extra attribute name will correspond to the raster values. For
multi-band input data, multiple attributes with this prefix will be added (e.g. b0, b1, b2, etc.)
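The naming scheme can be summarised in a few lines of Python. This is only a sketch of the behaviour described above: the helper band_attribute_names is hypothetical, and the default prefix "b" is assumed from the b0, b1, b2 example.

```python
def band_attribute_names(n_bands, bname="b"):
    """Return the attribute names expected for n_bands extracted bands."""
    if n_bands < 1:
        raise ValueError("at least one band is required")
    if n_bands == 1:
        # Single band: the name is used as is.
        return [bname]
    # Multiple bands: the name acts as a prefix followed by the band index.
    return ["%s%d" % (bname, index) for index in range(n_bands)]


if __name__ == "__main__":
    print(band_attribute_names(3))           # ['b0', 'b1', 'b2']
    print(band_attribute_names(1, "ndvi"))   # ['ndvi']
```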
-cn attribute, --cname attribute
Name of the class label in the output vector dataset
-geo value, --geo value
Use geo coordinates (set to 0 to use image coordinates)
-down value, --down value
Down sampling factor (for raster sample datasets only). Can be used to create grid points
-buf value, --buffer value
Buffer for calculating statistics for point features
-circ, --circular
Use a circular disc kernel buffer (for vector point sample datasets only, use in combination with
buffer option)
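The difference between a square and a circular buffer can be pictured with the Python sketch below. It is not taken from pkextract: the function window_offsets is hypothetical, the buffer is assumed to be an odd window width in pixels, and the disc is assumed to contain every pixel whose centre lies within half a window width of the point.

```python
def window_offsets(buf, circular=False):
    """Return the (dx, dy) pixel offsets of a buf x buf window around a point.

    With circular=True only offsets inside a disc of radius buf // 2 are kept.
    """
    half = buf // 2
    offsets = []
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            # Skip the corners of the square when a disc kernel is requested.
            if circular and dx * dx + dy * dy > half * half:
                continue
            offsets.append((dx, dy))
    return offsets


if __name__ == "__main__":
    print(len(window_offsets(3)))                  # square 3 x 3 window
    print(len(window_offsets(3, circular=True)))   # disc inside that window
```

The statistics selected with option -r would then be computed over the pixels at these offsets instead of over the single pixel at the point.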
EXAMPLE
Using vector samples
Extract all points for all layers read in points.sqlite from input.tif. Create a new point vector
dataset named extracted.sqlite, where each point will contain an attribute for the individual input bands
in input.tif. Note that the default vector format is Spatialite (.sqlite).
pkextract -i input.tif -s points.sqlite -o extracted.sqlite
Same example as above, but only extract the points for the layer in points.sqlite named "valid"
pkextract -i input.tif -s points.sqlite -ln valid -o extracted.sqlite
Extract points and write output in ESRI Shapefile format
pkextract -i input.tif -s points.shp -f "ESRI Shapefile" -o extracted.shp
Extract the standard deviation for each input band in a 3 by 3 window, centered around the points in the
sample vector dataset points.sqlite. The output vector dataset will contain polygon features defined by
the buffered points (3x3 window). Use the option -circ to define a circular buffer.
pkextract -i input.tif -s points.sqlite -o extracted.sqlite -r stdev -buf 3 -polygon
Extract all pixels from input.tif covered by the polygons in polygons.sqlite. Each polygon can thus
result in multiple point features with attributes for each input band. Write the extracted points to a
point vector dataset training.sqlite.
pkextract -i input.tif -s polygons.sqlite -o training.sqlite -r point
Extract the first band from input.tif at the centroids of the polygons in vector dataset polygons.sqlite.
Assign the extracted point value to a new attribute of the polygon and write to the vector dataset
extracted.sqlite.
pkextract -i input.tif -b 0 -s polygons.sqlite -r centroid -o extracted.sqlite -polygon
Extract the mean values for the second band in input.tif covered by each polygon in polygons.sqlite. The
mean values are written to a copy of the polygons in output vector dataset extracted.sqlite
pkextract -i input.tif -b 1 -s polygons.sqlite -r mean -o extracted.sqlite -polygon
Extract the majority class in each polygon for the input land cover map. The land cover map contains
five valid classes, labeled 1-5. Other class values (e.g., labeled as 0) are not taken into account in
the voting.
pkextract -i landcover.tif -s polygons.sqlite -r mode -o majority.sqlite -polygon -c 1 -c 2 -c 3 -c 4 -c 5
Using random and grid samples
Extract 100 sample units following a simple random sampling design. For each sample unit, the median
value is extracted from the input raster dataset in a window of 3 by 3 pixels and written to an attribute
of the output vector dataset. The output vector dataset contains polygon features defined by the windows
centered at the randomly selected sample units.
pkextract -i input.tif -o random.sqlite -rand 100 -r median -buf 3 -polygon
Extract points following a systematic grid with grid cell size of 100 m. Discard pixels that have a
value 0 in the input raster dataset.
pkextract -i input.tif -o systematic.sqlite -grid 100 -srcnodata 0
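A systematic grid of this kind can be sketched in Python as shown below. The sketch is illustrative only: the function grid_points is hypothetical, and placing the points at the cell centres, starting from the upper left corner of the raster extent, is an assumption rather than documented pkextract behaviour.

```python
import math


def grid_points(ulx, uly, lrx, lry, size):
    """Return (x, y) points of a regular grid covering a raster extent.

    ulx, uly -- upper left corner in projected units
    lrx, lry -- lower right corner in projected units
    size     -- grid cell size in the same units (stands in for option -grid)
    """
    ncol = int(math.floor((lrx - ulx) / size))
    nrow = int(math.floor((uly - lry) / size))
    # Assumption: one point at the centre of every complete grid cell.
    return [
        (ulx + (col + 0.5) * size, uly - (row + 0.5) * size)
        for row in range(nrow)
        for col in range(ncol)
    ]


if __name__ == "__main__":
    for point in grid_points(0, 300, 200, 0, 100):
        print(point)
```

Points that fall on pixels with a value listed in -srcnodata would be dropped after this step.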
Using raster samples
Typical use where pixels are extracted based on a land cover map (sample.tif). Extract all bands for a
random sample of 10 percent of the pixels in the land cover map sample.tif where the land cover classes
are either 1, 2 or 3 (class values). Write output to the point vector dataset extracted.sqlite.
pkextract -i input.tif -s sample.tif -o extracted.sqlite -t 10 -c 1 -c 2 -c 3
Extract all bands for the first 5000 pixels encountered in sample.tif where pixels have a value equal to
1. Write output to point vector dataset extracted.sqlite.
pkextract -i input.tif -s sample.tif -o extracted.sqlite -t -5000 -c 1
24 January 2016 pkextract(1)