Provided by: osmium-tool_1.15.0-1_amd64

NAME

       osmium-extract - create geographical extracts from an OSM file

SYNOPSIS

       osmium extract --config CONFIG-FILE [OPTIONS] OSM-FILE
       osmium extract --bbox LEFT,BOTTOM,RIGHT,TOP [OPTIONS] OSM-FILE
       osmium extract --polygon POLYGON-FILE [OPTIONS] OSM-FILE

DESCRIPTION

       Create  geographical  extracts  from  an OSM data file or an OSM history file.  The region
       (geographical extent) can be given as a bounding box or as a (multi)polygon.

       There are three ways of calling this command:

       • Specify a config file with the --config/-c option.  It can define any number of  regions
         you want to cut out.  See the CONFIG FILE section for details.

       • Specify a bounding box to cut out with the --bbox/-b option.

       • Specify a (multi)polygon to cut out with the --polygon/-p option.

       The  input  file is assumed to be ordered in the usual order: nodes first, then ways, then
       relations.

       If the --with-history/-H option is used, the  command  will  work  correctly  for  history
       files.   This  currently  works  for the complete_ways strategy only.  The simple or smart
       strategies do not work with history files.  A history extract will contain  every  version
       of  all  objects with at least one version in the region.  Generating a history extract is
       somewhat slower than a normal data extract.

       Osmium makes sure that all nodes located on the vertices of the region boundary are in
       the extract.  Nodes that lie directly on the boundary but between those vertices might or
       might not end up in the extract.  In almost all cases this is good enough, but if you
       want to be sure you got everything, use a small buffer around your region.
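
       The buffer mentioned above can be computed before calling osmium extract.  The following
       Python sketch is an illustration only: the function names and the default margin of 0.01
       degrees are assumptions made for this example and are not part of osmium.

```python
def buffered_bbox(left, bottom, right, top, margin=0.01):
    """Grow a bounding box by `margin` degrees on every side.

    The result is clamped to valid longitude/latitude ranges, so it never
    extends past the -180/180 degree line or the poles.
    """
    return (
        max(left - margin, -180.0),
        max(bottom - margin, -90.0),
        min(right + margin, 180.0),
        min(top + margin, 90.0),
    )


def bbox_argument(bbox):
    """Format a bounding box as the comma-separated value for --bbox/-b."""
    return ",".join(f"{value:.7f}" for value in bbox)


# Munich bounding box from the EXAMPLES section, grown by the default margin.
print(bbox_argument(buffered_bbox(11.35, 48.05, 11.73, 48.25)))
```

       The printed value can be passed to the --bbox/-b option.  A suitable margin depends on
       the data and the use case.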

       By default no bounds will be set in the header of the output file.  Use  the  --set-bounds
       option if you need this.

       Note that osmium extract will never clip any OSM objects, i.e. it will not remove node
       references outside the region from ways or unused relation members from  relations.   This
       means  you  might  get objects that are not reference-complete.  It has the advantage that
       you can use osmium merge to merge several extracts without problems.

OPTIONS

       -b, --bbox=LONG1,LAT1,LONG2,LAT2
              Set the bounding box to cut out.  Cannot be used with --polygon/-p, --config/-c,
              or  --directory/-d.   The coordinates LONG1,LAT1 are from one arbitrary corner, the
              coordinates LONG2,LAT2 are from the opposite corner.

       -c, --config=FILE
              Set the name of the config file.  Cannot be used with the --bbox/-b or
              --polygon/-p  option.   If  this  is  set,  the  --output/-o and --output-format/-f
              options are ignored, because they are set in the config file.

       --clean=ATTR
              Clean the given attribute (version, timestamp, changeset, uid, or user) from the
              data before writing it out again.  The attribute will be set to 0 (the user will be
              set to the empty string).  This option can be given multiple times.  Depending on
              the output format, these attributes might show up as 0 or not show up at all.

       -d, --directory=DIRECTORY
              Output  directory.   Output  file  names  in  the  config file are relative to this
              directory.  Overrides the setting of the same name in the config file.  This
              option is ignored when the --bbox/-b or --polygon/-p options are used; in that
              case set the output directory and name with the --output/-o option.

       -H, --with-history
              Specify that the input file is a history file.  The output  file(s)  will  also  be
              history file(s).

       -p, --polygon=POLYGON_FILE
              Set the polygon to cut out based on the contents of the file.  The file has to be a
              GeoJSON, poly, or OSM file as described in the (MULTI)POLYGON FILE FORMATS section.
              It has to have the right suffix to be detected correctly.  Cannot be used with
              --bbox/-b, --config/-c, or --directory/-d.

       -s, --strategy=STRATEGY
              Use the given strategy to extract the region.  For possible values and details  see
              the STRATEGIES section.  Default is “complete_ways”.

       -S, --option=OPTION=VALUE
              Set  a  named  option  for  the  strategy.   If  needed you can specify this option
              multiple times to set several options.

       --set-bounds
              Set the bounds field in the header.  The bounds are set to the bbox or envelope  of
              the  polygon  specified  for the extract.  Note that strategies other than “simple”
              can put nodes outside those bounds into the output file.

COMMON OPTIONS

       -h, --help
              Show usage help.

       -v, --verbose
              Set verbose mode.  The program will output information about what it  is  doing  to
              STDERR.

INPUT OPTIONS

       -F, --input-format=FORMAT
              The format of the input file(s).  Can be used to set the input format if it can’t
              be autodetected from the file name(s).  This will set the format for all input
              files; there is no way to set the format for some input files only.  See
              osmium-file-formats(5) or the libosmium manual for details.

OUTPUT OPTIONS

       -f, --output-format=FORMAT
              The format of the output file.  Can be used to set the output  file  format  if  it
              can’t be autodetected from the output file name.  See osmium-file-formats(5) or the
              libosmium manual for details.

       --fsync
              Call fsync after writing the output file to force flushing buffers to disk.

       --generator=NAME
              The name and version of the program generating the output file.  It will  be  added
              to the header of the output file.  Default is “osmium/” and the version of osmium.

       -o, --output=FILE
              Name of the output file.  Default is `-' (STDOUT).

       -O, --overwrite
              Allow  an  existing  output file to be overwritten.  Normally osmium will refuse to
              write over an existing file.

       --output-header=OPTION=VALUE
              Add output header option.  This command line option can be used multiple times  for
              different  OPTIONs.   See  the  osmium-output-headers(5)  man  page  for  a list of
              available header options.  For  some  commands  you  can  use  the  special  format
              “OPTION!” (i.e. an exclamation mark after the OPTION and no value set) to set the
              value to the same as in the input file.

CONFIG FILE

       The config file mainly specifies the file names and  the  regions  of  the  extracts  that
       should be created.

       The  config file is in JSON format.  The top-level is an object which contains at least an
       “extracts” array.  It can also contain a “directory” entry which names the directory where
       all the output files will be created:

              {
                  "extracts": [...],
                  "directory": "/tmp/"
              }

       The extracts array specifies the extracts that should be created.  Each item in the array
       is an object with at least an “output” member naming the output file and a region defined
       in a “bbox”, “polygon”, or “multipolygon” member.  An optional “description” can be added;
       it is not used by the program but can help with documenting the file contents.  You can
       add an optional “output_format” if the format cannot be detected from the “output” file
       name.  Run “osmium help file-formats” to get a description of allowed formats.

       The optional “output_header” allows you to set additional OSM file header settings such as
       the “generator”.  If you set the value of a file header setting to null, that setting is
       copied from the header of the input file.

              "extracts": [
                  {
                      "output": "hamburg.osm.pbf",
                      "output_format": "pbf",
                      "description": "optional description",
                      "bbox": ...
                  },
                  {
                      "output": "berlin.osm.pbf",
                      "description": "optional description",
                      "polygon": ...
                  },
                  {
                      "output": "munich.osm.pbf",
                      "output_header": {
                          "generator": "MyExtractor/1.0",
                          "osmosis_replication_timestamp": null
                      },
                      "description": "optional description",
                      "multipolygon": ...
                  }
              ]
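
       A config file with this structure can also be generated by a script.  The following
       Python sketch is an illustration only: the helper names make_extract and make_config are
       assumptions made for this example, while the member names and the sample values are taken
       from this section.

```python
import json


def make_extract(output, region_key, region, description=None):
    """Build one entry of the "extracts" array.

    `region_key` must be "bbox", "polygon", or "multipolygon".
    """
    if region_key not in ("bbox", "polygon", "multipolygon"):
        raise ValueError(f"unknown region type: {region_key}")
    entry = {"output": output, region_key: region}
    if description is not None:
        entry["description"] = description
    return entry


def make_config(extracts, directory=None):
    """Build the top-level config object with an optional output directory."""
    config = {"extracts": list(extracts)}
    if directory is not None:
        config["directory"] = directory
    return config


config = make_config(
    [
        make_extract("munich.osm.pbf", "bbox", [11.35, 48.05, 11.73, 48.25]),
        make_extract(
            "dresden.osm.pbf",
            "bbox",
            {"left": 13.57, "right": 13.97, "top": 51.18, "bottom": 50.97},
            description="Bounding box specified in object format",
        ),
    ],
    directory="/tmp/",
)

# Serialize the config; the text can be saved to a file for --config/-c.
config_text = json.dumps(config, indent=4)
print(config_text)
```

       Writing config_text to a file yields a config that can be passed to the --config/-c
       option.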

       There are several formats for specifying the regions:

       bbox:

       A bounding box in one of two formats.  The first is a simple array with four real numbers,
       the first two specifying the coordinates of an arbitrary corner, the second two specifying
       the coordinates of the opposite corner.

              {
                  "output": "munich.osm.pbf",
                  "description": "Bounding box specified in array format",
                  "bbox": [11.35, 48.05, 11.73, 48.25]
              }

       The second format uses an object instead of an array:

              {
                  "output": "dresden.osm.pbf",
                  "description": "Bounding box specified in object format",
                  "bbox": {
                      "left": 13.57,
                      "right": 13.97,
                      "top": 51.18,
                      "bottom": 50.97
                  }
              }

       polygon:

       A polygon, either specified inline in the config file or read from an external file.   See
       the (MULTI)POLYGON FILE FORMATS section for external files.  If specified inline this is a
       nested array, the outer array defining the polygon, the  next  array  the  rings  and  the
       innermost arrays the coordinates.  This format is the same as in GeoJSON files.

       In this example there is only one outer ring:

              "polygon": [[
                  [9.613465, 53.58071],
                  [9.647599, 53.59655],
                  [9.649288, 53.61059],
                  [9.613465, 53.58071]
              ]]

       In each ring, the last set of coordinates should be the same as the first set, closing the
       ring.
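
       When polygon coordinates come from another tool, the closing coordinate pair may be
       missing.  The following Python sketch shows one way to close rings before writing them
       into a config file.  The function names are assumptions made for this example and are not
       part of osmium.

```python
def close_ring(ring):
    """Return a copy of `ring` whose last coordinate pair equals the first."""
    if len(ring) < 3:
        raise ValueError("a ring needs at least three coordinate pairs")
    closed = [list(point) for point in ring]
    if closed[0] != closed[-1]:
        # Repeat the first pair at the end to close the ring.
        closed.append(list(closed[0]))
    return closed


def close_polygon(polygon):
    """Close every ring of an inline "polygon" value (a list of rings)."""
    return [close_ring(ring) for ring in polygon]


# The ring from the example above, with the closing pair left out.
open_polygon = [[
    [9.613465, 53.58071],
    [9.647599, 53.59655],
    [9.649288, 53.61059],
]]
print(close_polygon(open_polygon))
```

       Rings that are already closed are returned unchanged.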

       multipolygon:

       A multipolygon, either specified inline in the config file or read from an external  file.
       See  the (MULTI)POLYGON FILE FORMATS section for external files.  If specified inline this
       is a nested array, the outer array defining the multipolygon, the next array the polygons,
       the  next  the rings and the innermost arrays the coordinates.  This format is the same as
       in GeoJSON files.

       In this example there is one outer and one inner ring:

              "multipolygon": [[[
                  [6.847, 50.987],
                  [6.910, 51.007],
                  [7.037, 50.953],
                  [6.967, 50.880],
                  [6.842, 50.925],
                  [6.847, 50.987]
              ],[
                  [6.967, 50.954],
                  [6.969, 50.920],
                  [6.932, 50.928],
                  [6.934, 50.950],
                  [6.967, 50.954]
              ]]]

       In each ring, the last set of coordinates should be the same as the first set, closing the
       ring.

       Osmium must check every node in the input data to find out which bounding boxes or
       (multi)polygons contain it.  This is very cheap for bounding boxes, but more expensive
       for (multi)polygons, and it becomes more expensive the more vertices the (multi)polygon
       has.  Use bounding boxes or simplified polygons where possible.

       Note that bounding boxes or (multi)polygons are not allowed to span  the  -180/180  degree
       line.  If you need this, cut out the regions on each side and use osmium merge to join the
       resulting files.
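
       For bounding boxes, the split described above can be computed with a few lines of code.
       The following Python sketch is an illustration only.  It assumes the box is given by its
       western and eastern edges and that a western edge greater than the eastern edge means the
       box spans the -180/180 degree line; the function name is also an assumption made for this
       example.

```python
def split_at_antimeridian(west, south, east, north):
    """Split a box so that no part spans the -180/180 degree line.

    Assumption of this sketch: `west > east` means the box crosses that
    line.  Returns one or two (left, bottom, right, top) tuples.
    """
    if west <= east:
        return [(west, south, east, north)]
    return [
        (west, south, 180.0, north),   # part between `west` and +180
        (-180.0, south, east, north),  # part between -180 and `east`
    ]


# A hypothetical box reaching from 177 degrees east to 178 degrees west.
for box in split_at_antimeridian(177.0, -20.0, -178.0, -15.0):
    print(",".join(str(value) for value in box))
```

       Each printed value can be passed to --bbox/-b in a separate osmium extract call, and the
       two output files can then be joined with osmium merge.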

(MULTI)POLYGON FILE FORMATS

       External files describing a (multi)polygon are specified in  the  config  file  using  the
       “file_name” and “file_type” properties on the “polygon” or “multipolygon” object:

              "polygon": {
                  "file_name": "berlin.geojson",
                  "file_type": "geojson"
              }

       If file names don’t start with a slash (/), they are interpreted relative to the directory
       where the config file is.  If the “file_type” is missing, Osmium will try to autodetect it
       from the suffix of the “file_name”.

       The following file types are supported:

       geojson
              GeoJSON  file  containing exactly one Feature of type Polygon or MultiPolygon, or a
              FeatureCollection  with  the  first  Feature  of  type  Polygon  or   MultiPolygon.
              Everything except the actual geometry (of the first Feature) is ignored.

       poly   A poly file as described in
              https://wiki.openstreetmap.org/wiki/Osmosis/Polygon_Filter_File_Format .  This wiki
              page also mentions several sources for such poly files.

       osm    An OSM file containing one or more multipolygon or boundary relations together with
              all the nodes and ways needed.  Any OSM file format (XML, PBF, ...) supported by
              Osmium can be used here, but the correct suffix must be used, so the file format is
              detected correctly.  Files for this can easily be obtained  by  searching  for  the
              area   on   OSM   and   then  downloading  the  full  relation  using  a  URL  like
              https://www.openstreetmap.org/api/0.6/relation/RELATION-ID/full .  Or you  can  use
              osmium  getid -r to get a specific relation from an OSM file.  Note that both these
              approaches can get you very detailed boundaries which can take quite a while to cut
              out.  Consider simplifying the boundary before use.

       If there are several (multi)polygons in a poly file or OSM file, they will be merged.  The
       (multi)polygons must not overlap, otherwise the result is undefined.

STRATEGIES

       osmium extract can use different strategies for creating the extracts.  Depending  on  the
       strategy different objects will end up in the extracts.  The strategies differ in how much
       memory they need and how often they need to read the input file.  The choice  of  strategy
       depends  on  how  you  want to use the generated extracts and how much memory and time you
       have.

       The default strategy is complete_ways.

       Strategy simple
              Runs in a single pass.  The extract will contain all nodes inside  the  region  and
              all  ways referencing those nodes as well as all relations referencing any nodes or
              ways already included.  Ways crossing the region boundary will  not  be  reference-
              complete.   Relations  will  not  be  reference-complete.   This  strategy is fast,
              because it reads the input only once, but the result is not  enough  for  most  use
              cases.   It is the only strategy that will work when reading from a socket or pipe.
              This strategy will not work for history files.

       Strategy complete_ways
              Runs in two passes.  The extract will contain all nodes inside the region  and  all
              ways  referencing  those  nodes as well as all nodes referenced by those ways.  The
              extract will also contain all relations referencing nodes inside the region or
              ways  already  included  and,  recursively,  their  parent relations.  The ways are
              reference-complete, but the relations are not.

       Strategy smart
              Runs in three passes.  The extract will contain all nodes inside the region and all
              ways  referencing  those  nodes as well as all nodes referenced by those ways.  The
              extract will also contain all relations referencing nodes inside the region or
              ways  already  included and, recursively, their parent relations.  The extract will
              also contain all nodes and ways  (and  the  nodes  they  reference)  referenced  by
              relations  tagged  “type=multipolygon” directly referencing any nodes in the region
              or ways referencing nodes in the region.  The ways are reference-complete, and  all
              multipolygon  relations referencing nodes in the regions or ways that have nodes in
              the region are reference-complete.  Other relations are not reference-complete.

       For the complete_ways strategy you can set the option “-S relations=false” in  which  case
       no relations will be written to the output file.

       For  the  smart  strategy  you can change the types of relations that should be reference-
       complete.  Instead of just relations tagged “type=multipolygon”, you can  either  get  all
       relations  (use  “-S  types=any”)  or  give  a  list  of  types  to  the  -S  option:  “-S
       types=multipolygon,route”.  Note that especially boundary relations can be huge, so if you
       include them, be aware your result might be huge.

       The  smart  strategy  allows another option “-S complete-partial-relations=X”.  If this is
       set, all relations that have more than X percent of their members already in  the  extract
       will  have  their  full  set  of members in the extract.  So this allows completing almost
       complete relations.  It can be useful for instance to make sure  a  boundary  relation  is
       complete even if some of it is outside the polygon used for extraction.
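
       When osmium extract is called from a script, the strategy and its options can be
       assembled into an argument list.  The following Python sketch is an illustration only: it
       builds the list without running anything, the function name is an assumption made for
       this example, and the value 50 for complete-partial-relations is an arbitrary choice.

```python
def extract_command(input_file, output_file, bbox, strategy="complete_ways",
                    strategy_options=None):
    """Build the argument list for an `osmium extract` call (not executed).

    Only options described in this manual page are used: -b, -s, -S, -o.
    """
    argv = ["osmium", "extract", "-b", bbox, "-s", strategy]
    for name, value in (strategy_options or {}).items():
        # Each strategy option becomes its own "-S NAME=VALUE" pair.
        argv += ["-S", f"{name}={value}"]
    argv += ["-o", output_file, input_file]
    return argv


command = extract_command(
    "germany-latest.osm.pbf",
    "munich.osm.pbf",
    "11.35,48.05,11.73,48.25",
    strategy="smart",
    strategy_options={
        "types": "multipolygon,route",
        "complete-partial-relations": 50,
    },
)
print(" ".join(command))
```

       The resulting list can be handed to a process launcher such as Python's subprocess.run.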

DIAGNOSTICS

       osmium extract exits with exit code

       0      if everything went alright,

       1      if there was an error processing the data, or

       2      if  there  was  a  problem  with the command line arguments, config file or polygon
              files.
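
       A calling script can use these exit codes to tell data errors from usage errors.  The
       following Python sketch is an illustration only: the names EXIT_MEANINGS,
       describe_exit_code and run_and_describe are assumptions made for this example, and the
       messages restate the table above.

```python
import subprocess
import sys

# Meanings taken from the exit code table above.
EXIT_MEANINGS = {
    0: "everything went alright",
    1: "error processing the data",
    2: "problem with the command line arguments, config file or polygon files",
}


def describe_exit_code(code):
    """Translate an exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_and_describe(argv):
    """Run a command, report a non-zero exit on STDERR, return the message."""
    completed = subprocess.run(argv, check=False)
    message = describe_exit_code(completed.returncode)
    if completed.returncode != 0:
        print(f"{argv[0]}: {message}", file=sys.stderr)
    return message


print(describe_exit_code(2))
```

       run_and_describe expects a complete argument list, for example one starting with
       "osmium", "extract".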

MEMORY USAGE

       Memory usage of osmium extract depends on the number of extracts and on the strategy used.
       For the simple strategy it will be at least the number of extracts times the highest node
       ID used divided by 8.  For the complete_ways strategy it is twice that, and for the smart
       strategy a bit more.
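
       The rule of thumb above can be turned into a quick estimate.  The following Python sketch
       is an illustration only.  It assumes the result is a number of bytes (one bit per node ID
       and extract); the function name and the sample node ID of 8,000,000,000 are also
       assumptions made for this example.

```python
def estimate_min_memory_bytes(num_extracts, highest_node_id, strategy="complete_ways"):
    """Estimate the lower bound of memory needed, assumed to be in bytes."""
    base = num_extracts * highest_node_id // 8
    if strategy == "simple":
        return base
    if strategy in ("complete_ways", "smart"):
        # complete_ways needs twice the base amount.  smart needs "a bit
        # more" than that; how much is not specified, so the same value is
        # returned as a lower bound.
        return 2 * base
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical input: 10 extracts, highest node ID 8,000,000,000.
for name in ("simple", "complete_ways", "smart"):
    print(name, estimate_min_memory_bytes(10, 8_000_000_000, name))
```

       Under these assumptions, 10 extracts need at least 10,000,000,000 bytes with the simple
       strategy and at least twice that with complete_ways.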

       If  you  want  to  split a large file into many extracts, do this in several steps.  First
       create several larger extracts and then split them again and again into smaller pieces.

EXAMPLES

       See the example config files in the extract-example-config directory.  To try it:

              osmium extract -v -c extract-example-config/extracts.json \
                  germany-latest.osm.pbf

       Extract the city of Karlsruhe using a boundary polygon:

              osmium extract -p karlsruhe-boundary.osm.bz2 germany-latest.osm.pbf \
                  -o karlsruhe.osm.pbf

       Extract the city of Munich using a bounding box:

              osmium extract -b 11.35,48.05,11.73,48.25 germany-latest.osm.pbf \
                  -o munich.osm.pbf

SEE ALSO

       osmium(1), osmium-file-formats(5), osmium-output-headers(5), osmium-getid(1),
       osmium-merge(1)

       • Osmium website (https://osmcode.org/osmium-tool/)

COPYRIGHT

       Copyright (C) 2013-2023 Jochen Topf <jochen@topf.org>.

       License  GPLv3+:  GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.  This is
       free software: you are free to change and redistribute it.  There is NO WARRANTY,  to  the
       extent permitted by law.

CONTACT

       If you have any questions or want to report a bug, please go to
       https://osmcode.org/contact.html

AUTHORS

       Jochen Topf <jochen@topf.org>.

                                              1.15.0                            OSMIUM-EXTRACT(1)