Provided by: mkgmap-splitter_0.0.0+svn653-1_all

NAME

       mkgmap-splitter - tile splitter for mkgmap

SYNOPSIS

       mkgmap-splitter [options] file.osm > splitter.log

DESCRIPTION

       mkgmap-splitter  splits an .osm file that contains large well mapped regions into a number
       of smaller tiles, to fit within the maximum size used for the Garmin maps  format.   There
       are at least two stages of processing required.  The first stage is to calculate what area
       each tile should cover, based on the distribution of nodes.  The second stage  writes  out
       the  nodes,  ways  and  relations  from  the original .osm file into separate smaller .osm
       files,  one  for  each  area  that   was   calculated   in   stage   one.    With   option
       --keep-complete=true, two additional stages are used to avoid broken ways and polygons.

       The two most important features are:

       • Variable sized tiles to prevent a large number of tiny files.

       • Tiles join exactly with no overlap or gaps.

       You will need a lot of memory on your computer if you intend to split a large area.  A few
       options allow configuring how much memory you need.  With the default parameters, you need
       about 4-5 bytes for every node and way.  This doesn't sound like a lot, but there are
       about 1700 million nodes in the whole planet file, so you cannot process the whole planet
       file in one pass on a 32 bit machine using this utility, as the maximum Java heap space is
       2 GB.  It is possible with 64 bit Java and about 7 GB of heap, or with multiple passes.
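
       As a rough sanity check of these figures, the heap requirement can be estimated from the
       element count.  The sketch below is only back-of-the-envelope arithmetic based on the 4-5
       bytes per element quoted above; the helper name estimate_heap_mib is made up for this
       example, and real memory use depends on the options chosen.

```sh
# Rough heap estimate: 4-5 bytes per node or way (figure quoted in this page).
# estimate_heap_mib is a hypothetical helper, not part of mkgmap-splitter.
# Assumes a shell with 64-bit integer arithmetic.
estimate_heap_mib() {
    elements=$1
    low=$(( elements * 4 / 1024 / 1024 ))
    high=$(( elements * 5 / 1024 / 1024 ))
    echo "${low}-${high} MiB"
}

# About 1700 million nodes in the whole planet file.
estimate_heap_mib 1700000000   # prints: 6484-8106 MiB
```

       The result brackets the "about 7 GB" figure given above for a single-pass planet split.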

       The Europe extract from Cloudmade or Geofabrik can be processed within the 2G limit if you
       have sufficient memory.  With the default options Europe is split into about 750 tiles.
       The Europe extract is about half of the size of the complete planet file.

       On the other hand a single country, even a well mapped one such as Germany or the UK, will
       be possible on a modest machine, even a netbook.

USAGE

       Splitter requires java 1.6 or higher.  Basic usage is as follows.

       mkgmap-splitter file.osm > splitter.log

       If you have less than 2 GB of memory on your computer you should reduce the -Xmx option by
       setting the JAVA_OPTS environment variable.

       JAVA_OPTS="-Xmx512m" mkgmap-splitter file.osm > splitter.log

       This will produce a number of .osm.pbf files that can be read  by  mkgmap(1).   There  are
       also other files produced:

       The  template.args  file is a file that can be used with the -c option of mkgmap that will
       compile all the files.  You can use it as is or you can copy it and  edit  it  to  include
       your own options.  For example instead of each description being "OSM Map" it could be "NW
       Scotland" as appropriate.

       The areas.list file is the list of bounding boxes that were calculated.  If you want, you
       can use this on a subsequent call to the splitter with the --split-file option to use
       exactly the same areas as last time.  This might be useful if you produce a map  regularly
       and  want to keep the tile areas the same from month to month.  It is also useful to avoid
       the time it takes to regenerate the file each time (currently about a third of the overall
       time taken to perform the split).  Of course if the map grows enough that one of the tiles
       overflows you will have to re-calculate the areas again.
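
       A regular rebuild that reuses saved areas might be scripted roughly as follows.  This is
       only a sketch: the SPLITTER variable, the function name split_monthly and the file name
       saved-areas.list (a copy of areas.list kept from an earlier run) are assumptions made for
       this example; only the --split-file option comes from this page.

```sh
# Reuse a saved copy of areas.list when present, otherwise let the splitter
# calculate the areas from scratch.
# SPLITTER, split_monthly and saved-areas.list are hypothetical names.
SPLITTER=${SPLITTER:-mkgmap-splitter}

split_monthly() {
    input=$1
    areas=${2:-saved-areas.list}
    if [ -f "$areas" ]; then
        "$SPLITTER" --split-file="$areas" "$input"
    else
        "$SPLITTER" "$input"
    fi
}

# Example call (commented out so the sketch has no side effects):
# split_monthly file.osm.pbf > splitter.log
```

       Keeping the saved list under a separate name is a precaution, so that the areas.list
       written by a new run does not replace the copy being read.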

       The areas.poly file contains the bounding polygon of the  calculated  areas.   See  option
       --polygon-file for how this can be used.

       The  densities-out.txt  file is written when no split-file is given and contains debugging
       information only.

       You can also use a gzip'ed or bz2'ed compressed .osm file as the input  file.   Note  that
       this  can  slow  down  the  splitter  considerably  (particularly  true  for  bz2) because
       decompressing the .osm file can take quite a lot of CPU power.  If you are  likely  to  be
       processing  a  file several times you're probably better off converting the file to one of
       the binary formats pbf or o5m.  The o5m format is faster to read, but requires more  space
       on the disk.

OPTIONS

       There are a number of options to fine tune things that you might want to try.

       --boundary-tags=string
              A  comma  separated  list of tag values for relations.  Used to filter multipolygon
              and  boundary   relations   for   problem-list   processing.    See   also   option
              --wanted-admin-level.  Default: use-exclude-list

       --cache=string
              Deprecated, now does nothing

       --description=string
              Sets the description to be written into the template.args file.

       --geonames-file=string
              The  name  of  a  GeoNames  file  to  use  for  determining  tile names.  Typically
              cities15000.zip from geonames ⟨http://download.geonames.org/export/dump⟩.

       --keep-complete=boolean
              Use --keep-complete=false to disable two  additional  program  phases  between  the
              split  and the final distribution phase (not recommended).  The first phase, called
              gen-problem-list, detects all ways and relations that are crossing the  borders  of
              one  or  more output files.  The second phase, called handle-problem-list, collects
              the coordinates of these ways and relations and calculates all  output  files  that
              are  crossed  or  enclosed.   The  information is passed to the final dist-phase in
              three temporary files.  This avoids broken polygons, but be aware that it requires
              reading the input files at least two additional times.

              Do not specify it with --overlap unless you have a good reason to do so.

              Default: true

       --mapid=int
              Sets the filename of the first split file; the following files are numbered
              consecutively.  With the default value the first file will be called
              63240001.osm.pbf and the next one will be 63240002.osm.pbf and so on.

              Default: 63240001
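
              The naming scheme can be illustrated with a few lines of shell.  This is a sketch
              that assumes the default pbf output and simple consecutive numbering; tile_names
              is a made-up helper, not a splitter feature.

```sh
# Prints the tile file names that follow from a given --mapid value,
# assuming pbf output and consecutive numbering.
# tile_names is a hypothetical helper for illustration only.
tile_names() {
    mapid=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(( mapid + i )).osm.pbf"
        i=$(( i + 1 ))
    done
}

tile_names 63240001 3
# prints:
# 63240001.osm.pbf
# 63240002.osm.pbf
# 63240003.osm.pbf
```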

       --max-areas=int
              The maximum number of areas that can be processed  in  a  single  pass  during  the
              second  stage of processing.  This must be a number from 1 to 4096.  Higher numbers
              mean fewer passes over the source file and hence quicker  overall  processing,  but
              also require more memory.  If you find you are running out of memory but don't want
              to increase your --max-nodes value, try reducing this instead.  Changing this  will
              have no effect on the result of the split; it only lets you trade off memory
              for performance.  Note that the first stage of the processing has  a  fixed  memory
              overhead  regardless  of  what  this  is set to so if you are running out of memory
              before the areas.list file is generated, you need  to  either  increase  your  -Xmx
              value or reduce the size of the input file you're trying to split.

              Default: 512
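
              The number of passes needed for a given number of areas can be estimated by
              rounding up the quotient of areas and --max-areas.  The sketch below assumes that
              simple rule; passes_needed is a made-up helper, and the splitter itself may
              distribute areas slightly differently.

```sh
# Estimate the number of passes in the second stage: ceil(areas / max_areas).
# passes_needed is a hypothetical helper, not part of mkgmap-splitter.
passes_needed() {
    areas=$1
    max_areas=$2
    echo $(( (areas + max_areas - 1) / max_areas ))
}

passes_needed 1502 512    # default --max-areas, prints: 3
passes_needed 1502 2048   # raised limit, prints: 1
```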

       --max-nodes=int
              The maximum number of nodes that can be in any of the resultant files.  The default
              is fairly conservative; you could increase it quite a lot before getting any 'map
              too big' messages.  Not much experimentation has been done.  Also, the bigger this
              value, the less memory is required during the splitting stage.

              Default: 1600000

       --max-threads=value
              The maximum number of threads used by mkgmap-splitter.

              Default: 4 (auto)

       --mixed=boolean
              Specify this if the input osm file has nodes, ways and  relations  intermingled  or
              the  ids  are  not  strictly sorted.  To increase performance, use the osmosis sort
              function.

              Default: false

       --no-trim=boolean
              Don't trim empty space off the  edges  of  tiles.   This  option  is  ignored  when
              --polygon-file is used.

              Default: false

       --num-tiles=value
              A target value that is used when no split-file is given.  Splitting is done so that
              the given number of tiles is produced.  The --max-nodes value is  ignored  if  this
              option is given.

       --output=string
              The  format  in  which the output files are written.  Possible values are xml, pbf,
              o5m, and simulate.  The default is pbf, which produces  the  smallest  file  sizes.
              The  o5m  format  is  faster  to  write,  but creates around 40% larger files.  The
              simulate option is for debugging purposes.

       --output-dir=path
              The directory to which splitter should write the output files.   If  the  specified
              path to a directory doesn't exist, mkgmap-splitter tries to create it.  Defaults to
              the current working directory.

       --overlap=string
              Deprecated since r279.  With --keep-complete=false, mkgmap-splitter should  include
              nodes  outside  the  bounding  box,  so  that mkgmap can neatly crop exactly at the
              border.  This parameter controls the size of that overlap.  It is in map  units,  a
              default  of  2000  is used which means about 0.04 degrees of latitude or longitude.
              If --keep-complete=true is active and --overlap is given, a warning will be printed
              because this combination rarely makes sense.

       --polygon-desc-file=path
              An osm file (.o5m, .pbf, .osm) with named ways that describe bounding polygons with
              OSM ways having tags name and mapid.

       --polygon-file=path
              The name of a file containing a bounding polygon in the osmosis polygon file format.
              mkgmap-splitter uses this file when calculating the areas.  It first calculates
              a grid using the given --resolution.  The input file is read and for each  node,  a
              counter  is  increased  for  the  related  grid area.  If the input file contains a
              bounding box, this is applied to the grid so that nodes outside of the bounding box
              are  ignored.   Next, if specified, the bounding polygon is used to zero those grid
              elements outside of the bounding polygon area.  If the polygon area(s)  describe(s)
              a  rectilinear  area  with  no  more  than 40 vertices, mkgmap-splitter will try to
              create output files that fit exactly into the area, otherwise it  will  approximate
              the polygon area with rectangles.

       --precomp-sea=path
              The  name  of  a  directory  containing  precompiled  sea tiles.  If given, mkgmap-
              splitter will use the precompiled sea tiles in the same way as  mkgmap  does.   Use
              this  if  you want to use a polygon-file or --no-trim=true and mkgmap creates empty
              *.img files combined with a message starting "There is not enough room in a  single
              garmin map for all the input data".

       --problem-file=path
              The  name  of a file containing ways and relations that are known to cause problems
              in the split process.  Use this option if --keep-complete requires too much time or
              memory and --overlap doesn't solve your problem.

              Syntax of problem file:

              way:<id> # comment...
              rel:<id> # comment...

              example:

              way:2784765 # Ferry Guernsey - Jersey
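
              A hand-edited problem file can be checked against the syntax above before it is
              passed to the splitter.  The following is a sketch only: check_problem_file is a
              made-up helper, the pattern is derived from the syntax shown here, and since this
              page does not say whether blank lines are accepted, they are reported as well.

```sh
# Lists (with line numbers) the lines that are not "way:<id>" or "rel:<id>",
# optionally followed by a "# comment".
# check_problem_file is a hypothetical helper, not part of mkgmap-splitter.
# Returns non-zero when no offending line is found (grep semantics).
check_problem_file() {
    grep -n -v -E '^(way|rel):[0-9]+[[:space:]]*(#.*)?$' "$1"
}

# Demonstration on a temporary file; the rel id is made up.
tmp=$(mktemp)
printf '%s\n' \
    'way:2784765 # Ferry Guernsey - Jersey' \
    'rel:12345 # made-up id' \
    'node:1 # not a valid entry' > "$tmp"
check_problem_file "$tmp"   # prints: 3:node:1 # not a valid entry
rm -f "$tmp"
```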

       --problem-report=path
              The   name   of   a   file  to  write  the  generated  problem  list  created  with
              --keep-complete.  The parameter is ignored if --keep-complete=false.  You can reuse
              this  file  with the --problem-file parameter, but do this only if you use the same
              values for --max-nodes and --resolution.

       --resolution=int
              The resolution of the density map produced during the first phase.  A value between
              1  and  24.   Default  is  13.  Increasing the value to 14 requires four times more
              memory in the split phase.  The value is ignored if a --split-file is given.

       --search-limit=int
              Search limit in the split algorithm.  Higher values may find better splits, but will take
              longer.

              Default: 200000

       --split-file=path
              Use  the previously calculated tile areas instead of calculating them from scratch.
              The file can be in .list or .kml format.

       --status-freq=int
              Displays the amount of memory used by the JVM every --status-freq seconds.  Set to
              0 to disable.

              Default: 120

       --stop-after=string
              Debugging:  stop  after  a given program phase.  Can be split, gen-problem-list, or
              handle-problem-list.  Default is dist which means execute all phases.

       --wanted-admin-level=string
              Specifies the lowest admin_level value of boundary relations that  should  be  kept
              complete.  Used  to  filter  boundary  relations  for  problem-list processing. The
              default  value  5  means  that  boundary  relations  are  kept  complete  when  the
              admin_level   is   5   or   higher   (5..11).    The   parameter   is   ignored  if
              --keep-complete=false.  Default: 5

       --write-kml=path
              The name of a kml file to  write  out  the  areas  to.   This  is  in  addition  to
              areas.list (which is always written out).

       Special options

       --version
              If  the parameter --version is found somewhere in the options, mkgmap-splitter will
              just print the version info and exit.  Version info looks like this:

              splitter 279 compiled 2013-01-12T01:45:02+0000

       --help If the parameter --help is found somewhere in  the  options,  mkgmap-splitter  will
              print a list of all known normal options together with a short help and exit.

TUNING

       Tuning for best performance

       A few hints for those that are using mkgmap-splitter to split large files.

       • For  faster  processing  with --keep-complete=true, convert the input file to o5m format
         using:

         osmconvert --drop-version file.osm -o=file.o5m

       • The option --drop-version is optional; it reduces the file to the data that is needed
         by mkgmap-splitter and mkgmap.

       • If  you  still experience poor performance, look into splitter.log.  Search for the word
         Distributing.  You may find something like this in the next line:

         Processing 1502 areas in 3 passes, 501 areas at a time

         This means splitter has to read the input file three times because the --max-areas
         parameter  was  much  smaller  than  the  number of areas.  If you have enough heap, set
         --max-areas  value  to  a  value  that  is  higher  than  the  number  of  areas,   e.g.
         --max-areas=2048.  Execute mkgmap-splitter again and you should find

         Processing 1502 areas in a single pass

       • More  areas  require  more  memory.   Make  sure  that  mkgmap-splitter  has enough heap
         (increase the -Xmx parameter) so  that  it  doesn't  waste  much  time  in  the  garbage
         collector (GC), but keep as much memory as possible for the system's I/O caches.

       • If available, use two different disks for input file and output directory, esp. when you
         use o5m format for input and output.

       • If you use mkgmap r2415 or later and disk space is no concern, consider using
         --output=o5m to speed up processing.
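
       The pass information mentioned in the hints above can be pulled out of splitter.log with
       a short helper.  This is a sketch: show_pass_info is a made-up name, and the log wording
       is taken from the example above and may differ between splitter versions.  The
       "Distributing data" line in the demonstration is invented.

```sh
# Prints the line that follows the first line containing "Distributing".
# show_pass_info is a hypothetical helper, not part of mkgmap-splitter.
show_pass_info() {
    awk 'found { print; exit } /Distributing/ { found = 1 }' "$1"
}

# Demonstration on a temporary log with invented surrounding content.
log=$(mktemp)
printf '%s\n' \
    'Distributing data' \
    'Processing 1502 areas in 3 passes, 501 areas at a time' > "$log"
show_pass_info "$log"
rm -f "$log"
```

       If the reported number of passes is greater than one and enough heap is available,
       raising --max-areas above the number of areas avoids the extra reads.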

       Tuning for low memory requirements

       If your machine has less than 1 GB of free memory (e.g. a netbook), you can still use
       mkgmap-splitter, but you might have to be patient if you use the parameter
       --keep-complete and want to split a file like germany.osm.pbf or a larger one.  If needed,
       reduce the number of areas processed in parallel to 50 with the --max-areas parameter.
       You have to use --keep-complete=false when splitting an area like Europe.
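
       A low-memory run could be wrapped as shown below.  This is a sketch under assumptions:
       the SPLITTER variable and the function name split_low_memory are made up for this
       example, and the heap size of 512 MB is simply the value used in the USAGE section; a
       suitable value depends on the machine.

```sh
# Low-memory run: small heap and few areas per pass.
# SPLITTER and split_low_memory are hypothetical names; -Xmx512m and
# --max-areas=50 are the values mentioned in this page.
SPLITTER=${SPLITTER:-mkgmap-splitter}

split_low_memory() {
    JAVA_OPTS="-Xmx512m" "$SPLITTER" --max-areas=50 "$@"
}

# Example call (commented out so the sketch has no side effects):
# split_low_memory germany.osm.pbf > splitter.log
```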

NOTES

       • There  is no longer an upper limit on the number of areas that can be output (previously
         it was 255).  More areas just mean potentially more passes being required over the  .osm
         file, and hence the splitter will take longer to run.

       • There is no longer a limit on how many areas a way or relation can belong to (previously
         it was 4).

SEE ALSO

       mkgmap(1), osmconvert(1)

                                         01 January 2023                       mkgmap-splitter(1)