Provided by: mlpack-bin_3.4.2-7ubuntu1_amd64

NAME

       mlpack_dbscan - DBSCAN clustering

SYNOPSIS

       mlpack_dbscan -i string [-e double] [-m int] [-N bool] [-s string] [-S bool] [-t string] [-V bool] [-a string] [-C string] [-h -v]

DESCRIPTION

       This program implements the DBSCAN algorithm for clustering using accelerated tree-based
       range search. The type of tree used may be selected, or brute-force range search may be
       used instead.

       The input dataset to be clustered may be specified with the '--input_file (-i)' parameter;
       the radius of each range search may be specified with the '--epsilon (-e)' parameter, and
       the minimum number of points in a cluster may be specified with the '--min_size (-m)'
       parameter.
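
       To illustrate how the radius and the minimum size interact, the following is a minimal
       sketch of textbook DBSCAN with brute-force range search. It is not mlpack's
       implementation: the names dbscan, range_search and NOISE are hypothetical, the label -1
       for unclustered points is a convention of this sketch, and counting a point as its own
       neighbor when comparing against min_size is an assumption.

```python
import math

NOISE = -1  # label this sketch gives to points that join no cluster (an assumption)


def range_search(points, index, epsilon):
    """Brute-force range search: indices of all points within epsilon of points[index]."""
    center = points[index]
    return [j for j, q in enumerate(points) if math.dist(center, q) <= epsilon]


def dbscan(points, epsilon=1.0, min_size=5):
    """Return one cluster label per point, or NOISE for unclustered points."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = range_search(points, i, epsilon)
        if len(neighbors) < min_size:
            labels[i] = NOISE  # may later be claimed by a cluster as a border point
            continue
        labels[i] = cluster
        frontier = list(neighbors)
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, does not expand it
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = range_search(points, j, epsilon)
            if len(j_neighbors) >= min_size:
                frontier.extend(j_neighbors)  # dense point: keep growing the cluster
        cluster += 1
    return labels
```

       With a small epsilon, fewer points fall inside each range search, so more points end up
       unclustered; a larger min_size has a similar effect.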

       The '--assignments_file (-a)' and '--centroids_file (-C)' output parameters may be used to
       save  the  output  of  the  clustering.  '--assignments_file  (-a)'  contains  the cluster
       assignments of each point, and '--centroids_file (-C)'  contains  the  centroids  of  each
       cluster.
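
       As a rough illustration of how the two outputs relate, the sketch below derives one
       centroid per cluster from a list of per-point assignments. It assumes that a centroid is
       the arithmetic mean of the points assigned to a cluster and that unclustered points carry
       the label -1; both are assumptions of this sketch, and the function name
       cluster_centroids is hypothetical.

```python
def cluster_centroids(points, assignments, noise=-1):
    """Map each cluster label to the mean of the points assigned to it."""
    sums = {}
    counts = {}
    for point, label in zip(points, assignments):
        if label == noise:
            continue  # unclustered points contribute to no centroid
        if label not in sums:
            sums[label] = [0.0] * len(point)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], point)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sorted(sums)}
```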

       The range search may be controlled with the '--tree_type (-t)', '--single_mode (-S)', and
       '--naive (-N)' parameters. '--tree_type (-t)' selects the type of tree used for range
       search and accepts one of the following values: 'kd', 'r', 'r-star', 'x', 'hilbert-r',
       'r-plus', 'r-plus-plus', 'cover', 'ball'. The '--single_mode (-S)' parameter forces
       single-tree search (as opposed to the default dual-tree search), and '--naive (-N)'
       forces brute-force range search.

       An example usage to run DBSCAN on the dataset in 'input.csv' with a radius of  0.5  and  a
       minimum cluster size of 5 is given below:

       $ mlpack_dbscan --input_file input.csv --epsilon 0.5 --min_size 5
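
       When the program is driven from a script, the same command line can be assembled
       programmatically. The sketch below uses only the options documented on this page; the
       helper names build_dbscan_command and run_dbscan are hypothetical, and run_dbscan assumes
       that mlpack_dbscan is installed and on the PATH.

```python
import shutil
import subprocess

# Tree types listed in the description of --tree_type on this page.
TREE_TYPES = ("kd", "r", "r-star", "x", "hilbert-r",
              "r-plus", "r-plus-plus", "cover", "ball")


def build_dbscan_command(input_file, epsilon=1.0, min_size=5,
                         assignments_file=None, centroids_file=None,
                         tree_type=None, single_mode=False, naive=False):
    """Return the argument list for an mlpack_dbscan invocation."""
    argv = ["mlpack_dbscan", "--input_file", input_file,
            "--epsilon", str(epsilon), "--min_size", str(min_size)]
    if tree_type is not None:
        if tree_type not in TREE_TYPES:
            raise ValueError(f"unknown tree type: {tree_type!r}")
        argv += ["--tree_type", tree_type]
    if single_mode:
        argv.append("--single_mode")  # boolean flags take no value
    if naive:
        argv.append("--naive")
    if assignments_file is not None:
        argv += ["--assignments_file", assignments_file]
    if centroids_file is not None:
        argv += ["--centroids_file", centroids_file]
    return argv


def run_dbscan(**options):
    """Run mlpack_dbscan with the given options; fails if the binary is missing."""
    if shutil.which("mlpack_dbscan") is None:
        raise FileNotFoundError("mlpack_dbscan was not found on PATH")
    return subprocess.run(build_dbscan_command(**options), check=True)
```

       For instance, run_dbscan(input_file="input.csv", epsilon=0.5, min_size=5,
       assignments_file="assignments.csv") would run the example above and also save the
       cluster assignments; the output file name is only a placeholder.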

REQUIRED INPUT OPTIONS

       --input_file (-i) [string]
              Input dataset to cluster.

OPTIONAL INPUT OPTIONS

       --epsilon (-e) [double]
              Radius of each range search. Default value 1.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --min_size (-m) [int]
              Minimum number of points for a cluster. Default value 5.

       --naive (-N) [bool]
              If set, brute-force range search (not tree-based) will be used.

       --selection_type (-s) [string]
              If using a point selection policy, the type of selection to use ('ordered',
              'random'). Default value 'ordered'.

       --single_mode (-S) [bool]
              If set, single-tree range search (not dual-tree) will be used.

       --tree_type (-t) [string]
              If using single-tree or dual-tree search, the type of tree to use ('kd', 'r',
              'r-star', 'x', 'hilbert-r', 'r-plus', 'r-plus-plus', 'cover', 'ball'). Default
              value 'kd'.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and  timers  at  the
              end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --assignments_file (-a) [string]
              Output matrix for assignments of each point.

       --centroids_file (-C) [string]
              Matrix to save output centroids to.

ADDITIONAL INFORMATION

       For  further  information,  including  relevant papers, citations, and theory, consult the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.