Provided by: mlpack-bin_2.2.5-1build1_amd64 

NAME
mlpack_lsh - all k-approximate-nearest-neighbor search with lsh
SYNOPSIS
mlpack_lsh [-h] [-v]
DESCRIPTION
This program will calculate the k approximate-nearest-neighbors of a set of points using locality-
sensitive hashing. You may specify a separate set of reference points and query points, or just a
reference set which will be used as both the reference and query set.
For example, the following will return 5 neighbors from the data for each point in 'input.csv' and store
the distances in 'distances.csv' and the neighbors in the file 'neighbors.csv':
$ mlpack_lsh -k 5 -r input.csv -d distances.csv -n neighbors.csv
The output files are organized such that the entry at row i and column j of the neighbors output file is
the index of the point in the reference set that is the i'th nearest neighbor of the point in the
query set with index j. The entry at row i and column j of the distances output file is the distance
between those two points.
Because this is approximate-nearest-neighbors search, results may differ from run to run. To make runs
repeatable, set the random seed with the --seed option.
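A minimal sketch of a seeded run with a separate query set, using only options documented on this page. The file names and the tiny data set are hypothetical, and the sketch assumes CSV input with one point per row. With the same nonzero seed, both runs should build the same hash tables, so the two neighbor files are expected to match.

```shell
# Hypothetical working directory and file names, for illustration only.
workdir=$(mktemp -d)

# Six 2-D reference points and two query points (illustrative data).
printf '%s\n' '0,0' '0,1' '1,0' '1,1' '5,5' '5,6' > "$workdir/reference.csv"
printf '%s\n' '0.2,0.1' '5.1,5.4' > "$workdir/queries.csv"

if command -v mlpack_lsh >/dev/null 2>&1; then
  # Two runs with the same seed (-s 42) on the same data.
  mlpack_lsh -k 2 -r "$workdir/reference.csv" -q "$workdir/queries.csv" -s 42 \
    -n "$workdir/neighbors_a.csv" -d "$workdir/distances_a.csv"
  mlpack_lsh -k 2 -r "$workdir/reference.csv" -q "$workdir/queries.csv" -s 42 \
    -n "$workdir/neighbors_b.csv" -d "$workdir/distances_b.csv"
  # cmp exits 0 only when the two files are byte-for-byte identical.
  cmp "$workdir/neighbors_a.csv" "$workdir/neighbors_b.csv" \
    && echo "neighbors identical across seeded runs"
else
  echo "mlpack_lsh not found on PATH; skipping the search"
fi
```

Leaving the seed at its default of 0 seeds from the current time instead, so repeated runs may return different neighbors.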
OPTIONAL INPUT OPTIONS
--bucket_size (-B) [int]
The size of a bucket in the second level hash. Default value 500.
--hash_width (-H) [double]
The hash width for the first-level hashing in the LSH preprocessing. By default, the LSH class
automatically estimates a hash width for its use. Default value 0.
--help (-h)
Default help info.
--info [string]
Get help on a specific module or option. Default value ''.
--input_model_file (-m) [string]
File to load LSH model from. (Cannot be specified with --reference_file.) Default value ''.
--k (-k) [int]
Number of nearest neighbors to find. Default value 0.
--num_probes (-T) [int]
Number of additional probes for multiprobe LSH; if 0, traditional LSH is used. Default value
0.
--projections (-K) [int]
The number of hash functions for each table. Default value 10.
--query_file (-q) [string]
File containing query points (optional). Default value ''.
--reference_file (-r) [string]
File containing the reference dataset. Default value ''.
--second_hash_size (-S) [int]
The size of the second level hash table. Default value 99901.
--seed (-s) [int]
Random seed. If 0, 'std::time(NULL)' is used. Default value 0.
--tables (-L) [int]
The number of hash tables to be used. Default value 30.
--true_neighbors_file (-t) [string]
File of true neighbors to compute recall with (the recall is printed when -v is specified). Default
value ''.
--verbose (-v)
Display informational messages and the full list of parameters and timers at the end of execution.
--version (-V)
Display the version of mlpack.
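As a rough sketch of how the tuning options above might be combined with a recall check: the run below sets the number of tables (-L), the projections per table (-K), and the number of additional probes (-T), then passes a file of exact neighbors with -t so that -v prints the recall. The exact neighbors are produced here with mlpack_knn, a separate mlpack program not described on this page; its use, its flags, and the assumption that its neighbors file has the layout -t expects are all unverified here. File names, data, and parameter values are hypothetical.

```shell
# Hypothetical working directory and data, for illustration only.
workdir=$(mktemp -d)
printf '%s\n' '0,0' '0,1' '1,0' '1,1' '5,5' '5,6' > "$workdir/reference.csv"

if command -v mlpack_lsh >/dev/null 2>&1 && command -v mlpack_knn >/dev/null 2>&1; then
  # Exact 2-nearest neighbors, used as ground truth (assumes mlpack_knn is installed).
  mlpack_knn -k 2 -r "$workdir/reference.csv" -n "$workdir/true_neighbors.csv"
  # 20 tables, 8 projections per table, 2 additional probes (multiprobe LSH).
  # -t supplies the true neighbors; -v makes the program print the recall.
  mlpack_lsh -k 2 -r "$workdir/reference.csv" -L 20 -K 8 -T 2 -s 1 \
    -t "$workdir/true_neighbors.csv" -n "$workdir/neighbors.csv" -v
else
  echo "mlpack_lsh or mlpack_knn not found on PATH; skipping"
fi
```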
OPTIONAL OUTPUT OPTIONS
--distances_file (-d) [string]
File to output distances into. Default value ''.
--neighbors_file (-n) [string]
File to output neighbors into. Default value ''.
--output_model_file (-M) [string]
File to save LSH model to. Default value ''.
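The model options can be used to build the hash tables once and reuse them for later queries. The sketch below saves a model with -M during a first search, then loads it with -m for a second search against new query points; -r is omitted in the second run because --input_model_file cannot be combined with --reference_file. The file names, the data, and the .xml model extension are assumptions, not taken from this page.

```shell
# Hypothetical working directory and data, for illustration only.
workdir=$(mktemp -d)
printf '%s\n' '0,0' '0,1' '1,0' '1,1' '5,5' '5,6' > "$workdir/reference.csv"
printf '%s\n' '0.2,0.1' '5.1,5.4' > "$workdir/queries.csv"

if command -v mlpack_lsh >/dev/null 2>&1; then
  # First run: search the reference set against itself and save the LSH model.
  mlpack_lsh -k 2 -r "$workdir/reference.csv" -n "$workdir/neighbors.csv" \
    -M "$workdir/lsh_model.xml"
  # Second run: load the saved model (-m) instead of the reference file
  # and search with a separate query set.
  mlpack_lsh -k 2 -m "$workdir/lsh_model.xml" -q "$workdir/queries.csv" \
    -n "$workdir/query_neighbors.csv" -d "$workdir/query_distances.csv"
else
  echo "mlpack_lsh not found on PATH; skipping"
fi
```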
ADDITIONAL INFORMATION
For further information, including relevant papers, citations, and theory, consult the documentation
found at http://www.mlpack.org or included with your distribution of mlpack.
mlpack_lsh(16 November 2017)