Provided by: mlpack-bin_3.4.2-7ubuntu1_amd64

NAME

       mlpack_cf - collaborative filtering

SYNOPSIS

        mlpack_cf [-a string] [-A bool] [-m unknown] [-i string] [-I bool] [-N int] [-r double] [-S string] [-n int] [-z string] [-q string] [-R int] [-c int] [-s int] [-T string] [-t string] [-V bool] [-o string] [-M unknown] [-h -v]

DESCRIPTION

       This program performs collaborative filtering (CF) on the given dataset. Given a list of
       users, items, and preferences (the '--training_file (-t)' parameter), the program
       performs a matrix decomposition and can then carry out a series of actions related to
       collaborative filtering. Alternatively, the program can load an existing saved CF model
       with the '--input_model_file (-m)' parameter and then use that model to provide
       recommendations or predict values.

       The input should be a matrix of ratings in which every entry has three dimensions: the
       first dimension is the user, the second dimension is the item, and the third dimension
       is that user's rating of that item. Both the users and the items should be numeric
       indices, not names. The indices are assumed to start from 0.
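
       As a minimal sketch of the layout described above, the following writes a small ratings
       file and checks its shape. It assumes a comma-separated file with one
       "user,item,rating" triple per line; the rating values are invented for illustration,
       and the file name 'training_set.csv' is only chosen to match the examples in this page.

```shell
# Write a tiny ratings file: one "user,item,rating" triple per line.
# Users and items are numeric indices starting from 0; the values are made up.
cat > training_set.csv <<'EOF'
0,0,5.0
0,1,3.0
1,0,4.0
1,2,1.0
2,1,2.0
2,2,4.5
EOF

# Count lines that do not have exactly three fields, or whose user or item
# field is not a non-negative integer index.
bad_lines=$(awk -F',' 'NF != 3 || $1 !~ /^[0-9]+$/ || $2 !~ /^[0-9]+$/ { n++ } END { print n + 0 }' training_set.csv)
echo "malformed lines: $bad_lines"
```

       A non-zero count points at lines that use names instead of indices or that have a
       missing field.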

       A set of query users for which recommendations can be generated may be specified with the
       '--query_file (-q)' parameter; alternately, recommendations may be generated for every
       user in the dataset by specifying the '--all_user_recommendations (-A)' parameter. In
       addition, the number of recommendations per user to generate can be specified with the
       '--recommendations (-c)' parameter, and the number of similar users (the size of the
       neighborhood) to be considered when generating recommendations can be specified with the
       '--neighborhood (-n)' parameter.
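
       The parameters above might be combined as in the following sketch. It assumes that the
       query file holds one user index per line, which this page does not state explicitly,
       and that a ratings file named 'training_set.csv' already exists; both file names are
       hypothetical. The call is skipped when mlpack_cf or the ratings file is not available.

```shell
# Hypothetical query set: one user index per line (users 0 and 2).
printf '%s\n' 0 2 > users.csv

# Ask for 3 recommendations per query user, using a neighborhood of 2 similar users.
if command -v mlpack_cf >/dev/null 2>&1 && [ -f training_set.csv ]; then
  mlpack_cf --training_file training_set.csv \
            --query_file users.csv \
            --recommendations 3 \
            --neighborhood 2 \
            --output_file recommendations.csv \
    || echo "mlpack_cf run failed" >&2
else
  echo "mlpack_cf or training_set.csv not found; skipping the run"
fi
```

       To cover every user instead of a query set, the '--query_file (-q)' parameter would be
       replaced by the '--all_user_recommendations (-A)' flag.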

       For  performing  the  matrix  decomposition,  the following optimization algorithms can be
       specified via the '--algorithm (-a)' parameter:

              •  'RegSVD' -- Regularized SVD using an SGD optimizer

              •  'NMF' -- Non-negative matrix factorization with alternating least squares update
                 rules

              •  'BatchSVD' -- SVD batch learning

              •  'SVDIncompleteIncremental' -- SVD incomplete incremental learning

              •  'SVDCompleteIncremental' -- SVD complete incremental learning

              •  'BiasSVD' -- Bias SVD using an SGD optimizer

              •  'SVDPP' -- SVD++ using an SGD optimizer

       The following neighbor search algorithms can be specified via the
       '--neighbor_search (-S)' parameter:

              •  'cosine' -- Cosine Search Algorithm

              •  'euclidean' -- Euclidean Search Algorithm

              •  'pearson' -- Pearson Search Algorithm

       The following weight interpolation algorithms can be specified via the
       '--interpolation (-i)' parameter:

              •  'average' -- Average Interpolation Algorithm

              •  'regression' -- Regression Interpolation Algorithm

              •  'similarity' -- Similarity Interpolation Algorithm

       The following rating normalization algorithms can be specified via the
       '--normalization (-z)' parameter:

              •  'none' -- No Normalization

              •  'item_mean' -- Item Mean Normalization

              •  'overall_mean' -- Overall Mean Normalization

              •  'user_mean' -- User Mean Normalization

              •  'z_score' -- Z-Score Normalization
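
       The four choices listed above are independent of one another, so one value from each
       list can be passed in a single call. The sketch below picks one arbitrary combination,
       checks the normalization name against the documented list, and prints the resulting
       command line before running it. The file names are hypothetical, and the combination
       is not a recommendation.

```shell
# One value from each of the lists above; change them to compare configurations.
algorithm=RegSVD
neighbor_search=cosine
interpolation=similarity
normalization=user_mean

# Reject a normalization name that is not in the documented list.
case "$normalization" in
  none|item_mean|overall_mean|user_mean|z_score) ;;
  *) echo "unknown normalization: $normalization" >&2; exit 1 ;;
esac

# Assemble the command as positional parameters so it can be printed and then run.
set -- mlpack_cf --training_file training_set.csv \
       --algorithm "$algorithm" \
       --neighbor_search "$neighbor_search" \
       --interpolation "$interpolation" \
       --normalization "$normalization" \
       --output_model_file model.bin
cmd_line="$*"
echo "$cmd_line"

# Run only when the program and the input file are available.
if command -v mlpack_cf >/dev/null 2>&1 && [ -f training_set.csv ]; then
  "$@" || echo "mlpack_cf run failed" >&2
fi
```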

       A trained model may be saved with the '--output_model_file (-M)' output parameter.

       To train a CF model on a dataset 'training_set.csv' using NMF for decomposition and
       saving the trained model to 'model.bin', one could call:

       $ mlpack_cf --training_file training_set.csv --algorithm NMF \
           --output_model_file model.bin

       Then, to use this model to generate recommendations for the list of users in the
       query set 'users.csv', storing 5 recommendations in 'recommendations.csv', one
       could call:

       $ mlpack_cf --input_model_file model.bin --query_file users.csv --recommendations 5 \
           --output_file recommendations.csv
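
       A held-out evaluation with the '--test_file (-T)' parameter, described under OPTIONAL
       INPUT OPTIONS, might look like the following sketch. The ratings and the file names
       'ratings.csv', 'train_set.csv' and 'test_set.csv' are hypothetical, the every-fifth-line
       split is only one simple way to hold out data, and '--verbose' is added on the
       assumption that the RMSE is reported among the informational messages.

```shell
# Hypothetical ratings, one "user,item,rating" triple per line.
cat > ratings.csv <<'EOF'
0,0,5.0
0,1,3.0
1,0,4.0
1,2,1.0
0,2,4.0
2,1,2.0
2,2,4.5
3,0,1.0
3,2,3.0
2,0,3.5
EOF

# Deterministic split: every fifth line goes to the test set, the rest to the training set.
awk 'NR % 5 == 0 { print > "test_set.csv"; next } { print > "train_set.csv" }' ratings.csv

# Train on the training set and compute the RMSE on the held-out ratings.
# A fixed non-zero seed keeps repeated runs comparable.
if command -v mlpack_cf >/dev/null 2>&1; then
  mlpack_cf --training_file train_set.csv --test_file test_set.csv \
            --algorithm NMF --seed 42 --verbose \
    || echo "mlpack_cf run failed" >&2
fi
```

       The sample data is arranged so that every user and item in the test set also appears
       in the training set.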

OPTIONAL INPUT OPTIONS

       --algorithm (-a) [string]
              Algorithm used for matrix factorization.  Default value 'NMF'.

       --all_user_recommendations (-A) [bool]
              Generate recommendations for all users.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --input_model_file (-m) [unknown]
              Trained CF model to load.

       --interpolation (-i) [string]
              Algorithm used for weight interpolation.  Default value 'average'.

       --iteration_only_termination (-I) [bool]
              Terminate only when the maximum number of iterations is reached.

       --max_iterations (-N) [int]
              Maximum number of iterations. If set to zero, there is no limit on  the  number  of
              iterations.  Default value 1000.

       --min_residue (-r) [double]
              Residue required to terminate the factorization (lower values generally mean better
              fits).  Default value 1e-05.

       --neighbor_search (-S) [string]
              Algorithm used for neighbor search. Default value 'euclidean'.

       --neighborhood (-n) [int]
              Size of the neighborhood of similar users to consider for each query user.  Default
              value 5.

       --normalization (-z) [string]
              Normalization performed on the ratings. Default value 'none'.

       --query_file (-q) [string]
              List of query users for which recommendations should be generated.

       --rank (-R) [int]
              Rank of decomposed matrices (if 0, a heuristic is used to estimate the rank).
              Default value 0.

       --recommendations (-c) [int]
              Number of recommendations to generate for each query user. Default value 5.

       --seed (-s) [int]
              Set the random seed (0 uses std::time(NULL)).  Default value 0.

       --test_file (-T) [string]
              Test set to calculate RMSE on.

       --training_file (-t) [string]
              Input dataset to perform CF on.

       --verbose (-v) [bool]
              Display  informational  messages  and the full list of parameters and timers at the
              end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_file (-o) [string]
              Matrix that will store output recommendations.

       --output_model_file (-M) [unknown]
              Output for trained CF model.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations,  and  theory,  consult  the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.