
Provided by: python3-tpot_0.11.7+dfsg-5_all

NAME

       tpot - Automated Machine Learning tool

DESCRIPTION

       usage: tpot [-h] [-is INPUT_SEPARATOR] [-target TARGET_NAME]

              [-mode   {classification,regression}]   [-o   OUTPUT_FILE]   [-g  GENERATIONS]  [-p
              POPULATION_SIZE] [-os  OFFSPRING_SIZE]  [-mr  MUTATION_RATE]  [-xr  CROSSOVER_RATE]
              [-scoring   SCORING_FN]  [-cv  NUM_CV_FOLDS]  [-sub  SUBSAMPLE]  [-njobs  NUM_JOBS]
              [-maxtime  MAX_TIME_MINS]  [-maxeval  MAX_EVAL_MINS]  [-s  RANDOM_STATE]   [-config
              CONFIG_FILE]  [-template  TEMPLATE]  [-memory  MEMORY] [-cf CHECKPOINT_FOLDER] [-es
              EARLY_STOP] [-v {0,1,2,3}] [-log LOG] [--version] INPUT_FILE

       A Python tool that automatically creates and optimizes machine  learning  pipelines  using
       genetic programming.

   positional arguments:
       INPUT_FILE
              Data  file  to  use  in the TPOT optimization process.  Ensure that the class label
              column is labeled as "class".

   options:
       -h, --help
              Show this help message and exit.

       -is INPUT_SEPARATOR
              Character used to separate columns in the input file.

       -target TARGET_NAME
              Name of the target column in the input file.

       -mode {classification,regression}
              Whether TPOT is being used for a supervised classification or regression problem.

       -o OUTPUT_FILE
              File to export the code for the final optimized pipeline.

       -g GENERATIONS
              Number of iterations to run  the  pipeline  optimization  process.  It  must  be  a
              positive  number  or  None. If None, the parameter max_time_mins must be defined as
              the runtime limit.  Generally,  TPOT  will  work  better  when  you  give  it  more
              generations  (and  therefore  time)  to  optimize  the pipeline. TPOT will evaluate
              POPULATION_SIZE + GENERATIONS x OFFSPRING_SIZE pipelines in total.
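
              The total quoted above is plain arithmetic. A minimal Python sketch follows; the
              function name total_pipelines is hypothetical and not part of TPOT, and the default
              for offspring_size mirrors the documented OFFSPRING_SIZE = POPULATION_SIZE default.

```python
from typing import Optional


def total_pipelines(population_size: int,
                    generations: int,
                    offspring_size: Optional[int] = None) -> int:
    """Return POPULATION_SIZE + GENERATIONS x OFFSPRING_SIZE."""
    # Hypothetical helper: when no offspring size is given, fall back to
    # the population size, as the -os option documents.
    if offspring_size is None:
        offspring_size = population_size
    return population_size + generations * offspring_size


# 100 individuals, 100 generations, default offspring size.
print(total_pipelines(100, 100))  # prints 10100
```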

       -p POPULATION_SIZE
              Number of individuals to retain in the GP population every  generation.  Generally,
              TPOT  will  work  better  when you give it more individuals (and therefore time) to
              optimize  the  pipeline.  TPOT  will  evaluate  POPULATION_SIZE  +  GENERATIONS   x
              OFFSPRING_SIZE pipelines in total.

       -os OFFSPRING_SIZE
              Number of offspring to produce in each GP generation. By default, OFFSPRING_SIZE =
              POPULATION_SIZE.

       -mr MUTATION_RATE
              GP mutation rate in the range [0.0, 1.0]. This tells  the  GP  algorithm  how  many
              pipelines  to  apply  random  changes  to  every generation. We recommend using the
              default  parameter  unless  you  understand  how  the  mutation  rate  affects   GP
              algorithms.

       -xr CROSSOVER_RATE
              GP  crossover  rate  in  the range [0.0, 1.0]. This tells the GP algorithm how many
              pipelines to "breed" every generation. We recommend  using  the  default  parameter
              unless you understand how the crossover rate affects GP algorithms.

       -scoring SCORING_FN
              Function used to evaluate the quality of a given pipeline for the problem. By
              default, accuracy is used for classification problems and mean squared error (mse)
              is used for regression problems. Note: If you wrote your own function, set this
              argument to mymodule.myfunction and TPOT will import your module and take the
              function from there. TPOT will assume the module can be imported from the current
              working directory. TPOT assumes that any function with "error" or "loss" in the
              name is meant to be minimized, whereas any other function will be maximized. Offers
              the same options as cross_val_score: accuracy, adjusted_rand_score,
              average_precision, f1, f1_macro, f1_micro, f1_samples, f1_weighted, neg_log_loss,
              neg_mean_absolute_error, neg_mean_squared_error, neg_median_absolute_error,
              precision, precision_macro, precision_micro, precision_samples, precision_weighted,
              r2, recall, recall_macro, recall_micro, recall_samples, recall_weighted, roc_auc.

       -cv NUM_CV_FOLDS
              Number  of  folds  to  evaluate   each   pipeline   over   in   stratified   k-fold
              cross-validation during the TPOT optimization process.

       -sub SUBSAMPLE
              Subsample ratio of the training instances. Setting it to 0.5 means that TPOT will
              use a random subsample of half of the training data for the pipeline optimization
              process.

       -njobs NUM_JOBS
              Number  of  CPUs  for evaluating pipelines in parallel during the TPOT optimization
              process. Assigning this to -1 will use as many cores as available on the  computer.
              For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs
              but one are used.
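
              The rule for negative values can be sketched in Python as follows. The helper name
              effective_cpus is hypothetical. Rejecting 0 and clamping the result to at least one
              CPU are assumptions of this sketch, not statements from this page.

```python
def effective_cpus(n_jobs: int, n_cpus: int) -> int:
    """Return how many CPUs a given NUM_JOBS value would use."""
    # Assumption of this sketch: 0 is treated as an invalid value.
    if n_jobs == 0:
        raise ValueError("n_jobs must not be 0")
    # Positive values are taken literally.
    if n_jobs > 0:
        return n_jobs
    # Negative values: n_cpus + 1 + n_jobs, so -1 means all CPUs and
    # -2 means all CPUs but one. The clamp to 1 is an assumption.
    return max(1, n_cpus + 1 + n_jobs)


print(effective_cpus(-1, 8))  # prints 8
print(effective_cpus(-2, 8))  # prints 7
```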

       -maxtime MAX_TIME_MINS
              How many minutes TPOT has to optimize the pipeline. If not None, this setting will
              allow TPOT to run until max_time_mins minutes have elapsed and then stop. TPOT will
              stop earlier if generations is set and all generations have already been evaluated.

       -maxeval MAX_EVAL_MINS
              How many minutes TPOT has to evaluate a single pipeline. Setting this parameter  to
              higher values will allow TPOT to explore more complex pipelines but will also allow
              TPOT to run longer.

       -s RANDOM_STATE
              Random number generator seed for reproducibility. Set this seed if  you  want  your
              TPOT run to be reproducible with the same seed and data set in the future.

       -config CONFIG_FILE
              Configuration file for customizing the operators and parameters that TPOT uses in
              the optimization process. Must be a Python module containing a dict export named
              "tpot_config" or the name of a built-in configuration.
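
              A minimal sketch of such a module is shown below. It assumes the layout used by
              TPOT's documented custom configurations, where each key is the import path of an
              operator and each value maps parameter names to candidate values. The file name
              my_tpot_config.py and the chosen operators are illustrative only.

```python
# my_tpot_config.py (hypothetical file name)
# Restricts the search to three naive Bayes classifiers from scikit-learn.
tpot_config = {
    # An empty dict means the operator has no parameters to tune.
    "sklearn.naive_bayes.GaussianNB": {},
    "sklearn.naive_bayes.BernoulliNB": {
        "alpha": [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0],
        "fit_prior": [True, False],
    },
    "sklearn.naive_bayes.MultinomialNB": {
        "alpha": [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0],
        "fit_prior": [True, False],
    },
}
```

              Such a file would then be passed as CONFIG_FILE, for example
              "-config my_tpot_config.py".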

       -template TEMPLATE
              Template of a predefined pipeline structure. This option specifies a desired
              structure for the machine learning pipeline evaluated in TPOT. So far this option
              only supports linear pipeline structures. Each step in the pipeline should be a
              main class of operators (Selector, Transformer, Classifier or Regressor) or a
              specific operator (e.g. SelectPercentile) defined in the TPOT operator
              configuration. If one step is a main class, TPOT will randomly assign all subclass
              operators (subclasses of SelectorMixin, TransformerMixin, ClassifierMixin or
              RegressorMixin in scikit-learn) to that step. Steps in the template are delimited
              by "-", e.g. "SelectPercentile-Transformer-Classifier". By default, the value of
              template is None and TPOT generates tree-based pipelines randomly.
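
              The template syntax can be illustrated with a short Python sketch. This is not
              TPOT's own parser; the names MAIN_CLASSES and parse_template are hypothetical and
              only show how a template string breaks down into steps.

```python
from typing import List, Optional, Tuple

# The four main operator classes named in the option description.
MAIN_CLASSES = {"Selector", "Transformer", "Classifier", "Regressor"}


def parse_template(template: Optional[str]) -> List[Tuple[str, str]]:
    """Split a template on "-" and label each step."""
    # None means no template: the pipeline structure is left unconstrained.
    if template is None:
        return []
    steps = template.split("-")
    return [
        (step, "main class" if step in MAIN_CLASSES else "specific operator")
        for step in steps
    ]


for name, kind in parse_template("SelectPercentile-Transformer-Classifier"):
    print(name, "->", kind)
```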

       -memory MEMORY
              Path of a directory for pipeline caching or "auto" for using a temporary caching
              directory during the optimization process. If supplied, pipelines will cache each
              transformer after fitting it. This feature avoids repeated computation by
              transformers within a pipeline when the parameters and input data are identical to
              those of another pipeline fitted during the optimization process.

       -cf CHECKPOINT_FOLDER
              If supplied, a folder in which TPOT will periodically save the best pipeline so far
              while optimizing. This is useful in several cases: a crash before TPOT could save
              an optimized pipeline, progress tracking, grabbing a pipeline while TPOT is still
              optimizing, etc.

       -es EARLY_STOP
              Number of generations over which TPOT checks for improvement in the optimization
              process. TPOT ends the optimization process if there is no improvement within this
              number of generations.
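
              The stopping rule can be sketched in Python as follows. This is an illustration of
              the idea, not TPOT's implementation; the function name should_stop is hypothetical,
              and the sketch assumes one best score per generation where higher is better.

```python
from typing import List, Optional


def should_stop(best_scores: List[float], early_stop: Optional[int]) -> bool:
    """Return True if the last early_stop generations brought no improvement."""
    # No early stopping configured, or not enough history yet.
    if early_stop is None or early_stop < 1 or len(best_scores) <= early_stop:
        return False
    # Best score seen before the most recent early_stop generations.
    best_before = max(best_scores[:-early_stop])
    # Best score seen within the most recent early_stop generations.
    recent_best = max(best_scores[-early_stop:])
    return recent_best <= best_before


print(should_stop([0.80, 0.85, 0.85, 0.85], early_stop=2))  # prints True
print(should_stop([0.80, 0.85, 0.86], early_stop=2))        # prints False
```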

       -v {0,1,2,3}
              How  much information TPOT communicates while it is running: 0 = none, 1 = minimal,
              2 = high, 3 = all. A setting of 2 or higher will add  a  progress  bar  during  the
              optimization procedure.

       -log LOG
              Save progress content to a file.

       --version
              Show the TPOT version number and exit.
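
EXAMPLES

       A sketch of how the options above combine into one command line, assembled in Python so
       that each flag sits next to its value. The file names data.csv and pipeline.py and all
       numeric values are illustrative assumptions; only the flags come from this page.

```python
import shlex

# Build the argument list: the positional INPUT_FILE, then the options.
cmd = [
    "tpot", "data.csv",          # hypothetical input file
    "-is", ",",                  # columns are comma-separated
    "-target", "class",          # name of the target column
    "-mode", "classification",
    "-o", "pipeline.py",         # hypothetical output file
    "-g", "5",                   # generations
    "-p", "20",                  # population size
    "-cv", "5",                  # cross-validation folds
    "-s", "42",                  # random seed for reproducibility
    "-v", "2",                   # verbosity with progress bar
]

# Print a shell-quoted version of the command for copy and paste.
print(shlex.join(cmd))
```

       The list form can also be handed to subprocess.run(cmd) to launch the run from a script.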