Provided by: macsyfinder_1.0.2-1build1_all

NAME

       macsyfinder - manual page for macsyfinder 1.0.2

DESCRIPTION

       usage: macsyfinder [-h] [--sequence-db SEQUENCE_DB]
              [--db-type {unordered_replicon,ordered_replicon,gembase,unordered}]
              [--replicon-topology {linear,circular}]
              [--topology-file TOPOLOGY_FILE] [--idx]
              [--inter-gene-max-space INTER_GENE_MAX_SPACE INTER_GENE_MAX_SPACE]
              [--min-mandatory-genes-required MIN_MANDATORY_GENES_REQUIRED
              MIN_MANDATORY_GENES_REQUIRED]
              [--min-genes-required MIN_GENES_REQUIRED MIN_GENES_REQUIRED]
              [--max-nb-genes MAX_NB_GENES MAX_NB_GENES]
              [--multi-loci MULTI_LOCI] [--hmmer HMMER_EXE]
              [--index-db INDEX_DB_EXE] [--e-value-search E_VALUE_RES]
              [--i-evalue-select I_EVALUE_SEL]
              [--coverage-profile COVERAGE_PROFILE] [-d DEF_DIR] [-o OUT_DIR]
              [-r RES_SEARCH_DIR] [--res-search-suffix RES_SEARCH_SUFFIX]
              [--res-extract-suffix RES_EXTRACT_SUFFIX] [-p PROFILE_DIR]
              [--profile-suffix PROFILE_SUFFIX] [-w WORKER_NB] [-v]
              [--log LOG_FILE] [--config CFG_FILE]
              [--previous-run PREVIOUS_RUN]
              systems [systems ...]

       MacSyFinder is a tool for the detection of protein secretion systems  of  diderm  bacteria
       from a protein dataset.

   positional arguments:
       systems
              The systems to detect. This argument is mandatory and has no associated keyword. To
              detect all the protein secretion systems and related appendages, set it to "all"
              (case insensitive). Otherwise, specify one or more systems, for example: "T2SS T4P".

   optional arguments:
       -h, --help
              show this help message and exit

   Input dataset options:
       --sequence-db SEQUENCE_DB
              Path to the sequence dataset in fasta format.

       --db-type {unordered_replicon,ordered_replicon,gembase,unordered}
              The  type  of  dataset  to  deal  with.  "unordered_replicon"  corresponds   to   a
              non-assembled  genome,  "unordered" to a metagenomic dataset, "ordered_replicon" to
              an assembled genome, and "gembase" to a set of replicons where sequence identifiers
              follow this convention: ">RepliconName SequenceID".

       --replicon-topology {linear,circular}
              The topology of the replicons (this option is meaningful only if the db_type is
              'ordered_replicon' or 'gembase').

       --topology-file TOPOLOGY_FILE
              Topology file path. The topology file allows one to specify a topology (linear or
              circular) for each replicon (this option is meaningful only if the db_type is
              'ordered_replicon' or 'gembase'). A topology file is a tabular file with two
              columns: the 1st is the replicon name, and the 2nd the corresponding topology,
              for example: "RepliconA linear"

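              As an illustration of the two-column layout described for --topology-file, the
              following minimal sketch writes such a file from a mapping. The helper name
              write_topology_file and the single-space column separator are assumptions (this
              page only shows the example line "RepliconA linear"); check the MacSyFinder
              documentation for the exact separator expected.

```python
from pathlib import Path

# The two topology values accepted by --replicon-topology.
VALID_TOPOLOGIES = {"linear", "circular"}


def write_topology_file(path, topologies, sep=" "):
    """Write one "<replicon><sep><topology>" line per replicon.

    `sep` is an assumption: the man page does not state the column separator.
    Returns the list of lines written (without newlines).
    """
    lines = []
    for replicon, topology in topologies.items():
        if topology not in VALID_TOPOLOGIES:
            raise ValueError(f"{replicon}: unknown topology {topology!r}")
        lines.append(f"{replicon}{sep}{topology}")
    Path(path).write_text("\n".join(lines) + "\n")
    return lines
```
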
       --idx  Forces the indexes for the sequence dataset to be rebuilt even if they were
              previously computed and are present at the dataset location (default = False)

   Systems detection options:
       --inter-gene-max-space INTER_GENE_MAX_SPACE INTER_GENE_MAX_SPACE
              Co-localization criterion: maximum number of components not matched by a profile
              allowed between two matched components for them to be considered contiguous. This
              option is only meaningful for 'ordered' datasets. The first value must be a system
              name, the second a number of components. This option can be repeated several times:
              "--inter-gene-max-space T2SS 12 --inter-gene-max-space Flagellum 20"

       --min-mandatory-genes-required MIN_MANDATORY_GENES_REQUIRED MIN_MANDATORY_GENES_REQUIRED
              The minimal number of mandatory genes required for system assessment. The first
              value must correspond to a system name, the second value to an integer. This option
              can be repeated several times: "--min-mandatory-genes-required T2SS 15
              --min-mandatory-genes-required Flagellum 10"

       --min-genes-required MIN_GENES_REQUIRED MIN_GENES_REQUIRED
              The minimal number of genes required for system assessment (includes both
              'mandatory' and 'accessory' components). The first value must correspond to a
              system name, the second value to an integer. This option can be repeated several
              times: "--min-genes-required T2SS 15 --min-genes-required Flagellum 10"

       --max-nb-genes MAX_NB_GENES MAX_NB_GENES
              The  maximal  number  of genes required for system assessment. The first value must
              correspond to a system name, the second value to an integer.  This  option  can  be
              repeated several times: "--max-nb-genes T2SS 5 --max-nb-genes Flagellum 10"
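
              The per-system options above all take a system name followed by an integer and
              may be repeated. As a minimal sketch, a script could assemble such a command
              line as follows. The helper names and the file name genome.fasta are
              hypothetical; only the option names come from this page, and the command is
              built but not executed.

```python
def per_system_args(option, values):
    """Expand {"T2SS": 12} into [option, "T2SS", "12"], one pair per system."""
    args = []
    for system, value in values.items():
        args += [option, system, str(int(value))]
    return args


def build_command(sequence_db, db_type, systems, **thresholds):
    """Assemble a macsyfinder argument list.

    Keyword names map to long options, e.g. inter_gene_max_space
    becomes --inter-gene-max-space.
    """
    cmd = ["macsyfinder", "--sequence-db", sequence_db, "--db-type", db_type]
    for name, values in thresholds.items():
        cmd += per_system_args("--" + name.replace("_", "-"), values)
    # The systems to detect are positional and go last.
    return cmd + list(systems)


cmd = build_command(
    "genome.fasta",  # hypothetical input file
    "ordered_replicon",
    ["T2SS", "Flagellum"],
    inter_gene_max_space={"T2SS": 12, "Flagellum": 20},
    min_genes_required={"T2SS": 15},
)
print(" ".join(cmd))
```

              The resulting list can be passed to subprocess.run once the paths have been
              adapted to the local installation.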

       --multi-loci MULTI_LOCI
              Allow the storage of multi-loci systems for the specified systems. The systems are
              specified as a comma-separated list (--multi-loci sys1,sys2). Default is False.

   Options for Hmmer execution and hits filtering:
       --hmmer HMMER_EXE
              Path to the Hmmer program.

       --index-db INDEX_DB_EXE
              The indexer to be used for Hmmer. The value can be either 'makeblastdb' or
              'formatdb' or the path to one of these binaries (default = makeblastdb)

       --e-value-search E_VALUE_RES
              Maximal e-value for hits to be reported during Hmmer search. (default = 1)

       --i-evalue-select I_EVALUE_SEL
              Maximal  independent  e-value  for  Hmmer hits to be selected for system detection.
              (default = 0.001)

       --coverage-profile COVERAGE_PROFILE
              Minimal profile coverage required in the hit alignment to allow the  hit  selection
              for system detection.  (default = 0.5)
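
              The two selection thresholds can be read as a simple predicate on each hit. The
              sketch below is an illustration only: the function name is hypothetical, and
              treating both bounds as inclusive is an assumption that this page does not
              confirm.

```python
def hit_is_selected(i_evalue, profile_coverage,
                    max_i_evalue=0.001, min_coverage=0.5):
    """Return True when a hit passes both selection thresholds.

    Defaults mirror --i-evalue-select (0.001) and --coverage-profile (0.5).
    Inclusive comparisons are an assumption.
    """
    return i_evalue <= max_i_evalue and profile_coverage >= min_coverage
```

              For instance, a hit with an independent e-value of 1e-10 that covers 80% of the
              profile passes with the defaults, while the same hit covering only 30% of the
              profile does not.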

   Path options:
       -d DEF_DIR, --def DEF_DIR
              Path to the systems definition files.

       -o OUT_DIR, --out-dir OUT_DIR
              Path to the directory where to store results. If --out-dir is specified,
              --res-search-dir is ignored.

       -r RES_SEARCH_DIR, --res-search-dir RES_SEARCH_DIR
              Path to the  directory  where  to  store  MacSyFinder  search  results  directories
              (default current working directory).

       --res-search-suffix RES_SEARCH_SUFFIX
              The suffix to give to Hmmer raw output files.

       --res-extract-suffix RES_EXTRACT_SUFFIX
              The suffix to give to filtered hits output files.

       -p PROFILE_DIR, --profile-dir PROFILE_DIR
              Path to the profiles directory.

       --profile-suffix PROFILE_SUFFIX
              The suffix of profile files. For each 'Gene' element, the corresponding profile is
              searched in the 'profile_dir', in a file whose name is the Gene name followed by the
              profile suffix. For instance, if the Gene is named 'gspG' and the suffix is
              '.hmm3', then the profile should be placed at the specified location and be named
              'gspG.hmm3'.
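
              The naming rule (Gene name followed by the profile suffix) can be checked before
              a run. This is a minimal sketch with hypothetical helper names; it only tests
              that the expected files exist and does not validate their content.

```python
from pathlib import Path


def profile_path(profile_dir, gene_name, profile_suffix):
    """Expected profile location: <profile_dir>/<gene_name><profile_suffix>."""
    return Path(profile_dir) / f"{gene_name}{profile_suffix}"


def missing_profiles(profile_dir, gene_names, profile_suffix):
    """Return the gene names whose profile file is not present."""
    return [
        gene for gene in gene_names
        if not profile_path(profile_dir, gene, profile_suffix).is_file()
    ]
```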

   General options:
       -w WORKER_NB, --worker WORKER_NB
              Number of workers to be used by MacSyFinder, for users who want to run MacSyFinder
              in multithread mode (0 means all cores will be used, default 1).

       -v, --verbosity
              Increases the verbosity level. There are 4 levels: Error messages (default),
              Warning (-v), Info (-vv) and Debug (-vvv).
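
              When a wrapper script builds the command line, the four levels map directly onto
              the number of repeated "v" flags. The helper below is a hypothetical sketch of
              that mapping.

```python
# Levels in increasing verbosity; the index is the number of "v" to pass.
LEVELS = ["error", "warning", "info", "debug"]


def verbosity_args(level):
    """Return the flag list for a level name, e.g. "info" gives ["-vv"]."""
    count = LEVELS.index(level.lower())  # raises ValueError for unknown names
    return ["-" + "v" * count] if count else []
```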

       --log LOG_FILE
              Path to the directory where to store the 'macsyfinder.log' log file.

       --config CFG_FILE
              Path to a putative MacSyFinder configuration file to be used.

       --previous-run PREVIOUS_RUN
              Path to a previous MacSyFinder run directory. It allows one to skip the Hmmer
              search step on the same dataset, as it uses the previous run results and thus the
              parameters regarding Hmmer detection. The configuration file from this previous run
              will be used. (Conflicts with options --config, --sequence-db, --profile-suffix,
              --res-extract-suffix, --e-value-res, --db-type, --hmmer)
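
              A wrapper can detect these conflicts before launching a run. The sketch below
              uses hypothetical helper names and spells the options as in the synopsis; in
              particular, it assumes that the "--e-value-res" entry of the conflict list
              refers to the --e-value-search option.

```python
# Options reported as conflicting with --previous-run, in synopsis spelling.
# "--e-value-search" is an assumption (the conflict list says "--e-value-res").
CONFLICTS_WITH_PREVIOUS_RUN = {
    "--config", "--sequence-db", "--profile-suffix",
    "--res-extract-suffix", "--e-value-search", "--db-type", "--hmmer",
}


def option_names(args):
    """Extract option names, accepting both "--opt value" and "--opt=value"."""
    return [a.split("=", 1)[0] for a in args if a.startswith("-")]


def previous_run_conflicts(args):
    """Return the sorted conflicting options when --previous-run is used."""
    names = option_names(args)
    if "--previous-run" not in names:
        return []
    return sorted(set(names) & CONFLICTS_WITH_PREVIOUS_RUN)
```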

       For more details, visit the MacSyFinder website and see the MacSyFinder documentation.

SEE ALSO

       The full documentation for macsyfinder is maintained as a Texinfo manual.  If the info and
       macsyfinder programs are properly installed at your site, the command

              info macsyfinder

       should give you access to the complete manual.