Provided by: obitools_1.2.12+dfsg-2_amd64

NAME

       obitaxonomy - description of obitaxonomy

       The obitaxonomy command can generate an ecoPCR database from an NCBI taxdump (see NCBI ftp
       site) and allows managing the taxonomic data contained in both types of database.

       Several types of editing are possible:

       Adding a taxon to the database
          The new taxon is described by three values: its scientific name,  its  taxonomic  rank,
          and the taxid of its first ancestor.  Done by using the -a option.

       Deleting a taxon from the database
          Erases a local taxon. Done by using the -D option and specifying a taxid.

       Adding a species to the database
          The  genus of the species must already exist in the database. The species will be added
          under its genus. Done by using the -s option and specifying a species scientific name.

       Adding a preferred scientific name for a taxon in the database
          Adds a preferred name for a taxon in the taxonomy, by specifying the new favorite  name
          and  the  taxid of the taxon whose preferred name should be changed.  Done by using the
          -f option.

       Adding all the taxa from a sequence file in the OBITools extended fasta format to the
       database
          All the taxa from a file in the OBITools extended fasta format, and possibly their
          ancestors, are added to the taxonomy database.

          The header of each sequence record must contain the attribute defined by the -k  option
          (default  key:  species_name),  whose  value  is the scientific name of the taxon to be
          added.

          A taxonomic path for each sequence record can be specified with the -p option,  as  the
          attribute key that contains the taxonomic path of the taxon to be added.

          A restricting ancestor can be specified with the -A option, either as a taxid (integer)
          or a key (string). If it is a taxid, this taxid is the default taxid  under  which  the
          new  taxon  is added if none of its ancestors are specified or can be found. If it is a
          key, obitaxonomy looks for the ancestor taxid in the corresponding attribute,  and  the
          new  taxon  is  systematically  added  under this ancestor. By default, the restricting
          ancestor is the root of the taxonomic tree for all the new taxa.

          If neither a path nor an ancestor is specified in the header of  the  sequence  record,
          obitaxonomy tries to read the taxon name as a species name and to find the genus in the
          taxonomic database. If the genus is found, the new taxon is added under it.  If not, it
          is added under the restricting ancestor.

          It is highly recommended to check exactly what was done by reading the output, since
          obitaxonomy uses ad hoc parsing and decision rules.

          Done by using the -F option.
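
          The fallback described above (genus lookup first, restricting ancestor otherwise) can
          be sketched in Python. This is only an illustration, not obitaxonomy's actual code:
          the function choose_parent, the dictionary used as a taxonomy, and the use of taxid 1
          (the root of the NCBI taxonomy) as the default are assumptions.

```python
ROOT_TAXID = 1  # assumption: taxid of the root of the tree (1 in the NCBI taxonomy)


def choose_parent(species_name, name_to_taxid, restricting_ancestor=ROOT_TAXID):
    """Return the taxid under which a new taxon would be placed.

    Hypothetical helper covering only the case where the record header
    carries neither a taxonomic path nor an ancestor taxid.
    """
    # Read the name as a species name: its first word is taken as the genus.
    genus = species_name.split()[0]
    # Use the genus if it is known, otherwise fall back to the restricting ancestor.
    return name_to_taxid.get(genus, restricting_ancestor)


known = {"Gentiana": 49934}  # toy name -> taxid table, taxid taken from the examples
print(choose_parent("Gentiana alpina", known))         # genus found
print(choose_parent("Unknownus novus", known, 33090))  # genus unknown, falls back
```

          The real tool applies further ad hoc rules, so its printed output remains the
          reference for what was actually done.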

       Notes:

       · When a taxon is added, a new taxid is assigned to it. The minimum for the new taxids can
         be specified by the -m option and is equal to 10000000 by default.

       · For each modification, a line is printed with details on what was done.

OBITAXONOMY SPECIFIC OPTIONS

       -a <TAXON_INFOS>, --add-taxon=<TAXON_INFOS>
                 Adds  a  new  taxon  to the taxonomy. The new taxon is described by three values
                 separated by colons: its scientific name, its taxonomic rank, and the  taxid  of
                 its first ancestor.

              Example:

                     > obitaxonomy -d my_ecopcr_database \
                       -a 'Gentiana alpina':'species':49934

                 Adds a taxon with the scientific name Gentiana alpina and the rank species under
                 the taxon whose taxid is 49934.
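
                 In the example, the shell removes the quotes, so obitaxonomy receives the single
                 argument Gentiana alpina:species:49934. A minimal Python sketch of how such a
                 value could be split into its three fields follows; parse_taxon_infos is a
                 hypothetical helper, not a function of the OBITools.

```python
def parse_taxon_infos(value):
    """Split 'name:rank:parent_taxid' into its three components."""
    # Split from the right so that only the last two colons act as separators.
    name, rank, parent = value.rsplit(":", 2)
    return name, rank, int(parent)


print(parse_taxon_infos("Gentiana alpina:species:49934"))
# ('Gentiana alpina', 'species', 49934)
```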

       -m <####>, --min-taxid=<####>
                 Minimum value for the newly assigned taxid(s).

              Example:

                     > obitaxonomy -d my_ecopcr_database -m 1000000000 \
                       -a 'Gentiana alpina':'species':49934

                 Adds a taxon with the scientific name Gentiana alpina and the rank species under
                 the  taxon  whose  taxid  is  49934,  with  a  taxid  greater  than  or equal to
                 1000000000.
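
                 The constraint can be illustrated with a small Python sketch. How obitaxonomy
                 actually picks a free taxid is not described here; the rule below (one more than
                 the highest taxid already at or above the minimum) and the helper name
                 next_local_taxid are assumptions made for illustration.

```python
DEFAULT_MIN_TAXID = 10000000  # default value documented for the -m option


def next_local_taxid(existing_taxids, min_taxid=DEFAULT_MIN_TAXID):
    """Return a free taxid that is not lower than min_taxid (hypothetical rule)."""
    # Only taxids at or above the minimum compete for the local range.
    used = [t for t in existing_taxids if t >= min_taxid]
    return max(used) + 1 if used else min_taxid


print(next_local_taxid([1, 49934]))            # no local taxon yet
print(next_local_taxid([10000000, 10000832]))  # continues after the highest local taxid
print(next_local_taxid([1, 49934], 1000000000))
```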

       -D <TAXID>, --delete-local-taxon=<TAXID>
                 Deletes the local taxon with the taxid <TAXID> from the taxonomic database.

              Example:

                     > obitaxonomy -d my_ecopcr_database -D 10000832

                 Deletes the local taxon with the taxid 10000832 from the taxonomic database.

       -s <SPECIES_NAME>, --add-species=<SPECIES_NAME>
                 Adds a new species to  the  taxonomy.  The  new  species  is  described  by  its
                 scientific  name.  The  genus of the species must already exist in the database.
                 The species will be added under its genus.

              Example:

                     > obitaxonomy -d my_ecopcr_database -s 'Gentiana alpina'

                 Adds the species with the  scientific  name  Gentiana  alpina  under  the  genus
                 Gentiana.

       -f <TAXON_NAME>:<TAXID>, --add-favorite-name=<TAXON_NAME>:<TAXID>
                 Adds  a new favorite scientific name to the taxonomy.  The new name is described
                 by two values separated by a colon: the new favorite name and the taxid  of  the
                 taxon.

              Example:

                     > obitaxonomy -d my_ecopcr_database \
                       -f 'Gentiana algida':50748

                 Adds  the  favorite  scientific  name Gentiana algida for the taxid 50748 in the
                 taxonomic database.

       -F <FILE_NAME>, --file-name=<FILE_NAME>
                 Adds all the taxa from a sequence file in OBITools extended fasta format, and
                 possibly their ancestors, to the database (see documentation). Each sequence
                 record must contain the attribute specified by the -k option.

              Example:

                     > obitaxonomy -d my_ecopcr_database \
                       -k my_taxon_name_key -F my_sequences.fasta

                 Adds the taxon of each sequence record from the file my_sequences.fasta  in  the
                 taxonomic   database,   based   on   the   scientific   name  contained  in  the
                 my_taxon_name_key attribute.
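
                 A Python sketch of a record that would carry such an attribute is given below.
                 The header layout (key=value pairs, each terminated by a semicolon, after the
                 sequence identifier) is an assumption about the extended fasta format and should
                 be checked against the fasta format documentation; format_record and the
                 identifier seq0001 are hypothetical.

```python
def format_record(seq_id, sequence, **attributes):
    """Format one record with 'key=value;' attributes in the header line."""
    tags = " ".join(f"{key}={value};" for key, value in attributes.items())
    return f">{seq_id} {tags}\n{sequence}\n"


record = format_record("seq0001", "acgtacgtacgt",
                       my_taxon_name_key="Gentiana alpina")
print(record, end="")
```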

       -k <KEY_NAME>, --key-name=<KEY_NAME>
              Works with the -F option. Defines the  key  of  the  attribute  that  contains  the
              scientific name of the taxon to be added. See example above.

       -A <ANCESTOR>, --restricting_ancestor=<ANCESTOR>
                 Works with the -F option. Can be a taxid (integer) or a key (string). If it is a
                 taxid, this taxid is the default taxid under which the new  taxon  is  added  if
                 none of its ancestors are specified or can be found. If it is a key,
                 obitaxonomy looks for the ancestor taxid in the corresponding attribute, and the
                 new  taxon  is  systematically  added  under  this  ancestor.   By  default, the
                 restricting ancestor is the root of the taxonomic tree for all the new taxa.

              Example:

                     > obitaxonomy -d my_ecopcr_database -A 33090 \
                       -k my_taxon_name_key -F my_sequences.fasta

                 Adds the taxon of each sequence record from the file my_sequences.fasta  in  the
                 taxonomic   database,   based   on   the   scientific   name  contained  in  the
                 my_taxon_name_key attribute. If the genus of the new taxon cannot be found,  the
                 new taxon is added under the taxon whose taxid is 33090.

       -p <PATH>, --path=<PATH>
                 Works with the -F option. Key of the attribute containing the taxonomic paths of
                 the taxa if they are in the headers of the sequence records. The value contained
                 in  this  attribute  must  be of the form ‘Fungi, Agaricomycetes, Thelephorales,
                 Thelephoraceae’ with the highest ancestors first and commas between ancestors.

              Example:

                     > obitaxonomy -d my_ecopcr_database -p my_taxonomic_path_key \
                       -k my_taxon_name_key -F my_sequences.fasta

                 Adds the taxon of each sequence record from the file my_sequences.fasta  in  the
                 taxonomic   database,   based   on   the   scientific   name  contained  in  the
                 my_taxon_name_key    attribute.     Each    ancestor    contained     in     the
                 my_taxonomic_path_key  attribute  is added if it does not already exist, and the
                 new taxon is added under the last ancestor of the path.
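
                 The path handling can be sketched in Python as follows. The helpers parse_path
                 and missing_ancestors and the set of known names are hypothetical and only
                 illustrate the documented format (highest ancestor first, commas between
                 ancestors); they are not part of the OBITools.

```python
def parse_path(value):
    """Split a comma-separated taxonomic path, highest ancestor first."""
    return [name.strip() for name in value.split(",") if name.strip()]


def missing_ancestors(path, known_names):
    """List the path members absent from the taxonomy, in path order."""
    return [name for name in path if name not in known_names]


path = parse_path("Fungi, Agaricomycetes, Thelephorales, Thelephoraceae")
print(path[-1])  # final element of the path: direct parent of the new taxon
print(missing_ancestors(path, {"Fungi", "Agaricomycetes"}))
```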

TAXONOMY RELATED OPTIONS

       -d <FILENAME>, --database=<FILENAME>
              ecoPCR taxonomy Database name

       -t <FILENAME>, --taxonomy-dump=<FILENAME>
              NCBI Taxonomy dump repository name

COMMON OPTIONS

       -h, --help
              Shows this help message and exits.

       --DEBUG
              Sets logging in debug mode.

AUTHOR

       The OBITools Development Team - LECA

COPYRIGHT

       2019 - 2015, OBITool Development Team

 1.02 12                                   Jan 28, 2019                            OBITAXONOMY(1)