Ubuntu Manpage: diamond - accelerated BLAST compatible local sequence aligner

Provided by: diamond-aligner_2.1.8-1_amd64

NAME

       diamond - accelerated BLAST compatible local sequence aligner

DESCRIPTION

       diamond  v2.0.13.151  (C) Max Planck Society for the Advancement of Science Documentation,
       support   and   updates   available   at   http://www.diamondsearch.org    Please    cite:
       http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

       Syntax: diamond COMMAND [OPTIONS]

       Commands:  makedb  Build DIAMOND database from a FASTA file blastp  Align amino acid query
       sequences against a protein reference database blastx  Align DNA query sequences against a
       protein  reference  database  view     View DIAMOND alignment archive (DAA) formatted file
       help     Produce  help  message  version  Display  version  information  getseq   Retrieve
       sequences  from a DIAMOND database file dbinfo  Print information about a DIAMOND database
       file test    Run regression tests makeidx Make database index

       General  options:   --threads   (-p)             number   of   CPU   threads   --db   (-d)
       database file --out (-o)               output file --outfmt (-f)            output format

       0      = BLAST pairwise

       5      = BLAST XML

       6      = BLAST tabular

              100 = DIAMOND alignment archive (DAA) 101 = SAM

              Value 6 may be followed by a space-separated list of these keywords:

              qseqid  means  Query Seq - id qlen means Query sequence length sseqid means Subject
              Seq - id sallseqid means All subject Seq - id(s), separated by  a  ';'  slen  means
              Subject  sequence length qstart means Start of alignment in query qend means End of
              alignment in query sstart means Start of alignment in subject  send  means  End  of
              alignment  in  subject  qseq  means  Aligned part of query sequence qseq_translated
              means Aligned part of query sequence (translated) full_qseq  means  Query  sequence
              full_qseq_mate  means Query sequence of the mate sseq means Aligned part of subject
              sequence full_sseq means Subject sequence evalue means Expect value bitscore  means
              Bit  score  score  means  Raw  score  length  means  Alignment  length pident means
              Percentage of identical matches nident means Number of identical  matches  mismatch
              means  Number  of  mismatches  positive  means Number of positive - scoring matches
              gapopen means Number of gap openings gaps means Total number  of  gaps  ppos  means
              Percentage  of positive - scoring matches qframe means Query frame btop means Blast
              traceback operations(BTOP) cigar means CIGAR string staxids  means  unique  Subject
              Taxonomy  ID(s),  separated  by  a  ';' (in numerical order) sscinames means unique
              Subject Scientific Name(s), separated by a  ';'  sskingdoms  means  unique  Subject
              Super  Kingdom(s),  separated  by  a ';' skingdoms means unique Subject Kingdom(s),
              separated by a ';' sphylums means unique Subject  Phylum(s),  separated  by  a  ';'
              stitle  means  Subject  Title salltitles means All Subject Title(s), separated by a
              '<>' qcovhsp means Query Coverage Per HSP scovhsp means Subject  Coverage  Per  HSP
              qtitle  means  Query title qqual means Query quality values for the aligned part of
              the query full_qqual means Query quality values qstrand means Query strand

              Default: qseqid sseqid pident length  mismatch  gapopen  qstart  qend  sstart  send
              evalue bitscore

       --verbose  (-v)           verbose console output --log                    enable debug log
       --quiet                  disable  console  output  --header                  Write  header
       lines to blast tabular format.

       Makedb  options:  --in                     input reference file in FASTA format --taxonmap
       protein accession to taxid mapping file --taxonnodes             taxonomy  nodes.dmp  from
       NCBI --taxonnames             taxonomy names.dmp from NCBI

       Aligner  options: --query (-q)             input query file --strand                 query
       strands to search (both/minus/plus) --un                     file  for  unaligned  queries
       --al                      file  or  aligned  queries  --unfmt                   format  of
       unaligned query file (fasta/fastq) --alfmt                  format of aligned  query  file
       (fasta/fastq)    --unal                     report   unaligned   queries   (0=no,   1=yes)
       --max-target-seqs (-k)   maximum number of  target  sequences  to  report  alignments  for
       (default=25)  --top                     report  alignments within this percentage range of
       top alignment score (overrides --max-target-seqs) --max-hsps               maximum  number
       of  HSPs  per  target  sequence  to  report  for  each  query  (default=1) --range-culling
       restrict hit culling to overlapping query ranges --compress               compression  for
       output  files  (0=none,  1=gzip,  zstd) --evalue (-e)            maximum e-value to report
       alignments (default=0.001) --min-score              minimum bit score to report alignments
       (overrides  e-value  setting)  --id                      minimum  identity%  to  report an
       alignment  --query-cover             minimum  query  cover%   to   report   an   alignment
       --subject-cover            minimum   subject   cover%   to   report  an  alignment  --fast
       enable  fast  mode  --mid-sensitive            enable   mid-sensitive   mode   --sensitive
       enable    sensitive    mode)   --more-sensitive           enable   more   sensitive   mode
       --very-sensitive         enable very sensitive mode --ultra-sensitive        enable  ultra
       sensitive  mode  --iterate                 iterated  search  with  increasing  sensitivity
       --global-ranking  (-g)     number  of  targets  for  global  ranking   --block-size   (-b)
       sequence  block  size in billions of letters (default=2.0) --index-chunks (-c)      number
       of  chunks  for  index  processing  (default=4)  --tmpdir  (-t)             directory  for
       temporary   files   --parallel-tmpdir          directory   for  temporary  files  used  by
       multiprocessing --gapopen                gap  open  penalty  --gapextend               gap
       extension   penalty   --frameshift  (-F)         frame  shift  penalty  (default=disabled)
       --long-reads               short   for   --range-culling   --top   10   -F   15   --matrix
       score  matrix  for  protein  alignment  (default=BLOSUM62)  --custom-matrix           file
       containing custom scoring matrix  --comp-based-stats        composition  based  statistics
       mode   (0-4)   --masking                 masking  algorithm  (none,  seg,  tantan=default)
       --query-gencode          genetic  code  to  use  to  translate  query  (see  user  manual)
       --salltitles               include   full   subject   titles   in   DAA  file  --sallseqid
       include all subject  ids  in  DAA  file  --no-self-hits            suppress  reporting  of
       identical  self  hits  --taxonlist               restrict  search  to  list  of  taxon ids
       (comma-separated) --taxon-exclude          exclude list  of  taxon  ids  (comma-separated)
       --seqidlist               filter  the database by list of accessions --skip-missing-seqids
       ignore accessions missing in the database

       Advanced      options:      --algo                        Seed      search       algorithm
       (0=double-indexed/1=query-indexed/ctg=contiguous-seed)  --bin                    number of
       query bins for seed search --min-orf (-l)           ignore translated sequences without an
       open  reading  frame  of  at  least  this  length --seed-cut               cutoff for seed
       complexity   --freq-masking             mask   seeds   based   on   frequency    --freq-sd
       number    of   standard   deviations   for   ignoring   frequent   seeds   --motif-masking
       softmask abundant motifs (0/1) --id2                    minimum number of  identities  for
       stage  1  hit --xdrop (-x)             xdrop for ungapped alignment --gapped-filter-evalue
       E-value threshold for gapped  filter  (auto)  --band                    band  for  dynamic
       programming  computation  --shapes  (-s)             number  of  seed  shapes (default=all
       available)   --shape-mask               seed   shapes   --multiprocessing           enable
       distributed-memory parallel processing --mp-init                initialize multiprocessing
       run  --mp-recover              enable  continuation  of  interrupted  multiprocessing  run
       --mp-query-chunk          process  only a single query chunk as specified --ext-chunk-size
       chunk size for adaptive ranking (default=auto)  --no-ranking              disable  ranking
       heuristic    --ext                       Extension   mode   (banded-fast/banded-slow/full)
       --culling-overlap        minimum range overlap with higher scoring hit  to  delete  a  hit
       (default=50%)  --taxon-k                 maximum  number  of targets to report per species
       --range-cover            percentage of  query  range  to  be  covered  for  range  culling
       (default=50%)    --dbsize                    effective    database   size   (in   letters)
       --no-auto-append          disable  auto  appending  of  DAA  and  DMND   file   extensions
       --xml-blord-format        Use gnl|BL_ORD_ID| style format in XML output --stop-match-score
       Set the match score of stop codons against each other.   --tantan-minMaskProb      minimum
       repeat  probability for masking (default=0.9) --file-buffer-size       file buffer size in
       bytes (default=67108864) --memory-limit (-M)      Memory limit for extension stage  in  GB
       --no-unlink               Do  not unlink temporary files.  --target-indexed         Enable
       target-indexed mode --ignore-warnings        Ignore warnings

       View options: --daa (-a)               DIAMOND alignment archive (DAA) file  --forwardonly
       only show alignments of forward strand

       Getseq  options:  --seq                     Space-separated  list  of  sequence numbers to
       display.

       Online documentation at http://www.diamondsearch.org

AUTHOR

        This manpage was written by Nilesh Patra for the Debian distribution and
        can be used for any other usage of the program.