Ubuntu Manpage: lambda - the Local Aligner for Massive Biological DatA

NAME

       lambda - the Local Aligner for Massive Biological DatA

SYNOPSIS

       lambda [OPTIONS] -q QUERY.fasta -d DATABASE.fasta [-o output.m8]

DESCRIPTION

       Lambda is a local aligner optimized for many query sequences and searches in protein space. It is
       compatible with BLAST, but much faster than BLAST and many other comparable tools.

       Detailed information is available in the wiki: <https://github.com/seqan/lambda/wiki>

OPTIONS

       -h, --help
              Display the help message.

       -hh, --full-help
              Display the help message with advanced options.

       --version-check BOOL
              Turn this option off to disable version update notifications of the application.  One  of  1,  ON,
              TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: 1.

       --version
              Display version information.

       --copyright
              Display long copyright information.

       -v, --verbosity INTEGER
              Display  more/less diagnostic output during operation: 0 [only errors]; 1 [default]; 2 [+run-time,
              options and statistics]. In range [0..2]. Default: 1.

   Input Options:
       -q, --query INPUT_FILE
              Query sequences. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*],  .fq[.*],  .fna[.*],
              .ffn[.*],  .fastq[.*],  .fasta[.*],  .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the
              following extensions: gz, bz2, and bgzf for transparent (de)compression.

       -d, --database INPUT_FILE
              Path to original database sequences (a precomputed index with .sa or .fm needs to  exist!).  Valid
              filetypes  are:  .sam[.*],  .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*],
              .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of  the  following  extensions:
              gz, bz2, and bgzf for transparent (de)compression.

       -di, --db-index-type STRING
              Format of the database index. One of sa and fm. Default: fm.

   Output Options:
       -o, --output OUTPUT_FILE
              File to hold reports on hits (.m* are blastall -m* formats; .m8 is tab-separated, .m9 is
              tab-separated with comments, .m0 is pairwise format). Valid filetypes are: .sam[.*], .m9[.*],
              .m8[.*], .m0[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for
              transparent (de)compression. Default: output.m8.

       -oc, --output-columns STRING
              Print specified column combination and/or order (.m8 and .m9 outputs only); call -oc help for more
              details. Default: std.

       -id, --percent-identity INTEGER
              Output  only  matches  above  this  threshold  (checked  before e-value check). In range [0..100].
              Default: 0.

       -e, --e-value DOUBLE
              Output only matches that score below this threshold. In range [0..inf]. Default: 0.1.

       -nm, --num-matches INTEGER
              Print at most this number of matches per query. In range [1..inf]. Default: 500.

       --sam-with-refheader STRING
              BAM files require all subject names to be written to the header. For SAM this is not required, so
              Lambda does not do it automatically, in order to save space (for protein databases in particular
              the header can be very large). If you still want them with SAM, e.g. for better BAM compatibility,
              use this option. One of on and off. Default: off.

       --sam-bam-seq STRING
              Write matching DNA subsequence into SAM/BAM file (BLASTN). For BLASTX and TBLASTX the matching
              protein sequence is "untranslated" and positions retransformed to the original sequence. For
              BLASTP and TBLASTN there is no DNA sequence so a "*" is written to the SEQ column. The matching
              protein sequence can be written as an optional tag, see --sam-bam-tags. If set to uniq, then the
              sequence is omitted if and only if it is identical to the previous match's subsequence. One of
              always, uniq, and never. Default: uniq.

       --sam-bam-tags STRING
              Write the specified optional columns to the  SAM/BAM  file.  Call  --sam-bam-tags  help  for  more
              details. Default: AS NM ZE ZI ZF.

       --sam-bam-clip STRING
              Whether  to  hard-clip  or soft-clip the regions beyond the local match. Soft-clipping retains the
              full sequence in the output file, but obviously uses more space. One of hard  and  soft.  Default:
              hard.
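
       The -o, -id and -e options above can be illustrated with a small Python sketch that reads one line of
       .m8 output and applies the same two thresholds in the documented order (identity first, then e-value).
       It assumes that the default column set (std) corresponds to the twelve standard BLAST tabular columns;
       the column names, the helper functions and the sample line are hypothetical, and whether Lambda treats
       the thresholds as inclusive is also an assumption.

```python
# Column names assumed to match the classic BLAST tabular layout (assumption).
STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_COLUMNS = {"pident", "evalue", "bitscore"}


def parse_m8_line(line):
    """Split one tab-separated .m8 line into a dict with typed values."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(STD_COLUMNS):
        raise ValueError(f"expected {len(STD_COLUMNS)} columns, got {len(fields)}")
    record = dict(zip(STD_COLUMNS, fields))
    for name in INT_COLUMNS:
        record[name] = int(record[name])
    for name in FLOAT_COLUMNS:
        record[name] = float(record[name])
    return record


def passes_filters(record, min_identity=0, max_evalue=0.1):
    """Check percent identity first, then e-value (order taken from -id)."""
    if record["pident"] < min_identity:
        return False
    return record["evalue"] <= max_evalue


# Hypothetical example line, not real Lambda output.
sample = "read1\tprotA\t87.5\t40\t5\t0\t1\t120\t10\t49\t1e-20\t85.1"
hit = parse_m8_line(sample)
print(hit["sseqid"], hit["pident"], passes_filters(hit, min_identity=80))
```

       Files written with -oc and a custom column order would need a matching STD_COLUMNS list.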

   General Options:
       -t, --threads INTEGER
              Number of threads to run concurrently.

       -qi, --query-index-type STRING
              Controls double-indexing. One of radix and none. Default: none.

   Alphabets and Translation:
       -p, --program STRING
              Blast Operation Mode. One of blastn, blastp, blastx, tblastn, and tblastx. Default: blastx.

       -g, --genetic-code INTEGER
              The translation table to use for nucl -> amino acid translation (not for BlastN, BlastP). See
              https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c for ids (default is generic). Six
              frames are generated. Default: 1.

       -ar, --alphabet-reduction STRING
              Alphabet  Reduction  for  seeding  phase  (ignored for BLASTN). One of none and murphy10. Default:
              murphy10.
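
       The "six frames" mentioned under -g can be pictured with the following Python sketch. It translates a
       nucleotide sequence in the three forward offsets and the three offsets of the reverse complement, using
       NCBI translation table 1 (the standard code, matching the default). The function names are hypothetical
       and the sketch only illustrates the concept; it is not Lambda's implementation, and the handling of
       ambiguous bases (mapped to X here) is an assumption.

```python
from itertools import product

# NCBI translation table 1, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Frames 0..2 on the forward strand, then 0..2 on the reverse strand."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


print(six_frames("ATGGCCTAA"))
```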

   Seeding / Filtration:
       -sl, --seed-length INTEGER
              Length of the seeds (default = 14 for BLASTN). Default: 10.

       -so, --seed-offset INTEGER
              Offset for seeding (if unset = seed-length, non-overlapping; default = 5 for BLASTN). Default: 10.

       -sd, --seed-delta INTEGER
              maximum seed distance. Default: 1.
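
       How -sl and -so interact can be seen by listing the positions at which seeds would start. The Python
       sketch below assumes that seeds are taken at every multiple of the offset as long as a full seed still
       fits into the query; the function name is hypothetical and Lambda's actual enumeration may differ in
       detail.

```python
def seed_starts(query_length, seed_length=10, seed_offset=None):
    """Start positions of seeds; an unset offset means non-overlapping seeds."""
    if seed_offset is None:
        seed_offset = seed_length
    if seed_length < 1 or seed_offset < 1:
        raise ValueError("seed length and offset must be positive")
    return list(range(0, query_length - seed_length + 1, seed_offset))


# Default protein settings: seeds do not overlap.
print(seed_starts(30))                   # [0, 10, 20]
# "sensitive" profile (-so 5): neighbouring seeds overlap by half.
print(seed_starts(30, seed_offset=5))    # [0, 5, 10, 15, 20]
```

       Halving the offset roughly doubles the number of seeds per query, which is why a smaller -so trades
       speed for sensitivity.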

   Miscellaneous Heuristics:
       -ps, --pre-scoring INTEGER
              Evaluate score of a region NUM times the size of the seed before extension (0 -> no pre-scoring,
              1 -> evaluate seed, n -> area around seed as well; default = 1 if no reduction is used). In range
              [1..inf]. Default: 2.

       -pt, --pre-scoring-threshold DOUBLE
              minimum average score per position in pre-scoring region. Default: 2.

       -pd, --filter-putative-duplicates STRING
              filter hits that will likely duplicate a match already found. One of on and off. Default: on.

       -pa, --filter-putative-abundant STRING
              If the maximum number of matches per query has already been found, stop searching if the
              remaining realm looks infeasible. One of on and off. Default: on.

   Scoring:
       -sc, --scoring-scheme INTEGER
              Use '45' for Blosum45; '62' for Blosum62 (default); '80' for Blosum80 [ignored for BlastN].
              Default: 62.

       -ge, --score-gap INTEGER
              Score per gap character (default = -2 for BLASTN). Default: -1.

       -go, --score-gap-open INTEGER
              Additional cost for opening gap (default = -5 for BLASTN). Default: -11.

       -ma, --score-match INTEGER
              Match score [only BLASTN]. Default: 2.

       -mi, --score-mismatch INTEGER
              Mismatch score [only BLASTN]. Default: -3.
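
       The two gap parameters combine into an affine gap score. The Python sketch below assumes that the
       opening cost (-go) is charged once per gap in addition to the per-character score (-ge) for every gap
       character, including the first, which is the usual reading of "additional cost"; the function name is
       hypothetical.

```python
def gap_score(length, score_gap=-1, score_gap_open=-11):
    """Score of one gap of the given length under affine gap costs."""
    if length < 1:
        raise ValueError("a gap has at least one character")
    return score_gap_open + length * score_gap


# Protein defaults (-ge -1, -go -11)
print(gap_score(1))            # -12
print(gap_score(3))            # -14
# BLASTN defaults (-ge -2, -go -5)
print(gap_score(3, -2, -5))    # -11
```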

   Extension:
       -x, --x-drop INTEGER
              Stop banded extension if the score drops x below the maximum seen (-1 means no xdrop). In range
              [-1..inf]. Default: 30.

       -b, --band INTEGER
              Size of the DP-band used in extension (-3 means log2 of query length; -2 means sqrt of query
              length; -1 means full DP; n means band of size 2n+1). In range [-3..inf]. Default: -3.
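
       The special values of -b can be summarised in a few lines of Python. The sketch assumes that for -3 and
       -2 the logarithm or square root of the query length plays the role of n and is rounded down before the
       band of size 2n+1 is formed; the description above does not state this explicitly, so treat both the
       rounding and the helper name as assumptions.

```python
import math


def band_size(band, query_length):
    """Number of diagonals in the DP band, or None for full DP."""
    if band < -3:
        raise ValueError("band must be in the range [-3..inf]")
    if band == -1:
        return None                                  # full DP, no band
    if band == -3:
        n = int(math.log2(query_length))             # assumed: rounded down
    elif band == -2:
        n = math.isqrt(query_length)                 # assumed: rounded down
    else:
        n = band
    return 2 * n + 1


print(band_size(-3, 1024))   # 21
print(band_size(-2, 100))    # 21
print(band_size(5, 100))     # 11
```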

TUNING

       Tuning the seeding parameters and (de)activating alphabet reduction has a strong influence on both  speed
       and sensitivity. We recommend the following alternative profiles for protein searches:

       fast (high similarity):       -ar none -sl 7 -sd 0

       sensitive (lower similarity): -so 5

       For further information see the wiki: <https://github.com/seqan/lambda/wiki>
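
       As an illustration, the profiles above can be turned into complete command lines with a small Python
       helper. The option strings are taken from this page; the helper name, the profile names and the file
       names are hypothetical, and a precomputed index for the database is assumed to exist already.

```python
import shlex

# Extra options per profile, copied from the TUNING section.
PROFILES = {
    "default": [],
    "fast": ["-ar", "none", "-sl", "7", "-sd", "0"],
    "sensitive": ["-so", "5"],
}


def lambda_command(query, database, output="output.m8", profile="default"):
    """Build the argument list for one lambda run (nothing is executed)."""
    return ["lambda", *PROFILES[profile], "-q", query, "-d", database, "-o", output]


# reads.fasta and db.fasta are placeholder file names.
print(shlex.join(lambda_command("reads.fasta", "db.fasta", profile="fast")))
print(shlex.join(lambda_command("reads.fasta", "db.fasta", profile="sensitive")))
```

       The returned list can be passed to subprocess.run() unchanged; building it as a list avoids shell
       quoting problems with unusual file names.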

LEGAL

       lambda Copyright: 2013-2017 Hannes Hauswedell, released under the GNU GPL v3 (or later); 2016-2017 Knut
       Reinert and Freie Universität Berlin, released under the 3-clause BSDL.
       SeqAn Copyright: 2006-2015 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
       In your academic works please cite: Hauswedell et al (2014); doi: 10.1093/bioinformatics/btu439
       For full copyright and/or warranty information see --copyright.