Ubuntu Manpage: lambda - the Local Aligner for Massive Biological DatA

NAME

       lambda - the Local Aligner for Massive Biological DatA

SYNOPSIS

       lambda [OPTIONS] -q QUERY.fasta -d DATABASE.fasta [-o output.m8]

DESCRIPTION

       Lambda is a local aligner optimized for many query sequences and searches in protein
       space. It is compatible with BLAST, but much faster than BLAST and many other comparable
       tools.

       Detailed information is available in the wiki: <https://github.com/seqan/lambda/wiki>

OPTIONS

       -h, --help
              Display the help message.

       -hh, --full-help
              Display the help message with advanced options.

       --version-check BOOL
              Turn  this  option  off to disable version update notifications of the application.
              One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: 1.

       --version
              Display version information.

       --copyright
              Display long copyright information.

       -v, --verbosity INTEGER
              Display more/less diagnostic output during operation: 0 [only errors]; 1 [default];
              2 [+run-time, options and statistics]. In range [0..2]. Default: 1.

   Input Options:
       -q, --query INPUT_FILE
              Query  sequences.  Valid  filetypes  are:  .sam[.*],  .raw[.*], .gbk[.*], .frn[.*],
              .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*],  .embl[.*],
              and  .bam,  where  *  is  any  of  the  following extensions: gz, bz2, and bgzf for
              transparent (de)compression.

       -d, --database INPUT_FILE
              Path to original database sequences (a precomputed index with .sa or .fm  needs  to
              exist!).  Valid  filetypes  are:  .sam[.*],  .raw[.*], .gbk[.*], .frn[.*], .fq[.*],
              .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam,
              where  *  is  any  of  the  following extensions: gz, bz2, and bgzf for transparent
              (de)compression.
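
       The file type rule shared by -q and -d can be sketched as a small check. This is only
       an illustration of the list above, not Lambda's own validation code: the function name
       is hypothetical, and matching extensions case-sensitively is an assumption.

```python
# Base extensions that may carry an optional compression suffix.
BASE_EXTENSIONS = {
    "sam", "raw", "gbk", "frn", "fq", "fna", "ffn",
    "fastq", "fasta", "faa", "fa", "embl",
}
# Suffixes listed for transparent (de)compression.
COMPRESSION_EXTENSIONS = {"gz", "bz2", "bgzf"}


def is_valid_input_name(name):
    """Return True if `name` ends in one of the listed input file types.

    Hypothetical helper; case-sensitive matching is an assumption.
    """
    parts = name.split(".")
    if len(parts) < 2:
        return False  # no extension at all
    last = parts[-1]
    # Plain, uncompressed file; .bam is listed without a compression suffix.
    if last in BASE_EXTENSIONS or last == "bam":
        return True
    # Compressed file: the extension before the suffix must be a base type.
    if last in COMPRESSION_EXTENSIONS and len(parts) >= 3:
        return parts[-2] in BASE_EXTENSIONS
    return False
```

       With this sketch, QUERY.fasta and DATABASE.faa.gz are accepted, while notes.txt and
       hits.bam.gz are rejected.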

       -di, --db-index-type STRING
              Format of the database index. One of sa and fm. Default: fm.

   Output Options:
       -o, --output OUTPUT_FILE
              File to hold reports on hits (.m* are blastall -m* formats; .m8 is tab-separated,
              .m9 is tab-separated with comments, .m0 is pairwise format). Valid filetypes
              are: .sam[.*], .m9[.*], .m8[.*], .m0[.*], and .bam, where * is any of the following
              extensions: gz, bz2, and bgzf for transparent (de)compression. Default: output.m8.

       -oc, --output-columns STRING
              Print  specified  column  combination and/or order (.m8 and .m9 outputs only); call
              -oc help for more details. Default: std.
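
       A tab-separated report can be read back with a few lines of Python. The sketch below
       assumes that the std column set matches the classic twelve-column BLAST tabular layout;
       check -oc help to confirm this for your version. The function name and the sample hit
       are made up for illustration.

```python
import io

# Assumption: "std" corresponds to the classic 12-column BLAST tabular layout.
STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")


def read_m8(handle):
    """Yield one dict per hit; '#' comment lines (as in .m9) are skipped."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(STD_COLUMNS):
            raise ValueError(f"expected {len(STD_COLUMNS)} columns, got {len(fields)}")
        hit = dict(zip(STD_COLUMNS, fields))
        for key in INT_COLUMNS:
            hit[key] = int(hit[key])
        for key in FLOAT_COLUMNS:
            hit[key] = float(hit[key])
        yield hit


# Made-up sample data, not real Lambda output.
sample = io.StringIO(
    "# example comment line\n"
    "read1\tprotA\t87.5\t40\t5\t0\t1\t120\t10\t49\t1e-12\t75.1\n"
)
hits = list(read_m8(sample))
```

       For a real run, pass an open file handle instead of the in-memory sample, for example
       open("output.m8").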

       -id, --percent-identity INTEGER
              Output only matches above this threshold (checked before e-value check).  In  range
              [0..100]. Default: 0.

       -e, --e-value DOUBLE
              Output  only  matches  that score below this threshold. In range [0..inf]. Default:
              0.1.

       -nm, --num-matches INTEGER
              Print at most this number of matches per query. In range [1..inf]. Default: 500.

       --sam-with-refheader STRING
              BAM files require all subject names to be written to the header. For SAM this is
              not required, so Lambda does not do it automatically, to save space (for protein
              databases in particular, the header can be very large). If you still want them
              with SAM, e.g. for better BAM compatibility, use this option. One of on and off.
              Default: off.

       --sam-bam-seq STRING
              Write  matching  DNA subsequence into SAM/BAM file (BLASTN). For BLASTX and TBLASTX
              the matching protein sequence is "untranslated" and positions retransformed to  the
              original  sequence.  For  BLASTP  and  TBLASTN there is no DNA sequence so a "*" is
              written to the SEQ column. The matching protein  sequence  can  be  written  as  an
              optional tag, see --sam-bam-tags. If set to uniq, the sequence is omitted if and
              only if it is identical to the previous match's subsequence. One of always, uniq,
              and never. Default: uniq.

       --sam-bam-tags STRING
              Write  the specified optional columns to the SAM/BAM file. Call --sam-bam-tags help
              for more details. Default: AS NM ZE ZI ZF.

       --sam-bam-clip STRING
              Whether to hard-clip or soft-clip the regions beyond the local match. Soft-clipping
              retains the full sequence in the output file, but obviously uses more space. One of
              hard and soft. Default: hard.

   General Options:
       -t, --threads INTEGER
              Number of threads to run concurrently.

       -qi, --query-index-type STRING
              Controls double-indexing. One of radix and none. Default: none.

   Alphabets and Translation:
       -p, --program STRING
              Blast Operation Mode. One of blastn, blastp, blastx, tblastn, and tblastx. Default:
              blastx.

       -g, --genetic-code INTEGER
              The translation table to use for nucleotide -> amino acid translation (not for
              BlastN, BlastP). See https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c
              for ids (default is generic). Six frames are generated. Default: 1.

       -ar, --alphabet-reduction STRING
              Alphabet  Reduction  for  seeding  phase  (ignored  for  BLASTN).  One  of none and
              murphy10. Default: murphy10.

   Seeding / Filtration:
       -sl, --seed-length INTEGER
              Length of the seeds (default = 14 for BLASTN). Default: 10.

       -so, --seed-offset INTEGER
              Offset for seeding (if unset  =  seed-length,  non-overlapping;  default  =  5  for
              BLASTN). Default: 10.

       -sd, --seed-delta INTEGER
              Maximum seed distance. Default: 1.
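
       The interplay of seed length and seed offset can be pictured with a short sketch. It
       assumes seeds start at position 0 and are placed every seed-offset positions as long as
       a full seed still fits; how Lambda treats the end of a sequence may differ. The function
       name is hypothetical.

```python
def seed_starts(query_length, seed_length=10, seed_offset=None):
    """Return the start positions of seeds along a query.

    Simplified model: an unset offset equals the seed length, which gives
    non-overlapping seeds, as described for -so.
    """
    if seed_offset is None:
        seed_offset = seed_length
    if seed_length <= 0 or seed_offset <= 0:
        raise ValueError("seed length and offset must be positive")
    # The last valid start is query_length - seed_length.
    return list(range(0, query_length - seed_length + 1, seed_offset))


non_overlapping = seed_starts(30)              # offset = length = 10
overlapping = seed_starts(30, seed_offset=5)   # each seed overlaps its neighbour by 5
```

       Under this model, halving the offset roughly doubles the number of seeds per query,
       which is why a smaller -so value trades speed for sensitivity.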

   Miscellaneous Heuristics:
       -ps, --pre-scoring INTEGER
              Evaluate the score of a region NUM times the size of the seed before extension
              (0 -> no pre-scoring, 1 -> evaluate seed, n -> area around seed as well;
              default = 1 if no reduction is used). In range [1..inf]. Default: 2.

       -pt, --pre-scoring-threshold DOUBLE
              Minimum average score per position in pre-scoring region. Default: 2.

       -pd, --filter-putative-duplicates STRING
              Filter hits that will likely duplicate a match already found. One of on and off.
              Default: on.

       -pa, --filter-putative-abundant STRING
              If the maximum number of matches per query has already been found, stop searching
              if the remaining realm looks infeasible. One of on and off. Default: on.

   Scoring:
       -sc, --scoring-scheme INTEGER
              Use '45' for Blosum45; '62' for Blosum62 (default); '80' for Blosum80 [ignored for
              BlastN]. Default: 62.

       -ge, --score-gap INTEGER
              Score per gap character (default = -2 for BLASTN). Default: -1.

       -go, --score-gap-open INTEGER
              Additional cost for opening gap (default = -5 for BLASTN). Default: -11.

       -ma, --score-match INTEGER
              Match score [only BLASTN]. Default: 2.

       -mi, --score-mismatch INTEGER
              Mismatch score [only BLASTN]. Default: -3.

   Extension:
       -x, --x-drop INTEGER
              Stop banded extension if the score drops x below the maximum seen (-1 means no
              xdrop). In range [-1..inf]. Default: 30.

       -b, --band INTEGER
              Size of the DP-band used in extension (-3 means log2 of query length; -2 means sqrt
              of query length; -1 means full dp; n means band of size 2n+1). In range [-3..inf].
              Default: -3.
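
       The meaning of the special -b values can be sketched as follows. Only the 2n+1 rule
       and the full-DP case are taken from the text above. That the log2 and sqrt values take
       the place of n, and the rounding used for them, are assumptions made for this
       illustration; the function names are hypothetical.

```python
import math


def band_half_width(band_option, query_length):
    """Translate a -b value into the half-width n, or None for full DP.

    Assumptions: log2/sqrt of the query length act as n; log2 is rounded up
    and sqrt is rounded down. query_length must be at least 1.
    """
    if band_option < -3:
        raise ValueError("band must be in range [-3..inf]")
    if band_option == -1:
        return None  # full dynamic programming, no band
    if band_option == -3:
        return math.ceil(math.log2(query_length))
    if band_option == -2:
        return math.isqrt(query_length)
    return band_option


def band_size(band_option, query_length):
    """Return the band size 2n+1, or None when no band is used."""
    n = band_half_width(band_option, query_length)
    return None if n is None else 2 * n + 1
```

       Under these assumptions the default of -3 gives a band that grows only
       logarithmically with the query length, so long queries are still extended inside a
       narrow band.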

TUNING

       Tuning the seeding parameters and (de)activating alphabet reduction has a strong influence
       on both speed and sensitivity. We recommend the following alternative profiles for protein
       searches:

       fast (high similarity):       -ar none -sl 7 -sd 0

       sensitive (lower similarity): -so 5

       For further information see the wiki: <https://github.com/seqan/lambda/wiki>
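
       The profiles above can be kept in a small wrapper script. The following sketch only
       assembles the argument list from options documented on this page; the function and
       dictionary names are hypothetical, and actually running the command requires a Lambda
       installation and a precomputed database index.

```python
import shlex

# Extra options per profile, copied from the TUNING section.
PROFILES = {
    "default": [],
    "fast": ["-ar", "none", "-sl", "7", "-sd", "0"],
    "sensitive": ["-so", "5"],
}


def lambda_command(query, database, output="output.m8", profile="default"):
    """Build the argument list for one lambda run with the chosen profile."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile}")
    return ["lambda", *PROFILES[profile], "-q", query, "-d", database, "-o", output]


# Print the command line for inspection instead of executing it.
print(shlex.join(lambda_command("QUERY.fasta", "DATABASE.fasta", profile="fast")))
```

       The returned list can be handed to subprocess.run(cmd, check=True) once the command
       line looks right.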

LEGAL

       lambda  Copyright:  2013-2017 Hannes Hauswedell, released under the GNU GPL v3 (or later);
       2016-2017 Knut Reinert and Freie Universität Berlin, released under the 3-clause-BSDL
       SeqAn Copyright: 2006-2015 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
       In   your   academic   works   please   cite:    Hauswedell    et    al    (2014);    doi:
       10.1093/bioinformatics/btu439
       For full copyright and/or warranty information see --copyright.