Provided by: lambda-align2_2.0.0-2build1_amd64
NAME
lambda2 mkindexp - the Local Aligner for Massive Biological DatA
SYNOPSIS
lambda2 mkindexp [OPTIONS] -d DATABASE.fasta [-i INDEX.lambda]
DESCRIPTION
Lambda is a local aligner optimized for many query sequences and searches in protein space. It is compatible with BLAST, but much faster than BLAST and many other comparable tools. Detailed information is available in the wiki: <https://github.com/seqan/lambda/wiki>
This is the indexer binary for creating lambda-compatible databases.
OPTIONS
-h, --help
    Display the help message.
-hh, --full-help
    Display the help message with advanced options.
--version
    Display version information.
--copyright
    Display long copyright information.
-v, --verbosity INTEGER
    Display more/less diagnostic output during operation: 0 [only errors]; 1 [default]; 2 [+run-time, options and statistics]. In range [0..2]. Default: 1.

Input Options:
-d, --database INPUT_FILE
    Database sequences. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
-m, --acc-tax-map INPUT_FILE
    An NCBI or UniProt accession-to-taxid mapping file. Download from ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/ or ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/ . Valid filetypes are: .dat[.*] and .accession2taxid[.*], where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
-x, --tax-dump-dir INPUT_DIRECTORY
    A directory that contains nodes.dmp and names.dmp; unzipped from ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz

Output Options:
-i, --index OUTPUT_DIRECTORY
    The output directory for the index files (defaults to "DATABASE.lambda"). Valid filetype is: .lambda.
--db-index-type STRING
    Suffix array or full-text minute space. One of fm and bifm. Default: fm.
--truncate-ids BOOL
    Truncate IDs at first whitespace. This saves a lot of space and is irrelevant for all LAMBDA output formats other than BLAST Pairwise (.m0). One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: on.

Alphabets and Translation:
-a, --input-alphabet STRING
    Alphabet of the database sequences (specify to override auto-detection); if input is Dna, it will be translated. One of auto, dna5, and aminoacid. Default: auto.
-g, --genetic-code INTEGER
    The translation table to use if input is Dna. See https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c for ids (default is generic). Default: 1.
-r, --alphabet-reduction STRING
    Alphabet Reduction for seeding phase. One of none and murphy10. Default: murphy10.

Algorithm:
--algorithm STRING
    Algorithm for SA construction (also used for FM; see Memory Requirements below!). One of mergesort, quicksortbuckets, quicksort, radixsort, and skew7ext. Default: radixsort.
-t, --threads INTEGER
    number of threads to run concurrently (ignored if a == skew7ext). In range [1..40]. Default: 4.
--tmp-dir OUTPUT_DIRECTORY
    temporary directory used by skew, defaults to working directory. Default: /build/lambda-align2-_d4I_T/lambda-align2-2.0.0/build/src.
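EXAMPLES
The following is a minimal sketch of how the options above might be combined to index a small protein database. The file names, the temporary working directory and the two FASTA records are hypothetical placeholders invented for illustration; only the flags themselves come from the OPTIONS list. The script prints the command it would run and executes it only if lambda2 is found on the PATH.

```shell
#!/bin/sh
# Sketch only: paths and sequences are made-up placeholders, not real data.
set -eu

workdir=$(mktemp -d)
db="$workdir/example_db.fasta"

# A two-record protein FASTA file (hypothetical sequences).
cat > "$db" <<'EOF'
>seq1 hypothetical protein one
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
>seq2 hypothetical protein two
MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERS
EOF

# Build the command line from flags documented in OPTIONS:
#   -d  database sequences         -i  output index directory (.lambda)
#   -a  input alphabet             -r  alphabet reduction for seeding
#   -t  number of threads          -v  verbosity (2 adds run-time statistics)
set -- lambda2 mkindexp \
    -d "$db" \
    -i "$workdir/example_db.lambda" \
    --db-index-type fm \
    -a aminoacid \
    -r murphy10 \
    -t 4 \
    -v 2

echo "Would run: $*"

# Run the indexer only if it is actually installed.
if command -v lambda2 >/dev/null 2>&1; then
    "$@"
fi
```

For a nucleotide database, the same sketch would presumably use -a dna5 together with -g to select the translation table; taxonomy support additionally needs the files described under -m and -x.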
REMARKS
Please see the wiki (<https://github.com/seqan/lambda/wiki>) for more information on which indexes to choose and which algorithms to pick. Note that the indexes created are binary and not compatible between different CPU endiannesses. Also, the on-disk format is still subject to change between Lambda versions.
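Because an index is tied to the byte order of the machine that built it, it can help to check the host's endianness before copying an index to another system. The snippet below is a generic sketch that uses only standard tools (printf, od, tr); it is not part of lambda2, and the variable names are illustrative.

```shell
#!/bin/sh
# od reads the two bytes 0x01 0x00 as one unsigned 16-bit integer in host
# byte order: 1 on a little-endian machine, 256 on a big-endian machine.
value=$(printf '\001\000' | od -An -tu2 | tr -d ' \n')

case "$value" in
    1)   byte_order=little ;;
    256) byte_order=big ;;
    *)   byte_order=unknown ;;
esac

echo "host byte order: $byte_order"
```

Running this on both the machine that built the index and the machine that will search it gives a quick indication of whether the index can be shared or needs to be rebuilt.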
LEGAL
lambda2 mkindexp Copyright: 2013-2019 Hannes Hauswedell, released under the GNU AGPL v3 (or later); 2016-2019 Knut Reinert and Freie Universität Berlin, released under the 3-clause-BSDL
SeqAn Copyright: 2006-2015 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
In your academic works please cite: Hauswedell et al (2014); doi: 10.1093/bioinformatics/btu439
For full copyright and/or warranty information see --copyright.