
NAME

       lastal5 - genome-scale comparison of biological sequences

SYNOPSIS

       lastal5-plain [options] lastdb-name fasta-sequence-file(s)

DESCRIPTION

       Find and align similar sequences.

   Cosmetic options:
       -h, --help
              show all options and their default settings, and exit

       -V, --version
              show version information, and exit

       -v     be verbose: write messages about what lastal is doing

       -f     output format: TAB, MAF, BlastTab, BlastTab+ (default: MAF)

   E-value options (default settings):
       -D     query letters per random alignment (1e+06)

       -E     maximum expected alignments per square giga (1e+18/D/refSize/numOfStrands)
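
       The default for -E is given above only as a formula. The following is a
       minimal sketch of that arithmetic, not lastal's own code. The function
       name and the example reference size (3e9 letters) are assumptions made
       for illustration; how lastal measures refSize is not stated here.

```python
def default_max_expected_alignments(d=1e6, ref_size=3e9, num_strands=2):
    """Evaluate the documented default: 1e+18 / D / refSize / numOfStrands.

    d           -- the -D value (query letters per random alignment)
    ref_size    -- reference size; 3e9 is a hypothetical example value
    num_strands -- number of strands searched (see -s)
    """
    return 1e18 / d / ref_size / num_strands


# Hypothetical DNA search: both strands of a 3e9-letter reference.
print(default_max_expected_alignments(1e6, 3e9, 2))
# Hypothetical protein search: forward strand only.
print(default_max_expected_alignments(1e6, 3e9, 1))
```

       A larger reference or a larger -D gives a smaller default -E.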

   Score options (default settings):
       -r     match score   (2 if -M, else  6 if 1<=Q<=4, else 1 if DNA)

       -q     mismatch cost (3 if -M, else 18 if 1<=Q<=4, else 1 if DNA)

       -p     match/mismatch score matrix (protein-protein: BL62, DNA-protein: BL80)

       -X     N/X is ambiguous in: 0=neither sequence, 1=reference, 2=query, 3=both (0)

       -a     gap existence cost (DNA: 7, protein: 11, 1<=Q<=4: 21)

       -b     gap extension cost (DNA: 1, protein:  2, 1<=Q<=4:  9)

       -A     insertion existence cost (a)

       -B     insertion extension cost (b)

       -c     unaligned residue pair cost (off)

       -F     frameshift cost(s) (off)

       -x     maximum score drop for preliminary gapped alignments (z)

       -y     maximum score drop for gapless alignments (min[t*10, x])

       -z     maximum score drop for final gapped alignments (e-1)

       -d     minimum score for gapless alignments (min[e, 2500/n query letters per hit])

       -e     minimum score for gapped alignments
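
       Several score options default to values derived from other options
       (x from z, z from e, y from t and x). The sketch below resolves that
       chain in the order the defaults imply. It is an illustration only: the
       function names and the example values of e and t are assumptions, and
       gap_cost assumes the usual affine form in which a gap of k letters
       costs existence + extension * k.

```python
def resolve_score_drops(e, t, x=None, y=None, z=None):
    """Fill in -x, -y and -z from the documented defaults.

    z defaults to e - 1, x defaults to z, y defaults to min(t * 10, x).
    Any value passed explicitly is kept as given.
    """
    if z is None:
        z = e - 1
    if x is None:
        x = z
    if y is None:
        y = min(t * 10, x)
    return {"x": x, "y": y, "z": z}


def gap_cost(length, existence, extension):
    """Affine gap cost (assumed form): existence + extension * length."""
    return existence + extension * length


# Hypothetical values: minimum gapped score e = 100, temperature t = 3.0.
print(resolve_score_drops(e=100, t=3.0))
# A 3-letter gap with the documented DNA defaults (-a 7, -b 1).
print(gap_cost(3, existence=7, extension=1))
```

       Because -A and -B default to a and b, insertions and deletions cost the
       same unless -A or -B is set explicitly.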

   Initial-match options (default settings):
       -m     maximum initial matches per query position (10)

       -l     minimum length for initial matches (1)

       -L     maximum length for initial matches (infinity)

       -k     use initial matches starting at every k-th position in each query (1)

       -W     use "minimum" positions in sliding windows of W consecutive positions
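
       The -k and -W options both thin out the query positions at which
       initial matches may start. The sketch below illustrates the two ideas
       only. It assumes that -k counts from the first query position, and it
       uses lexicographic order of the sequence suffix as a stand-in for the
       "minimum" ordering, which is not specified here; lastal's actual
       ordering may differ. All names are hypothetical.

```python
def every_kth_position(query_length, k=1):
    """Start positions kept when sampling every k-th position (from 0, assumed)."""
    return list(range(0, query_length, k))


def window_minimum_positions(seq, w):
    """Keep the 'minimum' start position of each window of w consecutive positions.

    The ordering used here (lexicographic order of the suffix that starts at
    each position) is an assumption chosen for illustration.
    """
    chosen = set()
    for start in range(len(seq) - w + 1):
        window = range(start, start + w)
        # min() returns the leftmost position when several compare equal.
        chosen.add(min(window, key=lambda p: seq[p:]))
    return sorted(chosen)


print(every_kth_position(10, k=3))
print(window_minimum_positions("GATTACA", 3))
```

       Both schemes reduce the number of start positions. The window scheme
       keeps at least one position in every run of W consecutive positions.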

   Miscellaneous options (default settings):
       -s     strand: 0=reverse, 1=forward, 2=both (2 for DNA, 1 for protein)

       -S     score matrix applies to forward strand of: 0=reference, 1=query (0)

       -K     omit alignments whose query range lies in >= K others with > score (off)

       -C     omit gapless alignments in >= C others with > score-per-length (off)

       -P     number of parallel threads (1)

       -i     query batch size (64M if multi-volume, else off)

       -M     find minimum-difference alignments (faster but cruder)

       -T     type of alignment: 0=local, 1=overlap (0)

       -n     maximum gapless alignments per query position (infinity if m=0, else m)

       -N     stop after the first N alignments per query strand

       -R     lowercase & simple-sequence options (the same as was used by lastdb)

       -u     mask lowercase during extensions: 0=never, 1=gapless, 2=gapless+postmask, 3=always
              (2 if lastdb -c and Q!=pssm, else 0)

       -w     suppress repeats inside exact matches, offset by <= this distance (1000)

       -G     genetic code (1)

       -t     'temperature' for calculating probabilities (1/lambda)

       -g     'gamma' parameter for gamma-centroid and LAMA (1)

       -j     output type: 0=match counts, 1=gapless, 2=redundant gapped, 3=gapped,
              4=column ambiguity estimates, 5=gamma-centroid, 6=LAMA, 7=expected counts (3)

       -J     score type: 0=ordinary, 1=full (1 for new-style frameshifts, else 0)

       -Q     input format: fastx, keep, sanger, solexa, illumina, prb, pssm
              (default: fasta)
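
       A command line follows the SYNOPSIS order: options first, then the
       lastdb name, then the query file(s). The sketch below assembles such
       a command as an argument list without running it. The helper name, the
       database name "mydb" and the file name "queries.fa" are hypothetical.
       The program name is a parameter because the installed executable name
       may differ. Only options listed above (-P, -f, -s) are used.

```python
import shlex


def build_lastal_command(db_name, query_files, program="lastal5",
                         threads=None, out_format=None, strand=None):
    """Return an argv list: program, options, lastdb name, query file(s)."""
    args = [program]
    if threads is not None:
        args += ["-P", str(threads)]   # number of parallel threads
    if out_format is not None:
        args += ["-f", out_format]     # TAB, MAF, BlastTab or BlastTab+
    if strand is not None:
        args += ["-s", str(strand)]    # 0=reverse, 1=forward, 2=both
    args.append(db_name)
    args.extend(query_files)
    return args


# Hypothetical database and query file names.
cmd = build_lastal_command("mydb", ["queries.fa"], threads=4, out_format="TAB")
print(shlex.join(cmd))
```

       The resulting list can be passed to subprocess.run on a system where
       the program is installed.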

   Split options:
       --split
              do split alignment

       --splice
              do spliced alignment

       --split-f=FMT
              output format: MAF, MAF+

       --split-d=D
              RNA direction: 0=reverse, 1=forward, 2=mixed (default: 1)

       --split-c=PROB
              cis-splice probability per base (default: 0.004)

       --split-t=PROB
              trans-splice probability per base (default: 1e-05)

       --split-M=MEAN
              mean of ln[intron length] (default: 7.0)

       --split-S=SDEV
              standard deviation of ln[intron length] (default: 1.7)

       --split-m=PROB
              maximum mismap probability (default: 1.0)

       --split-s=INT
              minimum alignment score (default: e OR e+t*ln[100])

       --split-n
              write original, not split, alignments

       --split-b=B
              maximum memory (default: 8T for split, 8G for spliced)
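
       The --split-M and --split-S options give the mean and standard
       deviation of ln[intron length]. Assuming these parameterise a
       log-normal distribution of intron lengths (an assumption; the model is
       not spelled out above), the defaults can be interpreted with the sketch
       below. The function name is hypothetical.

```python
import math


def intron_length_density(length, mean_ln=7.0, sdev_ln=1.7):
    """Log-normal density of an intron length, given the mean and standard
    deviation of ln[length] (defaults taken from --split-M and --split-S)."""
    z = (math.log(length) - mean_ln) / sdev_ln
    return math.exp(-0.5 * z * z) / (length * sdev_ln * math.sqrt(2.0 * math.pi))


# Under a log-normal model the median length is exp(mean of ln[length]).
median_length = math.exp(7.0)
print(round(median_length, 1))
print(intron_length_density(median_length))
```

       With the default mean of 7.0, the median intron length under this model
       is about 1100 bases. A larger --split-S spreads the distribution over a
       wider range of lengths.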