Ubuntu Manpage: last-train - Try to find suitable score parameters for aligning the given sequences

NAME

       last-train - Try to find suitable score parameters for aligning the given sequences

SYNOPSIS

       last-train [options] lastdb-name sequence-file(s)
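
       A minimal sketch of an invocation following the synopsis above. The
       names mydb (a database already built with lastdb), queries.fa and
       my.train are hypothetical, and the sketch assumes the trained
       parameters are written to standard output.

```shell
#!/bin/sh
# Hypothetical names: "mydb" is a database built beforehand with lastdb,
# "queries.fa" holds the sequences to train on.
db=mydb
queries=queries.fa
threads=4

# Assemble the command using only options listed on this page.
cmd="last-train -P $threads $db $queries"
echo "$cmd"

# Run it only if last-train is installed and the query file exists.
# Assumption: results go to standard output, so redirect them to a file.
if command -v last-train >/dev/null 2>&1 && [ -f "$queries" ]; then
    $cmd > my.train
fi
```

       The guard keeps the sketch harmless on a machine without last-train;
       drop it once the paths are real.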

DESCRIPTION

       Try to find suitable score parameters for aligning the given sequences.

OPTIONS

       -h, --help
              show this help message and exit

       -v, --verbose
              show more details of intermediate steps

   Training options:

       --revsym
              force reverse-complement symmetry

       --matsym
              force symmetric substitution matrix

       --gapsym
              force insertion/deletion symmetry

       --pid=PID
              skip alignments with > PID% identity (default: 100)

       --postmask=NUMBER
              skip mostly-lowercase alignments (default: 1)

       --sample-number=N
              number of random sequence samples (default: 20000 if --codon else 500)

       --sample-length=L
              length of each sample (default: 2000)
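
              As a rough guide, the two sampling defaults bound how many
              letters are examined: at most sample-number times
              sample-length, assuming every sample reaches its full length.
              A small sketch of that arithmetic (variable names are
              illustrative only):

```shell
#!/bin/sh
# Upper bound on sampled letters = sample-number * sample-length.
sample_length=2000                        # default --sample-length
default_total=$((500 * sample_length))    # default --sample-number without --codon
codon_total=$((20000 * sample_length))    # default --sample-number with --codon
echo "default: at most $default_total letters"
echo "--codon: at most $codon_total letters"
```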

       --scale=S
              output scores in units of 1/S bits
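
              Read literally, a score x reported at scale S corresponds to
              x/S bits. A short sketch of the conversion, using hypothetical
              values for both the scale and the score:

```shell
#!/bin/sh
scale=2     # hypothetical value, as if --scale=2 had been given
score=12    # hypothetical score taken from the output

# bits = score / S, since scores are in units of 1/S bits
bits=$(awk -v s="$score" -v S="$scale" 'BEGIN { printf "%.2f", s / S }')
echo "score $score at scale $scale is $bits bits"
```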

       --codon
              DNA queries & protein reference, with frameshifts

   Initial parameter options:

       -r SCORE
              match score (default: 6 if Q>=1, or 5 if DNA, or 12)

       -q COST
              mismatch cost (default: 18 if Q>=1, or 5 if DNA, or 7)

       -p NAME
              match/mismatch score matrix

       -a COST
              gap existence cost (default: 21 if Q>=1, else 15)

       -b COST
              gap extension cost (default: 9 if Q>=1, else 3)

       -A COST
              insertion existence cost

       -B COST
              insertion extension cost

       -F LIST
              frameshift probabilities: del-1,del-2,ins+1,ins+2 (default: 1-b,1-b,1-B,1-B)

   Alignment options:

       -D LENGTH
              query letters per random alignment (default: 1e6)

       -E EG2
              maximum expected alignments per square giga
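
              One way to read "per square giga", assuming it means the
              expected number of chance alignments between two random
              sequences of 1e9 letters each: for a query of m letters and a
              reference of n letters, the expected count scales as
              EG2 * (m / 1e9) * (n / 1e9). The values below are
              hypothetical.

```shell
#!/bin/sh
eg2=10                 # hypothetical -E value
qry_len=1000000        # hypothetical total query length (letters)
ref_len=3000000000     # hypothetical reference length (letters)

# Assumed interpretation: scale EG2 by both lengths in units of 1e9 letters.
expected=$(awk -v e="$eg2" -v m="$qry_len" -v n="$ref_len" \
    'BEGIN { printf "%.4g", e * (m / 1e9) * (n / 1e9) }')
echo "expected chance alignments: $expected"
```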

       -s STRAND
              0=reverse, 1=forward, 2=both (default: 2 if DNA, else 1)

       -S NUMBER
              score matrix applies to forward strand of: 0=reference, 1=query (default: 1)

       -C COUNT
              omit gapless alignments in COUNT others with > score-per-length

       -T NUMBER
              type of alignment: 0=local, 1=overlap (default: 0)

       -R DIGITS
              lowercase & simple-sequence options

       -m COUNT
              maximum initial matches per query position (default: 10)

       -k STEP
              use initial matches starting at every STEP-th position in each query (default: 1)

       -P THREADS
              number of parallel threads

       -X NUMBER
              N/X is ambiguous in: 0=neither sequence, 1=reference, 2=query, 3=both (default: 0)

       -Q NAME
              input format: fastx, sanger (default: fasta)