Provided by: hhsuite_2.0.16-6_amd64
NAME
hhalign - align a query alignment/HMM to a template alignment/HMM
SYNOPSIS
hhalign -i query [-t template] [options]
DESCRIPTION
HHalign version 2.0.16 (January 2013)

Align a query alignment/HMM to a template alignment/HMM by HMM-HMM alignment. If only one alignment/HMM is given, it is compared to itself, and the best off-diagonal alignment plus all further non-overlapping alignments above the significance threshold are shown.

Remmert M, Biegert A, Hauser A, and Soding J. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 9:173-175 (2011).

(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser

  -i <file>         input query alignment (fasta/a2m/a3m) or HMM file (.hhm)
  -t <file>         input template alignment (fasta/a2m/a3m) or HMM file (.hhm)

Output options:
  -o <file>         write output alignment to file
  -ofas <file>      write alignments in FASTA, A2M (-oa2m) or A3M (-oa3m) format
  -Oa3m <file>      write query alignment in a3m format to file (default=none)
  -Aa3m <file>      append query alignment in a3m format to file (default=none)
  -atab <file>      write alignment as a table (with posteriors) to file (default=none)
  -index <file>     use given alignment to calculate Viterbi score (default=none)
  -v <int>          verbose mode: 0: no screen output  1: only warnings  2: verbose
  -seq [1,inf[      max. number of query/template sequences displayed (def=1)
  -nocons           don't show consensus sequence in alignments (default=show)
  -nopred           don't show predicted secondary structure in alignments (default=show)
  -nodssp           don't show DSSP secondary structure in alignments (default=show)
  -ssconf           show confidences for predicted secondary structure in alignments
  -aliw <int>       number of columns per line in alignment list (def=80)
  -P <float>        for self-comparison: max p-value of alignments (def=0.001)
  -p <float>        minimum probability in summary and alignment list (def=0)
  -E <float>        maximum E-value in summary and alignment list (def=1E+06)
  -Z <int>          maximum number of lines in summary hit list (def=100)
  -z <int>          minimum number of lines in summary hit list (def=1)
  -B <int>          maximum number of alignments in alignment list (def=100)
  -b <int>          minimum number of alignments in alignment list (def=1)
  -rank <int>       specify rank of alignment to write with -Oa3m or -Aa3m option (default=1)

Filter input alignment (options can be combined):
  -id [0,100]       maximum pairwise sequence identity (%) (def=90)
  -diff [0,inf[     filter most diverse set of sequences, keeping at least this many
                    sequences in each block of >50 columns (def=100)
  -cov [0,100]      minimum coverage with query (%) (def=0)
  -qid [0,100]      minimum sequence identity with query (%) (def=0)
  -qsc [0,100]      minimum score per column with query (def=-20.0)

Input alignment format:
  -M a2m            use A2M/A3M (default): upper case = Match; lower case = Insert;
                    '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
  -M first          use FASTA: columns with residue in 1st sequence are match states
  -M [0,100]        use FASTA: columns with fewer than X% gaps are match states

HMM-HMM alignment options:
  -glob/-loc        global or local alignment mode (def=local)
  -alt <int>        show up to this number of alternative alignments (def=1)
  -realign          realign displayed hits with max. accuracy (MAC) algorithm
  -norealign        do NOT realign displayed hits with MAC algorithm (def=realign)
  -mact [0,1[       posterior probability threshold for MAC alignment (def=0.350).
                    A threshold value of 0.0 yields global alignments.
  -sto <int>        use global stochastic sampling algorithm to sample this many alignments
  -excl <range>     exclude query positions from the alignment, e.g. '1-33,97-168'
  -shift [-1,1]     score offset (def=-0.030)
  -corr [0,1]       weight of term for pair correlations (def=0.10)
  -ssm 0-4          0: no ss scoring
                    1: ss scoring after alignment
                    2: ss scoring during alignment [default=2]
  -ssw [0,1]        weight of ss score (def=0.11)
  -def              read default options from ./.hhdefaults or <home>/.hhdefaults

Example: hhalign -i T0187.a3m -t d1hz4a_.hhm -png T0187pdb.png

Output options:
  -o <file>         write output alignment to file
  -ofas <file>      write alignments in FASTA, A2M (-oa2m) or A3M (-oa3m) format
  -Oa3m <file>      write query alignment in a3m format to file (default=none)
  -Aa3m <file>      append query alignment in a3m format to file (default=none)
  -atab <file>      write alignment as a table (with posteriors) to file (default=none)
  -v <int>          verbose mode: 0: no screen output  1: only warnings  2: verbose
  -seq [1,inf[      max. number of query/template sequences displayed (def=1)
  -nocons           don't show consensus sequence in alignments (default=show)
  -nopred           don't show predicted secondary structure in alignments (default=show)
  -nodssp           don't show DSSP secondary structure in alignments (default=show)
  -ssconf           show confidences for predicted secondary structure in alignments
  -aliw <int>       number of columns per line in alignment list (def=80)
  -P <float>        for self-comparison: max p-value of alignments (def=0.001)
  -p <float>        minimum probability in summary and alignment list (def=0)
  -E <float>        maximum E-value in summary and alignment list (def=1E+06)
  -Z <int>          maximum number of lines in summary hit list (def=100)
  -z <int>          minimum number of lines in summary hit list (def=1)
  -B <int>          maximum number of alignments in alignment list (def=100)
  -b <int>          minimum number of alignments in alignment list (def=1)
  -rank <int>       specify rank of alignment to write with -Oa3m or -Aa3m option (default=1)
  -tc <file>        write a TCoffee library file for the pairwise comparison
  -tct [0,100]      min. probability of residue pairs for TCoffee (def=5%)

Options to filter input alignment (options can be combined):
  -id [0,100]       maximum pairwise sequence identity (%) (def=90)
  -diff [0,inf[     filter most diverse set of sequences, keeping at least this many
                    sequences in each block of >50 columns (def=100)
  -cov [0,100]      minimum coverage with query (%) (def=0)
  -qid [0,100]      minimum sequence identity with query (%) (def=0)
  -qsc [0,100]      minimum score per column with query (def=-20.0)

HMM-building options:
  -M a2m            use A2M/A3M (default): upper case = Match; lower case = Insert;
                    '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
  -M first          use FASTA: columns with residue in 1st sequence are match states
  -M [0,100]        use FASTA: columns with fewer than X% gaps are match states
  -tags             do NOT neutralize His-, C-myc-, FLAG-tags, and trypsin recognition
                    sequence to background distribution

Pseudocount (pc) options:
  -pcm 0-2          position dependence of pc admixture 'tau' (pc mode, default=2)
                    0: no pseudo counts:    tau = 0
                    1: constant             tau = a
                    2: diversity-dependent: tau = a/(1 + ((Neff[i]-1)/b)^c)
                       (Neff[i]: number of effective seqs in local MSA around column i)
                    3: constant diversity pseudocounts
  -pca [0,1]        overall pseudocount admixture (def=1.0)
  -pcb [1,inf[      Neff threshold value for -pcm 2 (def=1.5)
  -pcc [0,3]        extinction exponent c for -pcm 2 (def=1.0)
  -pre_pca [0,1]    PREFILTER pseudocount admixture (def=0.8)
  -pre_pcb [1,inf[  PREFILTER threshold for Neff (def=1.8)

Context-specific pseudo-counts:
  -nocontxt         use substitution-matrix instead of context-specific pseudocounts
  -contxt <file>    context file for computing context-specific pseudocounts
                    (default=./data/context_data.lib)
  -cslib <file>     column state file for fast database prefiltering
                    (default=./data/cs219.lib)

Gap cost options:
  -gapb [0,inf[     Transition pseudocount admixture (def=1.00)
  -gapd [0,inf[     Transition pseudocount admixture for open gap (default=0.15)
  -gape [0,1.5]     Transition pseudocount admixture for extend gap (def=1.00)
  -gapf ]0,inf]     factor to increase/reduce the gap open penalty for deletes (def=0.60)
  -gapg ]0,inf]     factor to increase/reduce the gap open penalty for inserts (def=0.60)
  -gaph ]0,inf]     factor to increase/reduce the gap extend penalty for deletes (def=0.60)
  -gapi ]0,inf]     factor to increase/reduce the gap extend penalty for inserts (def=0.60)
  -egq [0,inf[      penalty (bits) for end gaps aligned to query residues (def=0.00)
  -egt [0,inf[      penalty (bits) for end gaps aligned to template residues (def=0.00)

Alignment options:
  -glob/-loc        global or local alignment mode (def=global)
  -mac              use Maximum Accuracy (MAC) alignment instead of Viterbi
  -mact [0,1]       posterior prob threshold for MAC alignment (def=0.350)
  -sto <int>        use global stochastic sampling algorithm to sample this many alignments
  -sc <int>         amino acid score (tja: template HMM at column j) (def=1)
                    0 = log2 Sum(tja*qia/pa)   (pa: aa background frequencies)
                    1 = log2 Sum(tja*qia/pqa)  (pqa = 1/2*(pa+ta))
                    2 = log2 Sum(tja*qia/ta)   (ta: av. aa freqs in template)
                    3 = log2 Sum(tja*qia/qa)   (qa: av. aa freqs in query)
  -corr [0,1]       weight of term for pair correlations (def=0.10)
  -shift [-1,1]     score offset (def=-0.030)
  -r                repeat identification: multiple hits not treated as independent
  -ssm 0-2          0: no ss scoring
                    1: ss scoring after alignment
                    2: ss scoring during alignment [default=2]
  -ssw [0,1]        weight of ss score compared to column score (def=0.11)
  -ssa [0,1]        ss confusion matrix = (1-ssa)*I + ssa*psipred-confusion-matrix (def=1.00)
  -calm 0-3         empirical score calibration of 0: query  1: template  2: both (def=off)

Default options can be specified in './.hhdefaults' or '~/.hhdefaults'.
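
The pseudocount admixture formulas listed under -pcm can be made concrete with a small calculation. The following Python sketch is illustrative only and is not taken from the hhalign source: the function name pseudocount_admixture and its parameter names are hypothetical, the defaults mirror the documented values of -pca (a=1.0), -pcb (b=1.5) and -pcc (c=1.0), and the Neff values in the usage example are made up. Mode 3 is omitted because no formula is given for it above.

```python
def pseudocount_admixture(neff, pcm=2, a=1.0, b=1.5, c=1.0):
    """Return the admixture tau for each column (hypothetical helper).

    neff: per-column number of effective sequences (Neff[i], assumed >= 1)
    pcm:  pseudocount mode as documented for -pcm (only 0, 1, 2 handled)
    a, b, c: values corresponding to -pca, -pcb and -pcc
    """
    if any(n < 1.0 for n in neff):
        # Assumption: Neff is at least 1, so (Neff - 1) is never negative.
        raise ValueError("Neff values are assumed to be >= 1")
    if pcm == 0:
        # Mode 0: no pseudocounts, tau = 0
        return [0.0 for _ in neff]
    if pcm == 1:
        # Mode 1: constant admixture, tau = a
        return [a for _ in neff]
    if pcm == 2:
        # Mode 2: diversity-dependent, tau = a / (1 + ((Neff - 1) / b)^c)
        return [a / (1.0 + ((n - 1.0) / b) ** c) for n in neff]
    raise ValueError("only -pcm modes 0, 1 and 2 are sketched here")


if __name__ == "__main__":
    # Made-up Neff values: a single-sequence column, a moderately
    # diverse column and a highly diverse column.
    for n, tau in zip([1.0, 2.5, 4.0], pseudocount_admixture([1.0, 2.5, 4.0])):
        print(f"Neff={n:.1f}  tau={tau:.3f}")
    # Neff=1.0  tau=1.000
    # Neff=2.5  tau=0.500
    # Neff=4.0  tau=0.333
```

With the default settings, tau falls from 1 toward 0 as the local alignment becomes more diverse, so columns backed by many effective sequences receive a smaller pseudocount admixture than columns backed by few.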