Provided by: hhsuite_2.0.16-1ubuntu1_amd64
NAME
hhsearch - search a database of HMMs with a query alignment or query HMM
SYNOPSIS
hhsearch -i query -d database [options]
DESCRIPTION
HHsearch version 2.0.16 (January 2013)
Search a database of HMMs with a query alignment or query HMM
(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser
Soding, J. Protein homology detection by HMM-HMM comparison. Bioinformatics 21:951-960 (2005).

 -i <file>       input/query multiple sequence alignment (a2m, a3m, FASTA) or HMM
 -d <file>       HMM database of concatenated HMMs in hhm, HMMER, or a3m format,
                 OR, if file has extension pal, list of HMM file names, one per line.
                 Multiple dbs, HMMs, or pal files with -d '<db1> <db2>...'

<file> may be 'stdin' or 'stdout' throughout.

Output options:
 -o <file>       write results in standard format to file (default=<infile.hhr>)
 -Ofas <file>    write pairwise alignments of significant matches in FASTA format
                 Analogous for output in a3m, a2m, and psi format (e.g. -Oa3m)
 -oa3m <file>    write MSA of significant matches in a3m format
                 Analogous for output in a2m, psi, and hhm format (e.g. -ohhm)
 -e [0,1]        E-value cutoff for inclusion in multiple alignment (def=0.001)
 -seq <int>      max. number of query/template sequences displayed (def=1)
                 Beware of overflows! All these sequences are stored in memory.
 -cons           show consensus sequence as master sequence of query MSA
 -nocons         don't show consensus sequence in alignments (default=show)
 -nopred         don't show predicted secondary structure in alignments (default=show)
 -nodssp         don't show DSSP secondary structure in alignments (default=show)
 -ssconf         show confidences for predicted secondary structure in alignments
 -p <float>      minimum probability in summary and alignment list (def=20)
 -E <float>      maximum E-value in summary and alignment list (def=1E+06)
 -Z <int>        maximum number of lines in summary hit list (def=500)
 -z <int>        minimum number of lines in summary hit list (def=10)
 -B <int>        maximum number of alignments in alignment list (def=500)
 -b <int>        minimum number of alignments in alignment list (def=10)
 -aliw [40,..[   number of columns per line in alignment list (def=80)
 -dbstrlen       max length of database string to be printed in hhr file

Filter query multiple sequence alignment:
 -id [0,100]     maximum pairwise sequence identity (%) (def=90)
 -diff [0,inf[   filter MSA by selecting most diverse set of sequences, keeping
                 at least this many seqs in each MSA block of length 50 (def=100)
 -cov [0,100]    minimum coverage with query (%) (def=0)
 -qid [0,100]    minimum sequence identity with query (%) (def=0)
 -qsc [0,100]    minimum score per column with query (def=-20.0)
 -neff [1,inf]   target diversity of alignment (default=off)

Input alignment format:
 -M a2m          use A2M/A3M (default): upper case = Match; lower case = Insert;
                 '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
 -M first        use FASTA: columns with residue in 1st sequence are match states
 -M [0,100]      use FASTA: columns with fewer than X% gaps are match states
 -tags           do NOT neutralize His-, C-myc-, FLAG-tags, and trypsin
                 recognition sequence to background distribution

HMM-HMM alignment options:
 -norealign      do NOT realign displayed hits with MAC algorithm (def=realign)
 -mact [0,1[     posterior probability threshold for MAC re-alignment (def=0.350)
                 Parameter controls alignment greediness: 0:global >0.1:local
 -glob/-loc      use global/local alignment mode for searching/ranking (def=local)
 -alt <int>      show up to this many significant alternative alignments (def=2)
 -vit            use Viterbi algorithm for searching/ranking (default)
 -mac            use Maximum Accuracy (MAC) algorithm for searching/ranking
 -forward        use Forward probability for searching
 -excl <range>   exclude query positions from the alignment, e.g. '1-33,97-168'
 -shift [-1,1]   score offset (def=-0.03)
 -corr [0,1]     weight of term for pair correlations (def=0.10)
 -sc <int>       amino acid score (tja: template HMM at column j) (def=1)
                 0 = log2 Sum(tja*qia/pa)   (pa: aa background frequencies)
                 1 = log2 Sum(tja*qia/pqa)  (pqa = 1/2*(pa+ta))
                 2 = log2 Sum(tja*qia/ta)   (ta: av. aa freqs in template)
                 3 = log2 Sum(tja*qia/qa)   (qa: av. aa freqs in query)
                 5 = local amino acid composition correction
 -ssm {0,..,4}   0:   no ss scoring
                 1,2: ss scoring after or during alignment [default=2]
                 3,4: ss scoring after or during alignment, predicted vs. predicted
 -ssw [0,1]      weight of ss score compared to column score (def=0.11)
 -ssa [0,1]      SS substitution matrix = (1-ssa)*I + ssa*full-SS-substitution-matrix
                 (def=1.00)

Gap cost options:
 -gapb [0,inf[   Transition pseudocount admixture (def=1.00)
 -gapd [0,inf[   Transition pseudocount admixture for open gap (default=0.15)
 -gape [0,1.5]   Transition pseudocount admixture for extend gap (def=1.00)
 -gapf ]0,inf]   factor to increase/reduce the gap open penalty for deletes (def=0.60)
 -gapg ]0,inf]   factor to increase/reduce the gap open penalty for inserts (def=0.60)
 -gaph ]0,inf]   factor to increase/reduce the gap extend penalty for deletes (def=0.60)
 -gapi ]0,inf]   factor to increase/reduce the gap extend penalty for inserts (def=0.60)
 -egq [0,inf[    penalty (bits) for end gaps aligned to query residues (def=0.00)
 -egt [0,inf[    penalty (bits) for end gaps aligned to template residues (def=0.00)

Pseudocount (pc) options:
 -pcm {0,..,3}   position dependence of pc admixture 'tau' (pc mode, default=2)
                 0: no pseudocounts: tau = 0
                 1: constant tau = a
                 2: diversity-dependent: tau = a/(1 + ((Neff[i]-1)/b)^c)
                    (Neff[i]: number of effective seqs in local MSA around column i)
                 3: constant diversity pseudocounts
 -pca [0,1]      overall pseudocount admixture (def=1.0)
 -pcb [1,inf[    Neff threshold value for -pcm 2 (def=1.5)
 -pcc [0,3]      extinction exponent c for -pcm 2 (def=1.0)

Context-specific pseudocounts:
 -nocontxt       use substitution-matrix instead of context-specific pseudocounts
 -contxt <file>  context file for computing context-specific pseudocounts
                 (default=/usr/lib/hhsuite/data/context_data.lib)
 -cslib <file>   column state file for fast database prefiltering
                 (default=/usr/lib/hhsuite/data/cs219.lib)
 -csw [0,inf]    weight of central position in cs pseudocount mode (def=1.6)
 -csb [0,1]      weight decay parameter for positions in cs pc mode (def=0.9)

Other options:
 -cpu <int>      number of CPUs to use (for shared memory SMPs) (default=1)
 -v <int>        verbose mode: 0:no screen output  1:only warnings  2:verbose
 -maxres <int>   max number of HMM columns (def=15002)
 -maxmem [1,inf[ max available memory in GB (def=3.0)
 -scores <file>  write scores for all pairwise comparisons to file
 -calm {0,..,3}  empirical score calibration of 0:query 1:template 2:both
                 default 3: neural network-based estimation of EVD params

Example:
 hhsearch -i a.1.1.1.a3m -d scop70_1.71.hhm
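
The diversity-dependent pseudocount admixture listed under -pcm 2, tau = a/(1 + ((Neff[i]-1)/b)^c), can be evaluated by hand to see how quickly the pseudocount weight falls off as an alignment becomes more diverse. The following is a minimal Python sketch, not code from hhsearch itself: the function name pc_admixture is hypothetical, its defaults are copied from the -pca, -pcb, and -pcc entries above, and mode 3 is left out because the help text gives no formula for it.

```python
def pc_admixture(neff, pcm=2, a=1.0, b=1.5, c=1.0):
    """Pseudocount admixture tau for one alignment column.

    neff: number of effective sequences around the column (Neff[i]).
    pcm:  pseudocount mode as described for -pcm (only 0, 1, 2 handled).
    a, b, c: values of -pca, -pcb, -pcc (defaults taken from the help text).
    """
    if pcm == 0:
        # Mode 0: no pseudocounts at all.
        return 0.0
    if pcm == 1:
        # Mode 1: constant admixture, independent of diversity.
        return a
    if pcm == 2:
        if neff < 1.0:
            raise ValueError("Neff must be at least 1")
        # Mode 2: admixture decays as the local alignment gets more diverse.
        return a / (1.0 + ((neff - 1.0) / b) ** c)
    # Mode 3 is named in the help text but no formula is given there.
    raise ValueError("unsupported pcm mode: %r" % (pcm,))


if __name__ == "__main__":
    for neff in (1.0, 2.5, 4.0, 10.0):
        print("Neff=%5.1f  tau=%.3f" % (neff, pc_admixture(neff)))
```

With the default parameters, a single-sequence query (Neff = 1) gets the full admixture tau = a, and tau is halved once Neff reaches 1 + b = 2.5.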