Provided by: hhsuite_3.2.0-2build1_amd64
NAME
hhsearch - search a database of HMMs with a query alignment or query HMM
SYNOPSIS
hhsearch -i query -d database [options]
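The synopsis can be assembled programmatically when hhsearch is driven from a script. The following is a minimal sketch, not part of HH-suite: the helper name build_hhsearch_argv and its keyword arguments are hypothetical, and it only uses flags documented on this page (-i, -d, -o, -e, -cpu).

```python
def build_hhsearch_argv(query, databases, output=None, evalue=None, cpu=None):
    """Return an argument list for hhsearch (hypothetical helper, not an HH-suite API)."""
    if not databases:
        raise ValueError("at least one database name is required (-d)")

    argv = ["hhsearch", "-i", str(query)]

    # The page states that several databases may be given as '-d <db1> -d <db2> ...'.
    for db in databases:
        argv += ["-d", str(db)]

    # Optional flags are appended only when a value is supplied,
    # so hhsearch's own defaults apply otherwise.
    if output is not None:
        argv += ["-o", str(output)]
    if evalue is not None:
        argv += ["-e", str(evalue)]
    if cpu is not None:
        argv += ["-cpu", str(cpu)]
    return argv


argv = build_hhsearch_argv("a.1.1.1.a3m", ["scop70_1.71"], cpu=2)
print(" ".join(argv))
```

The resulting list could be passed to subprocess.run(argv, check=True), assuming hhsearch is installed and on the PATH; building a list rather than a single string avoids shell quoting problems with file names.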
DESCRIPTION
HHsearch 3.1.0
Search a database of HMMs with a query alignment or query HMM
(c) The HH-suite development team
Soding, J. Protein homology detection by HMM-HMM comparison.
Bioinformatics 21:951-960 (2005).

 -i <file>      input/query multiple sequence alignment (a2m, a3m, FASTA) or HMM

<file> may be 'stdin' or 'stdout' throughout.

Options:
 -d <name>      database name (e.g. uniprot20_29Feb2012)
                Multiple databases may be specified with '-d <db1> -d <db2> ...'
 -e [0,1]       E-value cutoff for inclusion in result alignment (def=0.001)

Input alignment format:
 -M a2m         use A2M/A3M (default): upper case = Match; lower case = Insert;
                '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
 -M first       use FASTA: columns with residue in 1st sequence are match states
 -M [0,100]     use FASTA: columns with fewer than X% gaps are match states
 -tags/-notags  do NOT / do neutralize His-, C-myc-, FLAG-tags, and trypsin
                recognition sequence to background distribution (def=-notags)

Output options:
 -o <file>      write results in standard format to file (default=<infile.hhr>)
 -oa3m <file>   write result MSA with significant matches in a3m format
 -blasttab <name> write result in tabular BLAST format (compatible with -m 8 or -outfmt 6 output)
                 1     2      3           4         5        6      8    9      10   11   12
                'query target #match/tLen #mismatch #gapOpen qstart qend tstart tend eval score'
 -opsi <file>   write result MSA of significant matches in PSI-BLAST format
 -ohhm <file>   write HHM file for result MSA of significant matches
 -add_cons      generate consensus sequence as master sequence of query MSA (default=don't)
 -hide_cons     don't show consensus sequence in alignments (default=show)
 -hide_pred     don't show predicted secondary structure in alignments (default=show)
 -hide_dssp     don't show DSSP secondary structure in alignments (default=show)
 -show_ssconf   show confidences for predicted secondary structure in alignments
 -Ofas <file>   write pairwise alignments in FASTA xor A2M (-Oa2m) xor A3M (-Oa3m) format
 -seq <int>     max. number of query/template sequences displayed (default=1)
 -aliw <int>    number of columns per line in alignment list (default=80)
 -p [0,100]     minimum probability in summary and alignment list (default=20)
 -E [0,inf[     maximum E-value in summary and alignment list (default=1E+06)
 -Z <int>       maximum number of lines in summary hit list (default=500)
 -z <int>       minimum number of lines in summary hit list (default=10)
 -B <int>       maximum number of alignments in alignment list (default=500)
 -b <int>       minimum number of alignments in alignment list (default=10)

Filter options applied to query MSA, database MSAs, and result MSA:
 -all           show all sequences in result MSA; do not filter result MSA
 -id [0,100]    maximum pairwise sequence identity (def=90)
 -diff [0,inf[  filter MSAs by selecting most diverse set of sequences, keeping
                at least this many seqs in each MSA block of length 50
                Zero and non-numerical values turn off the filtering. (def=100)
 -cov [0,100]   minimum coverage with master sequence (%) (def=0)
 -qid [0,100]   minimum sequence identity with master sequence (%) (def=0)
 -qsc [0,100]   minimum score per column with master sequence (default=-20.0)
 -neff [1,inf]  target diversity of multiple sequence alignment (default=off)
 -mark          do not filter out sequences marked by ">@" in their name line

HMM-HMM alignment options:
 -norealign     do NOT realign displayed hits with MAC algorithm (def=realign)
 -ovlp <int>    banded alignment: forbid <ovlp> largest diagonals |i-j| of DP matrix (def=0)
 -mact [0,1[    posterior prob threshold for MAC realignment controlling greediness
                at alignment ends: 0:global >0.1:local (default=0.35)
 -glob/-loc     use global/local alignment mode for searching/ranking (def=local)
 -realign       realign displayed hits with max. accuracy (MAC) algorithm
 -excl <range>  exclude query positions from the alignment, e.g. '1-33,97-168'
 -realign_max <int>  realign max. <int> hits (default=500)
 -alt <int>     show up to this many alternative alignments with raw score > smin (def=4)
 -smin <float>  minimum raw score for alternative alignments (def=20.0)
 -shift [-1,1]  profile-profile score offset (def=-0.03)
 -corr [0,1]    weight of term for pair correlations (def=0.10)
 -sc <int>      amino acid score (tja: template HMM at column j) (def=1)
                0 = log2 Sum(tja*qia/pa)   (pa: aa background frequencies)
                1 = log2 Sum(tja*qia/pqa)  (pqa = 1/2*(pa+ta) )
                2 = log2 Sum(tja*qia/ta)   (ta: av. aa freqs in template)
                3 = log2 Sum(tja*qia/qa)   (qa: av. aa freqs in query)
                5 = local amino acid composition correction
 -ssm {0,..,4}  0:   no ss scoring
                1,2: ss scoring after or during alignment (default=2)
                3,4: ss scoring after or during alignment, predicted vs. predicted
 -ssw [0,1]     weight of ss score (def=0.11)
 -ssa [0,1]     SS substitution matrix = (1-ssa)*I + ssa*full-SS-substitution-matrix (def=1.00)
 -wg            use global sequence weighting for realignment!

Gap cost options:
 -gapb [0,inf[  Transition pseudocount admixture (def=1.00)
 -gapd [0,inf[  Transition pseudocount admixture for open gap (default=0.15)
 -gape [0,1.5]  Transition pseudocount admixture for extend gap (def=1.00)
 -gapf ]0,inf]  factor to increase/reduce gap open penalty for deletes (def=0.60)
 -gapg ]0,inf]  factor to increase/reduce gap open penalty for inserts (def=0.60)
 -gaph ]0,inf]  factor to increase/reduce gap extend penalty for deletes (def=0.60)
 -gapi ]0,inf]  factor to increase/reduce gap extend penalty for inserts (def=0.60)
 -egq [0,inf[   penalty (bits) for end gaps aligned to query residues (def=0.00)
 -egt [0,inf[   penalty (bits) for end gaps aligned to template residues (def=0.00)

Pseudocount (pc) options:

 Context specific hhm pseudocounts:
 -pc_hhm_contxt_mode {0,..,3}  position dependence of pc admixture 'tau' (pc mode, default=2)
                0: no pseudo counts:    tau = 0
                1: constant             tau = a
                2: diversity-dependent: tau = a/(1+((Neff[i]-1)/b)^c)
                3: CSBlast admixture:   tau = a(1+b)/(Neff[i]+b)
                (Neff[i]: number of effective seqs in local MSA around column i)
 -pc_hhm_contxt_a [0,1]        overall pseudocount admixture (def=0.9)
 -pc_hhm_contxt_b [1,inf[      Neff threshold value for mode 2 (def=4.0)
 -pc_hhm_contxt_c [0,3]        extinction exponent c for mode 2 (def=1.0)

 Context independent hhm pseudocounts (used for templates; used for query if
 contxt file is not available):
 -pc_hhm_nocontxt_mode {0,..,3}  position dependence of pc admixture 'tau' (pc mode, default=2)
                0: no pseudo counts:    tau = 0
                1: constant             tau = a
                2: diversity-dependent: tau = a/(1+((Neff[i]-1)/b)^c)
                (Neff[i]: number of effective seqs in local MSA around column i)
 -pc_hhm_nocontxt_a [0,1]      overall pseudocount admixture (def=1.0)
 -pc_hhm_nocontxt_b [1,inf[    Neff threshold value for mode 2 (def=1.5)
 -pc_hhm_nocontxt_c [0,3]      extinction exponent c for mode 2 (def=1.0)

 Context-specific pseudo-counts:
 -nocontxt      use substitution-matrix instead of context-specific pseudocounts
 -contxt <file> context file for computing context-specific pseudocounts (default=)
 -csw [0,inf]   weight of central position in cs pseudocount mode (def=1.6)
 -csb [0,1]     weight decay parameter for positions in cs pc mode (def=0.9)

Other options:
 -v <int>       verbose mode: 0:no screen output 1:only warnings 2: verbose (def=2)
 -cpu <int>     number of CPUs to use (for shared memory SMPs) (default=2)
 -scores <file> write scores for all pairwise comparisons to file
 -atab <file>   write all alignments in tabular layout to file
 -maxseq <int>  max number of input rows (def=65535)
 -maxres <int>  max number of HMM columns (def=20001)
 -maxmem [1,inf[  limit memory for realignment (in GB) (def=3.0)

Example: hhsearch -i a.1.1.1.a3m -d scop70_1.71

Download databases from <http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/>.
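
The pseudocount admixture formulas listed for -pc_hhm_contxt_mode and -pc_hhm_nocontxt_mode can be evaluated directly to see how tau falls as the alignment becomes more diverse. The sketch below transcribes only the formulas as printed on this page; the function name pc_admixture is hypothetical, and hhsearch itself may apply further steps (such as clamping) that are not documented here.

```python
def pc_admixture(mode, neff, a, b, c=1.0):
    """Pseudocount admixture tau for one column, following the formulas on this page.

    mode : 0..3, as for -pc_hhm_contxt_mode (mode 3 is listed only for the
           context-specific variant)
    neff : Neff[i], number of effective sequences around column i (>= 1)
    a, b, c : the values given via the ..._a, ..._b and ..._c options
    """
    if neff < 1.0:
        raise ValueError("Neff[i] is expected to be at least 1")
    if mode == 0:
        return 0.0                                   # no pseudocounts
    if mode == 1:
        return a                                     # constant admixture
    if mode == 2:
        return a / (1.0 + ((neff - 1.0) / b) ** c)   # diversity-dependent
    if mode == 3:
        return a * (1.0 + b) / (neff + b)            # CSBlast admixture
    raise ValueError("mode must be 0, 1, 2 or 3")


# Context-specific defaults from this page: a=0.9, b=4.0, c=1.0, mode 2.
for neff in (1.0, 5.0, 9.0):
    print(neff, pc_admixture(2, neff, a=0.9, b=4.0, c=1.0))
```

With a single effective sequence (Neff[i] = 1), modes 2 and 3 both reduce to tau = a, so the full admixture is applied; as Neff[i] grows, tau shrinks and the observed column frequencies dominate.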