Ubuntu Manpage: hhsearch - search a database of HMMs with a query alignment or query HMM

Provided by: hhsuite_3.2.0-2build1_amd64

NAME

       hhsearch - search a database of HMMs with a query alignment or query HMM

SYNOPSIS

       hhsearch -i query -d database [options]

DESCRIPTION

       HHsearch 3.1.0: search a database of HMMs with a query alignment or query HMM.
       (c) The HH-suite development team.

       Soding, J. Protein homology detection by HMM-HMM comparison.
       Bioinformatics 21:951-960 (2005).

       -i <file>
              input/query multiple sequence alignment (a2m, a3m, FASTA) or HMM

       <file> may be 'stdin' or 'stdout' throughout.

   Options:

       -d <name>
              database  name  (e.g. uniprot20_29Feb2012) Multiple databases may be specified with
              '-d <db1> -d <db2> ...'

       -e [0,1]
              E-value cutoff for inclusion in result alignment (def=0.001)

   Input alignment format:
       -M a2m
              use A2M/A3M (default): upper case = Match; lower case = Insert;
              '-' = Delete; '.' = gaps aligned to inserts (may be omitted)

       -M first
              use FASTA: columns with residue in 1st sequence are match states

       -M [0,100]
              use FASTA: columns with fewer than X% gaps are match states
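
       The two FASTA rules above can be sketched in Python. This is only an illustration,
       not hhsearch's own code: the function names and the toy alignment are hypothetical,
       '-' is assumed to be the gap character, rows are counted unweighted, and "fewer
       than X% gaps" is read as a strict comparison.

```python
def match_columns_first(msa):
    """-M first: a column is a match state if the first sequence has a residue there."""
    return [residue != "-" for residue in msa[0]]


def match_columns_gap_percent(msa, x):
    """-M X: a column is a match state if fewer than x% of its rows are gaps."""
    n_rows = len(msa)
    flags = []
    for column in zip(*msa):
        gap_percent = 100.0 * column.count("-") / n_rows
        flags.append(gap_percent < x)
    return flags


# Toy alignment (hypothetical data): 4 sequences, 5 columns.
msa = [
    "AC-GT",
    "A--GT",
    "ACDG-",
    "A--GT",
]
print(match_columns_first(msa))
print(match_columns_gap_percent(msa, 50))
```

       With this toy input, column 2 (50% gaps) is a match state under '-M first' but not
       under '-M 50', which shows how the two rules can disagree.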

       -tags/-notags
              do NOT / do neutralize His-, C-myc-, FLAG-tags, and trypsin recognition sequence to
              background distribution (def=-notags)

   Output options:
       -o <file>
              write results in standard format to file (default=<infile.hhr>)

       -oa3m <file>
              write result MSA with significant matches in a3m format

       -blasttab <name>
              write result in tabular BLAST format (compatible with -m 8 or -outfmt 6 output)

              1     2      3           4         5        6      8    9      10   11   12
              query target #match/tLen #mismatch #gapOpen qstart qend tstart tend eval score
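
       A minimal sketch of reading one line of such a file in Python. It assumes the
       fields are tab-separated and appear in the order listed above; the field names,
       the numeric conversions, and the sample line are hypothetical and should be
       checked against real output.

```python
# Field names follow the column list in the -blasttab description.
FIELDS = [
    "query", "target", "match_per_tlen", "mismatch", "gap_open",
    "qstart", "qend", "tstart", "tend", "evalue", "score",
]
INT_FIELDS = {"qstart", "qend", "tstart", "tend"}
FLOAT_FIELDS = {"evalue", "score"}


def parse_blasttab_line(line):
    """Split one tab-separated line into a dict keyed by FIELDS."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    for name in FLOAT_FIELDS:
        record[name] = float(record[name])
    return record


# Hypothetical sample line, not real hhsearch output.
sample = "query1\ttargetA\t0.85\t3\t1\t5\t120\t7\t122\t1e-20\t150.3\n"
hit = parse_blasttab_line(sample)
print(hit["target"], hit["qstart"], hit["qend"], hit["evalue"])
```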

       -opsi <file>
              write result MSA of significant matches in PSI-BLAST format

       -ohhm <file>
              write HHM file for result MSA of significant matches

       -add_cons
              generate consensus sequence as master sequence of query MSA (default=don't)

       -hide_cons
              don't show consensus sequence in alignments (default=show)

       -hide_pred
              don't show predicted 2ndary structure in alignments (default=show)

       -hide_dssp
              don't show DSSP 2ndary structure in alignments (default=show)

       -show_ssconf
              show confidences for predicted 2ndary structure in alignments

       -Ofas <file>
              write pairwise alignments in FASTA xor A2M (-Oa2m) xor A3M (-Oa3m) format

       -seq <int>
              max. number of query/template sequences displayed (default=1)

       -aliw <int>
              number of columns per line in alignment list (default=80)

       -p [0,100]
              minimum probability in summary and alignment list (default=20)

       -E [0,inf[
              maximum E-value in summary and alignment list (default=1E+06)

       -Z <int>
              maximum number of lines in summary hit list (default=500)

       -z <int>
              minimum number of lines in summary hit list (default=10)

       -B <int>
              maximum number of alignments in alignment list (default=500)

       -b <int>
              minimum number of alignments in alignment list (default=10)

   Filter options applied to query MSA, database MSAs, and result MSA:

       -all   show all sequences in result MSA; do not filter result MSA

       -id [0,100]
              maximum pairwise sequence identity (def=90)

       -diff [0,inf[
              filter MSAs by selecting most diverse set of sequences, keeping at least this many
              seqs in each MSA block of length 50. Zero and non-numerical values turn off the
              filtering. (def=100)

       -cov [0,100]
              minimum coverage with master sequence (%) (def=0)

       -qid [0,100]
              minimum sequence identity with master sequence (%) (def=0)

       -qsc [0,100]
              minimum score per column with master sequence (default=-20.0)

       -neff [1,inf]
              target diversity of multiple sequence alignment (default=off)

       -mark  do not filter out sequences marked by ">@" in their name line

   HMM-HMM alignment options:
       -norealign
              do NOT realign displayed hits with MAC algorithm (def=realign)

       -ovlp <int>
              banded alignment: forbid <ovlp> largest diagonals |i-j| of DP matrix (def=0)

       -mact [0,1[
              posterior prob threshold for MAC realignment controlling  greediness  at  alignment
              ends: 0:global >0.1:local (default=0.35)

       -glob/-loc
              use global/local alignment mode for searching/ranking (def=local)

       -realign
              realign displayed hits with max. accuracy (MAC) algorithm

       -excl <range>
              exclude query positions from the alignment, e.g. '1-33,97-168'
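
       As an illustration of the range syntax, the following Python sketch expands a
       specification such as '1-33,97-168' into explicit positions. It assumes the
       ranges are 1-based and inclusive at both ends; the function names are
       hypothetical and this is not how hhsearch itself parses the option.

```python
def parse_excl(spec):
    """Turn '1-33,97-168' into a list of (start, end) integer pairs."""
    ranges = []
    for part in spec.split(","):
        start, end = part.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def excluded_positions(spec):
    """Expand the ranges into a set of positions (assumed inclusive)."""
    positions = set()
    for start, end in parse_excl(spec):
        positions.update(range(start, end + 1))
    return positions


ranges = parse_excl("1-33,97-168")
positions = excluded_positions("1-33,97-168")
print(ranges)
print(len(positions))
```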

       -realign_max <int>
              realign max. <int> hits (default=500)

       -alt <int>
              show up to this many alternative alignments with raw score > smin (def=4)

       -smin <float>
              minimum raw score for alternative alignments (def=20.0)

       -shift [-1,1]
              profile-profile score offset (def=-0.03)

       -corr [0,1]
              weight of term for pair correlations (def=0.10)

       -sc <int>
              amino acid score (tja: template HMM at column j) (def=1)

              0 = log2 Sum(tja*qia/pa)   (pa: aa background frequencies)
              1 = log2 Sum(tja*qia/pqa)  (pqa = 1/2*(pa+ta))
              2 = log2 Sum(tja*qia/ta)   (ta: av. aa freqs in template)
              3 = log2 Sum(tja*qia/qa)   (qa: av. aa freqs in query)
              5 = local amino acid composition correction
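
       The column scores for modes 0 and 1 can be written out directly from the formulas
       above. The sketch below is illustrative only: it uses a hypothetical two-letter
       alphabet instead of the 20 amino acids, the function names are invented, and it
       ignores everything else hhsearch adds to the score (secondary structure,
       correlations, offsets).

```python
import math


def column_score(q, t, denominator):
    """log2 Sum_a( t[a] * q[a] / denominator[a] ) for one query/template column pair."""
    return math.log2(sum(t[a] * q[a] / denominator[a] for a in denominator))


def mixed_background(pa, ta):
    """pqa = 1/2 * (pa + ta), the denominator used by -sc 1."""
    return {a: 0.5 * (pa[a] + ta[a]) for a in pa}


# Hypothetical two-letter alphabet and frequencies.
pa = {"A": 0.5, "B": 0.5}      # background frequencies
ta = {"A": 0.75, "B": 0.25}    # average frequencies in the template
q_col = {"A": 1.0, "B": 0.0}   # query column i
t_col = {"A": 1.0, "B": 0.0}   # template column j

score_mode0 = column_score(q_col, t_col, pa)
score_mode1 = column_score(q_col, t_col, mixed_background(pa, ta))
print(score_mode0, score_mode1)
```

       In this toy case the template is enriched in 'A', so mode 1 divides by a larger
       value than mode 0 and gives the identical columns a lower score.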

       -ssm {0,..,4}
              0:   no ss scoring
              1,2: ss scoring after or during alignment [default=2]
              3,4: ss scoring after or during alignment, predicted vs. predicted

       -ssw [0,1]
              weight of ss score  (def=0.11)

       -ssa [0,1]
              SS substitution matrix = (1-ssa)*I + ssa*full-SS-substitution-matrix (def=1.00)
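
       The interpolation in the -ssa formula can be sketched as follows. The 2x2 "full"
       matrix is hypothetical toy data, not the secondary-structure matrix shipped with
       HH-suite, and the function name is invented for this example.

```python
def blend_ss_matrix(full, ssa):
    """Return (1 - ssa) * I + ssa * full for a square matrix given as nested lists."""
    n = len(full)
    return [
        [(1.0 - ssa) * (1.0 if i == j else 0.0) + ssa * full[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2x2 matrix used only to show the interpolation.
full = [[0.8, 0.2],
        [0.3, 0.7]]

print(blend_ss_matrix(full, 0.0))  # identity matrix
print(blend_ss_matrix(full, 0.5))  # halfway between identity and full
print(blend_ss_matrix(full, 1.0))  # the full matrix (the default ssa)
```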

       -wg    use global sequence weighting for realignment!

   Gap cost options:
       -gapb [0,inf[
              Transition pseudocount admixture (def=1.00)

       -gapd [0,inf[
              Transition pseudocount admixture for open gap (default=0.15)

       -gape [0,1.5]
              Transition pseudocount admixture for extend gap (def=1.00)

       -gapf ]0,inf]
              factor to increase/reduce gap open penalty for deletes (def=0.60)

       -gapg ]0,inf]
              factor to increase/reduce gap open penalty for inserts (def=0.60)

       -gaph ]0,inf]
              factor to increase/reduce gap extend penalty for deletes (def=0.60)

       -gapi ]0,inf]
              factor to increase/reduce gap extend penalty for inserts (def=0.60)

       -egq [0,inf[
              penalty (bits) for end gaps aligned to query residues (def=0.00)

       -egt [0,inf[
              penalty (bits) for end gaps aligned to template residues (def=0.00)

   Pseudocount (pc) options:
       Context specific hhm pseudocounts:

       -pc_hhm_contxt_mode {0,..,3}
              position dependence of pc admixture 'tau' (pc mode, default=2)

              0: no pseudo counts:    tau = 0
              1: constant             tau = a
              2: diversity-dependent: tau = a/(1+((Neff[i]-1)/b)^c)
              3: CSBlast admixture:   tau = a(1+b)/(Neff[i]+b)
              (Neff[i]: number of effective seqs in local MSA around column i)
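
       The four admixture modes can be evaluated with a few lines of Python. This is a
       sketch of the formulas as printed, using the documented defaults for the
       context-specific case (a=0.9, b=4.0, c=1.0); the function name and the sample
       Neff values are hypothetical.

```python
def tau(mode, neff, a=0.9, b=4.0, c=1.0):
    """Pseudocount admixture for one column with local diversity neff."""
    if mode == 0:
        return 0.0                                    # no pseudocounts
    if mode == 1:
        return a                                      # constant
    if mode == 2:
        return a / (1.0 + ((neff - 1.0) / b) ** c)    # diversity-dependent
    if mode == 3:
        return a * (1.0 + b) / (neff + b)             # CSBlast admixture
    raise ValueError(f"unknown pc mode: {mode}")


for neff in (1.0, 5.0, 10.0):
    print(neff, [round(tau(mode, neff), 3) for mode in range(4)])
```

       Modes 2 and 3 both return a when Neff[i] is 1 and shrink as the alignment becomes
       more diverse, so columns backed by many effective sequences receive less
       pseudocount weight.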

       -pc_hhm_contxt_a [0,1]
              overall pseudocount admixture (def=0.9)

       -pc_hhm_contxt_b [1,inf[
              Neff threshold value for mode 2 (def=4.0)

       -pc_hhm_contxt_c [0,3]
              extinction exponent c for mode 2 (def=1.0)

       Context independent hhm pseudocounts (used for templates; used for query if contxt
       file is not available):

       -pc_hhm_nocontxt_mode {0,..,3}
              position dependence of pc admixture 'tau' (pc mode, default=2)

              0: no pseudo counts:    tau = 0
              1: constant             tau = a
              2: diversity-dependent: tau = a/(1+((Neff[i]-1)/b)^c)
              (Neff[i]: number of effective seqs in local MSA around column i)

       -pc_hhm_nocontxt_a [0,1]
              overall pseudocount admixture (def=1.0)

       -pc_hhm_nocontxt_b [1,inf[
              Neff threshold value for mode 2 (def=1.5)

       -pc_hhm_nocontxt_c [0,3]
              extinction exponent c for mode 2 (def=1.0)

       Context-specific pseudo-counts:

       -nocontxt
              use substitution-matrix instead of context-specific pseudocounts

       -contxt <file>
              context file for computing context-specific pseudocounts (default=)

       -csw [0,inf]
              weight of central position in cs pseudocount mode (def=1.6)

       -csb [0,1]
              weight decay parameter for positions in cs pc mode (def=0.9)

   Other options:
       -v <int>
              verbose mode: 0:no screen output  1:only warnings  2: verbose (def=2)

       -cpu <int>
              number of CPUs to use (for shared memory SMPs) (default=2)

       -scores <file>
              write scores for all pairwise comparisons to file

       -atab <file>
              write all alignments in tabular layout to file

       -maxseq <int>
              max number of input rows (def=65535)

       -maxres <int>
              max number of HMM columns (def=20001)

       -maxmem [1,inf[
              limit memory for realignment (in GB) (def=3.0)

       Example: hhsearch -i a.1.1.1.a3m -d scop70_1.71
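
       When hhsearch is driven from a script, the same call can be assembled as an
       argument list. The Python sketch below only builds and prints the command; the
       helper name and the output file name are hypothetical, and the flags used (-i,
       -d, -e, -o, -cpu) are the ones documented above.

```python
import shlex


def build_hhsearch_command(query, databases, evalue=None, output=None, cpu=None):
    """Assemble an hhsearch argument list; repeated -d flags select several databases."""
    cmd = ["hhsearch", "-i", query]
    for database in databases:
        cmd += ["-d", database]
    if evalue is not None:
        cmd += ["-e", str(evalue)]
    if output is not None:
        cmd += ["-o", output]
    if cpu is not None:
        cmd += ["-cpu", str(cpu)]
    return cmd


cmd = build_hhsearch_command(
    "a.1.1.1.a3m", ["scop70_1.71"], evalue=0.001, output="a.1.1.1.hhr", cpu=2
)
print(shlex.join(cmd))
# To actually run the search (requires hhsearch and the database to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

       Passing a list rather than a single shell string avoids quoting problems when
       file names contain spaces.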

       Download databases from
       <http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/>.