Provided by: pftools_3+dfsg-2build1_amd64

NAME

       pfsearchV2 - search a protein or DNA sequence library for sequence segments matching a profile

SYNOPSIS

       pfsearchV2 [ -abflLrsuxyz ] [ profile-file | - ]
                   [ seq-library-file | - ] [ C=# ] [ W=# ]

DESCRIPTION

       pfsearchV2 compares a query profile against a DNA or protein sequence library.  The result is an unsorted
       list of profile-sequence matches written to the standard output.  A variety of output formats  containing
       different  information  can  be  specified  via  the  options -a, -l, -L, -r, -u, -s, -x, -y, -z, and the
       command-line parameter C=#.   profile-file  contains  a  profile  in  PROSITE  format.   seq-library-file
       contains  a  sequence  library  in EMBL/SWISS-PROT format (assumed by default) or in Pearson/Fasta format
       (indicated by option -f).  pfsearchV2 can be used as a filter if - is used instead of one  of  the  input
       filenames.
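
       The filter usage described above can be driven from a script.  The following is a minimal sketch,
       assuming that pfsearchV2 is installed and on the PATH and that the sequence library is supplied as
       text on standard input.  The helper names build_command and run_filter are hypothetical and are not
       part of pftools; only the -f option, the - placeholder and the C=# parameter come from this page.

```python
import subprocess


def build_command(profile_file, fasta=False, cutoff=None):
    """Assemble a pfsearchV2 argument list that reads the library from stdin.

    Hypothetical helper for illustration; not part of pftools.
    """
    cmd = ["pfsearchV2"]
    if fasta:
        cmd.append("-f")           # library is in Pearson/Fasta format
    cmd.append(profile_file)       # profile in PROSITE format
    cmd.append("-")                # "-" in place of seq-library-file: read stdin
    if cutoff is not None:
        cmd.append(f"C={cutoff}")  # cut-off parameter follows the file arguments
    return cmd


def run_filter(profile_file, library_text, **kwargs):
    """Run pfsearchV2 as a filter and return its standard output as text.

    Assumes the pfsearchV2 executable is installed and on the PATH.
    """
    result = subprocess.run(
        build_command(profile_file, **kwargs),
        input=library_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Show the command line only; nothing is executed here.
    print(build_command("sh3.prf", fasta=True, cutoff=6.0))
```

       Keeping the command construction separate from the call to subprocess.run makes it possible to
       inspect the argument list without having the program installed.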

OPTIONS

       -a     Report  optimal  alignment  scores for all sequences regardless of the cut-off value.  This option
              simultaneously forces DISJOINT=UNIQUE.

       -b     Search the complementary strands of DNA sequences as well.

       -f     Input sequence-library is in Pearson/Fasta format.

       -l     Indicate by number the highest cut-off level exceeded by the match score in the output list.

       -L     Indicate by character string the highest cut-off level exceeded by the match score in  the  output
              list.  Note that the generalized profile format includes a text string field to specify a name for
              a cut-off level. The -L option causes the program to display the first two characters of this text
              string (usually something like "!", "?", "??", etc.) at the beginning of each match description.

       -r     Use  raw  scores  rather than normalized scores for match selection. Normalized scores will not be
              listed in the output.

       -s     List the sequences of the matched regions as well.  The output will be  a  Pearson/Fasta-formatted
              sequence library.

       -u     Force DISJOINT=UNIQUE.

       -x     List profile-sequence alignments in pftools PSA format.

       -y     Display  alignments  between  the  profile  and  the  matched sequence regions in a human-friendly
              format.

       -z     Indicate the starting and ending positions of the matched profile range. The ending position is
              given as a negative offset from the end of the profile. Thus the range [    1,    -1] means the
              entire profile.
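
       The range convention used by -z can be turned into absolute profile positions.  The sketch below
       assumes 1-based positions and that an offset of -1 denotes the last profile position, which is what
       makes [1, -1] cover the entire profile; the treatment of other offsets is extrapolated from that
       single case.  The function name and the profile length of 60 are hypothetical.

```python
def profile_range_to_absolute(start, end_offset, profile_length):
    """Convert a -z style range to 1-based absolute profile positions.

    start is counted from the beginning of the profile; end_offset is a
    negative offset from the end, where -1 is assumed to denote the last
    position (so [1, -1] covers the entire profile).
    """
    if end_offset >= 0:
        raise ValueError("end_offset must be negative")
    end = profile_length + end_offset + 1
    return start, end


if __name__ == "__main__":
    # Hypothetical profile of length 60.
    print(profile_range_to_absolute(1, -1, 60))   # (1, 60)
    print(profile_range_to_absolute(5, -10, 60))  # (5, 51)
```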

PARAMETERS

       C=#    Cut-off value.  Overrides the level zero cut-off value specified in the profile.  An integer
              argument is interpreted as a raw score value, a decimal argument as a normalized score value.
              An integer value forces option -r.

       W=#    Output width.  Output lines will be truncated after W characters.  Default: W=132.
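
       The two parameters can be modelled in a few lines.  This is an illustrative sketch only: it assumes
       that a "decimal argument" is one containing a decimal point, which is not confirmed by this page,
       and the function names are hypothetical.  It does not reproduce the program's actual parsing code.

```python
def interpret_cutoff(arg):
    """Classify a C=# argument as a raw or normalized cut-off.

    Assumption: an argument containing "." is decimal (normalized score);
    anything else is parsed as an integer (raw score, forces option -r).
    """
    if not arg.startswith("C="):
        raise ValueError("expected an argument of the form C=#")
    value = arg[2:]
    if "." in value:
        return {"kind": "normalized", "value": float(value), "forces_r": False}
    return {"kind": "raw", "value": int(value), "forces_r": True}


def truncate_line(line, width=132):
    """Truncate an output line after width characters (default W=132)."""
    return line[:width]


if __name__ == "__main__":
    print(interpret_cutoff("C=6.0"))
    print(interpret_cutoff("C=500"))
    print(len(truncate_line("x" * 200)))
```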

EXAMPLES

       (1)    pfsearchV2 -f sh3.prf sh3.seq C=6.0

              Searches the Pearson/Fasta-formatted protein sequence library sh3.seq for SH3 domains with a  cut-
              off  value  of  6.0  normalized  score  units.   sh3.seq contains 20 SH3 domain-containing protein
              sequences from SWISS-PROT release 32.  sh3.prf contains the PROSITE entry SH3/PS50002.

       (2)    pfsearchV2 -bx ecp.prf CVPBR322 | psa2msa -du | readseq -p -fMSF > ecp.msf

              Generates a multiple sequence alignment of potential E. coli promoters on both strands of plasmid
              pBR322.  ecp.prf contains a profile for E. coli promoters.  CVPBR322 contains EMBL entry
              J01749|CVPBR322.  The result file ecp.msf can be processed further by GCG programs that accept
              MSF files as input.

              See also the psa2msa manual page.

AUTHOR

       Philipp Bucher
       Philipp.Bucher@isrec.unil.ch