Provided by: profphd_1.0.42-2_all

NAME

       prof - secondary structure and solvent accessibility predictor

SYNOPSIS

       prof [INPUTFILE+] [OPTIONS]

DESCRIPTION

       Secondary structure is predicted by a system of neural networks with an expected average accuracy of
       more than 72% for the three states helix, strand and loop (Rost & Sander, PNAS, 1993, 90, 7558-7562;
       Rost & Sander, JMB, 1993, 232, 584-599; and Rost & Sander, Proteins, 1994, 19, 55-72; evaluation of
       accuracy). Evaluated on the same data set, PROFsec achieves a three-state accuracy ten percentage
       points higher than methods using only single-sequence information, and more than six percentage
       points higher than, e.g., a method using alignment information based on statistics (Levin,
       Pascarella, Argos & Garnier, Prot. Engng., 1993, 6, 849-54). PHDsec predictions have three main
       features:

       1. improved accuracy through evolutionary information from multiple sequence alignments
       2. improved beta-strand prediction through a balanced training procedure
       3. more accurate prediction of secondary structure segments by using a multi-level system

       Solvent  accessibility  is  predicted  by  a  neural  network  method rating at a correlation coefficient
       (correlation between experimentally observed and predicted relative solvent accessibility) of 0.54 cross-
       validated on a set of 238 globular proteins (Rost & Sander, Proteins, 1994, 20,  216-226;  evaluation  of
       accuracy).  The  output of the neural network codes for 10 states of relative accessibility. Expressed in
       units of the difference between prediction by homology modelling (best method) and prediction  at  random
       (worst  method), PROFacc is some 26 percentage points superior to a comparable neural network using three
       output states (buried, intermediate, exposed) and using no information from multiple alignments.

       Transmembrane helices in integral membrane proteins are predicted by a system of neural networks. The
       shortcoming of the network system is that it often predicts helices that are too long; these are cut
       by an empirical filter. The final prediction (Rost et al., Protein Science, 1995, 4, 521-533;
       evaluation of accuracy) has an expected per-residue accuracy of about 95%. The rate of false
       positives, i.e., transmembrane helices predicted in globular proteins, is about 2%. The neural network
       prediction of transmembrane helices (PHDhtm) is refined by a dynamic programming-like algorithm. This
       method resulted in correct predictions of all transmembrane helices for 89% of the 131 proteins used
       in a cross-validation test; more than 98% of the transmembrane helices were correctly predicted. The
       output of this method is used to predict topology, i.e., the orientation of the N-terminus with
       respect to the membrane. The expected accuracy of the topology prediction is > 86%. Prediction
       accuracy is higher than average for eukaryotic proteins and lower than average for prokaryotes.
       PHDtopology is more accurate than all other methods tested on identical data sets.

       If no output file option (such as --fileRdb or --fileOut) is given, the RDB-formatted output is
       written to ./INPUTFILENAME.prof, where 'prof' replaces the extension of the input file. If the input
       file has no extension, '.prof' is appended to the input file name.
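
       The naming rule above can be sketched in POSIX shell. This is an illustration only, not code taken
       from prof: the helper name default_prof_output is hypothetical, and the sketch assumes that leading
       directories are dropped because the output goes to the current directory.

```shell
#!/bin/sh
# Sketch only: mirrors the default output naming rule described above.
# default_prof_output is a hypothetical helper, not part of prof.
default_prof_output() {
    name=${1##*/}                  # assumption: drop directories, output goes to ./
    case $name in
        ?*.*) name=${name%.*} ;;   # input has an extension: strip the last one
    esac
    printf './%s.prof\n' "$name"   # 'prof' replaces (or is appended as) the extension
}

default_prof_output /share/profphd/prof/exa/1ppt.hssp   # prints ./1ppt.prof
default_prof_output query                               # prints ./query.prof
```

       Passing an explicit output option such as fileRdb avoids relying on this default.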

   Output format
       The RDB format is self-annotating, see example outputs in /share/profphd/prof/exa.

REFERENCES

       Rost, B. and Sander, C. (1994a). Combining evolutionary information and neural networks to predict
       protein secondary structure. Proteins, 19(1), 55-72.
       Rost, B. and Sander, C. (1994b). Conservation and prediction of solvent accessibility in protein
       families. Proteins, 20(3), 216-26.
       Rost, B., Casadio, R., Fariselli, P., and Sander, C. (1995). Transmembrane helices predicted at 95%
       accuracy. Protein Sci, 4(3), 521-33.

OPTIONS

       See each keyword for more help.  Most of these are likely to be broken.

       a   alternative connectivity patterns (default=3)

       3   predict sec + acc + htm

       acc predict solvent accessibility, only

       ali add alignment to 'human-readable' PROF output file(s)

       arch
           system architecture (e.g.: SGI64|SGI5|SGI32|SUNMP|ALPHA)

       ascii
           write 'human-readable' PROF output file(s)

       best
           PROF with best accuracy and longest run-time

       both
           predict secondary structure and solvent accessibility

       data
           data=<all|brief|normal|detail>  for HTML output: write only the selected parts of the prediction

       debug
           keep most intermediate files, print debugging messages

       dirWork
           work directory, default: a temporary directory from File::Temp::tempdir. Must be a fully
           qualified path.

           Known to work.

       doEval
           DO evaluation for list (only for known structures and lists)

       doFilterHssp
           filter the input HSSP file (excluding some pairs)

       doHtmfil
           DO filter the membrane prediction                  (default)

       doHtmisit
           DO check strength of predicted membrane helix      (default)

       doHtmref
           DO refine the membrane prediction                  (default)

       doHtmtop
           DO membrane helix topology                         (default)

       dssp
           convert PROF into DSSP format

       expand
           expand insertions when converting output to MSF format

       fast
           PROF with lowest accuracy and highest speed

       fileCasp
           name of PROF output in CASP format              (file.caspProf)

       fileDssp
           name of PROF output in DSSP format              (file.dsspProf)

       fileHtml
           name of PROF output in HTML format              (file.htmlProf)

       fileMsf
           name of PROF output in MSF format               (file.msfProf)

       fileNotHtm
           name of file flagging that no membrane helix was found

       fileOut
           name of PROF output in RDB format               (file.rdbProf)

           Known to work.

       fileProf
           name of PROF output in human readable format    (file.prof)

           Broken.

       fileRdb
           name of PROF output in RDB format               (file.rdbProf)

           Known to work.

       fileSaf
           name of PROF output in SAF format               (file.safProf)

       filter
           filter the input HSSP file (excluding some pairs)

       good
           PROF with good accuracy and moderate speed

       graph
           add ASCII graph to 'human-readable' PROF output file(s)

       htm use: 'htm=<N|0.N>' gives the threshold for the minimal transmembrane helix detected. Default is
           'htm=8' (resp. 'htm=0.8'). Smaller numbers give more false positives and fewer false negatives!

       html   argument
           'html' or 'html=<all|body|head>' writes the prediction in HTML format:
           'html' converts the PROF output to HTML,
           'html=body' restricts the HTML file to the HTML_BODY tag part,
           'html=head' restricts the HTML file to the HTML_HEADER tag part,
           'html=all' gives both HEADER and BODY.

       keepConv
           keep the conversion of the input file to HSSP format

       keepFilter argument
           <*|doKeepFilter=1>     keep the filtered HSSP file

       keepHssp  argument
           <*|doKeepHssp=1>         keep the intermediate HSSP file

       keepNetDb argument
           <*|doKeepNetDb=1>       keep the intermediate DbNet file(s)

       list argument
           <*|isList=1>      input file is list of files

       msf convert PROF into MSF format

       nice
           give 'nice-D' to set the nice value (priority) of the job

       noProfHead
           do NOT copy file with tables into local directory

       noSearch
           short for doSearchFile=0, i.e. no searching of DB files

       noascii
           suppress writing ASCII (i.e. human readable) result files

       nohtml
           suppress writing HTML result files

       nonice
           job will not be niced, i.e. not run with lower priority

       notEval
           DO NOT check accuracy even when known structures

       notHtmfil
           do NOT filter the membrane prediction

       notHtmisit
           do NOT check whether or not membrane helix strong enough

       notHtmref
           do NOT refine the membrane prediction

       notHtmtop
           do NOT predict membrane helix topology

       nresPerLineAli
           Number of characters per line used for MSF file. Default: 50.

       numresMin
           Minimal number of residues to run network, otherwise prd=symbolPrdShort. Default: 9.

       optJury
           Adds PHD to jury. Default: `normal,usePHD'.

           Many other  parameters  change  the  default  for  this  one  as  a  side-effect,  the  list  is  not
           comprehensive:

           phd, nophd, /^para(3|Both|Sec|Acc|Htm|CapH|CapE|CapHE)/, /^para?/, jct

       para3
           Parameter file for sec+acc+htm. Default: `<DIRPROF>/net/PROFboth_best.par'.

       paraAcc
           Parameter file for acc. Default: `<DIRPROF>/net/PROFacc_best.par'.

       paraBoth
           Parameter file for sec+acc. Default: `<DIRPROF>/net/PROFboth_best.par'.

       paraSec
           Parameter file for sec. Default: `<DIRPROF>/net/PROFsec_best.par'.

       riSubAcc
           Minimal reliability index (RI) for subset PROFacc. Default: 4.

       riSubSec
           Minimal reliability index (RI) for subset PROFsec. Default: 5.

       riSubSym
           Symbol for residues predicted with RI < riSubSec/Acc. Default: `.'.

       s_k_i_p
           problems, manual, hints, notation, txt, known, DONE, Date, date, aa, Lhssp, numaa, code

       saf convert PROF into SAF format

       scrAddHelp
       scrGoal
           neural network switching

       scrHelpTxt
           Input file formats accepted:       hssp,dssp,msf,saf,fastamul,pirmul,fasta,pir,gcg,swiss

       scrIn
           list_of_files (or single file) parameter_file

       scrName
           prof

       scrNarg
           2

       sec predict secondary structure,   only

       silent
           no information written to screen - this is the default

       skipMissing
           do not abort if input file missing!

       sourceFile
           prof

       test
           is just a test (faster)

       translate-jobid-in-param-values
           String 'jobid' gets substituted with $par{jobid}

       tst quick run through program, low accuracy

       user
           user name

       --version
           Print version
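
       The effect of riSubSec, riSubAcc and riSubSym described above can be pictured with a small sketch.
       It is not prof's implementation: the helper name mask_low_ri is hypothetical, and the sketch
       assumes one single-digit reliability index (0-9) per residue, given as a string aligned with the
       prediction string.

```shell
#!/bin/sh
# Sketch only: replace every residue whose RI is below a minimum by a symbol.
# Usage: mask_low_ri PREDICTION RI_STRING MIN_RI [SYMBOL]
# mask_low_ri is a hypothetical helper; SYMBOL defaults to '.', as for riSubSym.
mask_low_ri() {
    awk -v pred="$1" -v ri="$2" -v min="$3" -v sym="${4:-.}" 'BEGIN {
        out = ""
        for (i = 1; i <= length(pred); i++) {
            # assumption: the i-th digit of ri is the RI of the i-th residue
            if (substr(ri, i, 1) + 0 < min)
                out = out sym                  # RI below the minimum: mask
            else
                out = out substr(pred, i, 1)   # RI at or above the minimum: keep
        }
        print out
    }'
}

mask_low_ri HHHHLLEEEE 9873125799 5   # prints HHH...EEEE
```

       Residues with an RI equal to the minimum are kept, matching the `RI < riSubSec/Acc' wording above.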

AUTHOR

       Rost B, Sander C, Fariselli P, Casadio R, Liu J, Yachdav G, Kajan L.

EXAMPLES

       Prediction from alignment in HSSP file for best results
            prof /share/profphd/prof/exa/1ppt.hssp fileRdb=/tmp/1ppt.hssp.prof

       Prediction from a single sequence
            prof /share/profphd/prof/exa/1ppt.f fileRdb=/tmp/1ppt.f.rdbProf

       phd.pl invocation
            /share/profphd/prof/embl/phd.pl /share/profphd/prof/exa/1ppt.hssp htm fileOutPhd=/tmp/query.phdPred  fileOutRdb=/tmp/query.phdRdb  fileNotHtm=/tmp/query.phdNotHtm
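
       Several input files
            A minimal sketch of a dry run that prints one prof command per input file, each with an
            explicit fileRdb target as in the examples above. The helper name prof_batch_commands and
            the choice of naming outputs <input file name>.rdbProf are assumptions made for this
            illustration.

```shell
#!/bin/sh
# Sketch only: print (do not run) one prof command per input file.
# Usage: prof_batch_commands OUTDIR INPUTFILE...
# prof_batch_commands is a hypothetical helper, not part of the package.
prof_batch_commands() {
    outdir=$1
    shift
    for input in "$@"; do
        base=${input##*/}    # file name without leading directories
        printf 'prof %s fileRdb=%s/%s.rdbProf\n' "$input" "$outdir" "$base"
    done
}

prof_batch_commands /tmp \
    /share/profphd/prof/exa/1ppt.hssp \
    /share/profphd/prof/exa/1ppt.f
```

            The printed lines can be inspected first and then piped to sh to run them; paths containing
            whitespace would need extra quoting.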

ENVIRONMENT

       PROFPHDDIR
           Override the prof package directory (default: /share/profphd).

       RGUTILSDIR
           Override the location of librg-utils-perl (default: /share/librg-utils-perl).

FILES

       *.rdbProf
           default output file extension

       /share/profphd/prof
           default data directory

BUGS

       Please report bugs at <https://rostlab.org/bugzilla3/enter_bug.cgi?product=profphd>.

       Prediction from an HSSP file fails when residue lines with exclamation marks `!' are present.
           As a workaround, use 'optJury=normal' and 'both' like this:

            prof /tmp/1a3q.hssp fileRdb=/tmp/1a3q.hssp.profRdb optJury=normal both
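
       Whether the extra options are needed can be decided automatically with a coarse check, sketched
       here. The helper name prof_command_for is hypothetical, and the check simply looks for a `!'
       anywhere in the file rather than parsing residue lines, so it may add the options in cases where
       they are not strictly required.

```shell
#!/bin/sh
# Sketch only: print a prof command, adding the options from the example above
# when the input file contains an exclamation mark.
# Usage: prof_command_for INPUTFILE OUTPUTFILE
# prof_command_for is a hypothetical helper, not part of the package.
prof_command_for() {
    input=$1
    output=$2
    # coarse test: any '!' in the file, not only in residue lines
    if grep -q '!' "$input"; then
        printf 'prof %s fileRdb=%s optJury=normal both\n' "$input" "$output"
    else
        printf 'prof %s fileRdb=%s\n' "$input" "$output"
    fi
}

# Example call (prints the command instead of running it):
#   prof_command_for /tmp/1a3q.hssp /tmp/1a3q.hssp.profRdb
```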

SEE ALSO

       Main website
           <http://www.predictprotein.org/>

       Documentation
           <http://www.predictprotein.org/docs.php>

       Community website
           <http://groups.google.com/group/PredictProtein>

       FTP <ftp://rostlab.org/pub/cubic/downloads/prof>

       Newsgroups
           <http://groups.google.com/group/PredictProtein>

1.0.42                                             2017-11-16                                            PROF(1)