
NAME

       mkdssp - Calculate secondary structure for proteins in a PDB file

SYNOPSIS

       mkdssp [OPTION] pdbfile [dsspfile]

DESCRIPTION

       The  mkdssp  program  was  originally  designed  by  Wolfgang  Kabsch  and Chris Sander to
       standardize secondary structure assignment.  DSSP is a  database  of  secondary  structure
       assignments  (and  much  more)  for all protein entries in the Protein Data Bank (PDB) and
       mkdssp is the application that calculates the DSSP entries from PDB entries.  Please  note
       that mkdssp does not predict secondary structure.

OPTIONS

       If you invoke mkdssp with only one parameter, it will be interpreted as the PDB file to
       process and output will be sent to stdout. If a second parameter is specified, it is
       interpreted as the name of the DSSP file to create. Both the input and the output file
       names may end in .gz or .bz2, in which case the file is read or written using the
       matching compression.

       -i, --input filename
              The file name of a PDB formatted file containing the protein structure  data.  This
              file may be a file compressed by gzip or bzip2.

       -o, --output filename
              The  file  name  of  a  DSSP  file to create. If the filename ends in .gz or .bz2 a
              compressed file is created.

       -v, --verbose
              Write out diagnostic information.

       --version
              Print the version number and exit.

       -h, --help
              Print the help message and exit.
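
       As an illustration of the options above, the following Python sketch builds an mkdssp
       command line and optionally runs it. The helper names mkdssp_command and run_mkdssp
       and the file names in the usage note are hypothetical; only the --input, --output and
       --verbose options listed above are used, and mkdssp is assumed to be on the PATH.

```python
import subprocess


def mkdssp_command(pdb_path, dssp_path=None, verbose=False):
    """Build an argument list for mkdssp (hypothetical helper).

    Only options documented in the OPTIONS section are used.
    """
    cmd = ["mkdssp"]
    if verbose:
        cmd.append("--verbose")
    cmd += ["--input", pdb_path]
    if dssp_path is not None:
        # Without an output file, mkdssp writes the DSSP data to stdout.
        cmd += ["--output", dssp_path]
    return cmd


def run_mkdssp(pdb_path, dssp_path=None, verbose=False):
    """Run mkdssp and return whatever it wrote to stdout as text."""
    result = subprocess.run(
        mkdssp_command(pdb_path, dssp_path, verbose),
        check=True,            # raise CalledProcessError on a non-zero exit
        capture_output=True,
        text=True,
    )
    return result.stdout
```

       For example, run_mkdssp("example.pdb.gz", "example.dssp.gz") would read a gzip
       compressed PDB file and write a gzip compressed DSSP file, while
       run_mkdssp("example.pdb") would return the DSSP text sent to stdout.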

THEORY

       The DSSP program works by calculating the most likely secondary structure assignment given
       the 3D structure of a protein. It does this by reading the position  of  the  atoms  in  a
       protein  (the  ATOM  records  in  a PDB file) followed by calculation of the H-bond energy
       between all atoms. The best two H-bonds for each atom are then used to determine the  most
       likely class of secondary structure for each residue in the protein.
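
       The H-bond energy step can be sketched as follows. This is a minimal illustration and
       not the mkdssp source code: it assumes the electrostatic model published by Kabsch and
       Sander, in which the energy is computed between a backbone N-H donor group and a
       backbone C=O acceptor group, with partial charges of 0.42e and 0.20e, a factor of 332
       and a cutoff of -0.5 kcal/mol. The function names are hypothetical.

```python
import math

# Constants assumed from the Kabsch and Sander model (not stated in this page).
Q1 = 0.42             # partial charge on C and O, in units of e
Q2 = 0.20             # partial charge on N and H, in units of e
F = 332.0             # conversion factor giving kcal/mol for distances in Angstrom
HBOND_CUTOFF = -0.5   # energies below this value count as an H-bond


def hbond_energy(n, h, c, o):
    """Electrostatic energy (kcal/mol) between a donor N-H and an acceptor C=O.

    Each argument is an (x, y, z) coordinate in Angstrom.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(n, h, c, o):
    """True when the energy is below the assumed cutoff."""
    return hbond_energy(n, h, c, o) < HBOND_CUTOFF


def best_two(energies):
    """Return the two lowest (most favourable) energies from a list."""
    return sorted(energies)[:2]
```

       A more negative energy means a stronger bond, which is why the two lowest values are
       the ones kept as the best two H-bonds.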

       This  means  you do need to have a full and valid 3D structure for a protein to be able to
       calculate the secondary structure.  There's no magic in DSSP, so e.g. it cannot guess  the
       secondary structure for a mutated protein for which you don't have the 3D structure.

DSSP FILE FORMAT

       The header part of each DSSP file is self-explanatory: it contains some of the information
       copied over from the PDB file, along with some statistics gathered while calculating the
       secondary structure.

       The  second  half  of the file contains the calculated secondary structure information per
       residue. What follows is a brief explanation for each column.

       Column Name                   Description
       ───────────────────────────────────────────────────────────────────────────────────────────
       #                             The residue number as counted by mkdssp
       RESIDUE                       The residue number as specified by the PDB  file
                                     followed by a chain identifier.

       AA                            The one-letter code for the amino acid. A lower
                                     case letter means this is a cysteine that forms
                                     a sulfur bridge with the other amino acid in
                                     this column that has the same lower case letter.
       STRUCTURE                     This is a complex column containing multiple sub
                                     columns.  The first  column  contains  a  letter
                                     indicating  the  secondary structure assigned to
                                     this residue. Valid values are:
                                             Code                    Description
                                               H                     Alpha Helix
                                               B                     Beta Bridge
                                               E                     Strand
                                               G                     Helix-3
                                               I                     Helix-5
                                               T                     Turn
                                               S                     Bend
                                     What follows are three columns indicating, for
                                     each of the three helix types (3, 4 and 5),
                                     whether this residue is a candidate in forming
                                     this helix. A > character indicates it starts a
                                     helix, a number indicates it is inside such a
                                     helix and a < character means it ends the helix.
                                     The next column contains an S character if this
                                     residue is a possible bend.
                                     Then there's a column indicating  the  chirality
                                     and  this  can  either  be  positive or negative
                                     (i.e. the alpha torsion is  either  positive  or
                                     negative).
                                     The last two columns contain beta bridge labels.
                                     Lower case here means parallel bridge  and  thus
                                     upper case means anti parallel.
       BP1 and BP2                   The first and second bridge pair candidates,
                                     followed by a letter indicating the sheet.
       ACC                           The accessibility of this residue: the surface
                                     area, expressed in square Ångström, that can be
                                     accessed by a water molecule.
       N-H-->O..O-->H-N              Four columns giving, for each residue, the
                                     H-bond energy with another residue where the
                                     current residue is either acceptor or donor.
                                     Each column contains two numbers: the first is
                                     an offset from the current residue to the
                                     partner residue in this H-bond (in DSSP
                                     numbering), the second is the calculated
                                     energy for this H-bond.
       TCO                           The  cosine  of  the  angle  between  C=O of the
                                     current residue and C=O of previous residue. For
                                     alpha-helices,  TCO  is near +1, for beta-sheets
                                     TCO  is  near  -1.  Not   used   for   structure
                                     definition.
       Kappa                         The  virtual  bond angle (bend angle) defined by
                                     the three C-alpha atoms of the residues  current
                                     -  2,  current  and  current + 2. Used to define
                                     bend (structure code 'S').
       PHI and PSI                   IUPAC peptide backbone torsion angles.
       X-CA, Y-CA and Z-CA           The C-alpha coordinates
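
       The two derived geometric columns, TCO and Kappa, can be illustrated with a short
       Python sketch based on the descriptions in the table above. It is not taken from the
       mkdssp source. The function names are hypothetical, and the sketch assumes that Kappa
       is measured as the deviation from a straight line, so that three collinear C-alpha
       atoms give 0 degrees.

```python
import math


def _sub(a, b):
    """Vector a - b for (x, y, z) tuples."""
    return tuple(x - y for x, y in zip(a, b))


def _cos_angle(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def tco(c_cur, o_cur, c_prev, o_prev):
    """Cosine of the angle between C=O of the current and the previous residue."""
    return _cos_angle(_sub(o_cur, c_cur), _sub(o_prev, c_prev))


def kappa(ca_minus2, ca_cur, ca_plus2):
    """Virtual bond angle in degrees for C-alpha atoms i-2, i and i+2.

    Assumed convention: the angle between the vectors CA(i) - CA(i-2) and
    CA(i+2) - CA(i), so a straight chain gives 0.
    """
    cos_k = _cos_angle(_sub(ca_cur, ca_minus2), _sub(ca_plus2, ca_cur))
    cos_k = max(-1.0, min(1.0, cos_k))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_k))
```

       Both helpers take plain (x, y, z) tuples, such as the X-CA, Y-CA and Z-CA values of
       the relevant residues for kappa.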

HISTORY

       The original DSSP application was written by Wolfgang Kabsch and Chris Sander  in  Pascal.
       This  version  is  a complete rewrite in C++ based on the original source code. A few bugs
       have been fixed since and the algorithms have been tweaked here and there.

TODO

       The code desperately needs an update. The first  thing  that  needs  implementing  is  the
       improved  recognition  of pi-helices. A second improvement would be to use angle dependent
       H-bond energy calculation.

BUGS

       If you find any, please let me know.

AUTHOR

       Maarten L. Hekkelman (m.hekkelman (at) cmbi.ru.nl)