NAME

       mkdssp - Calculate secondary structure for proteins in a PDB file

SYNOPSIS

       mkdssp [OPTION] pdbfile [dsspfile]

DESCRIPTION

       The  mkdssp  program was originally designed by Wolfgang Kabsch and Chris Sander to standardize secondary
       structure assignment.  DSSP is a database of secondary structure assignments  (and  much  more)  for  all
       protein  entries  in  the  Protein Data Bank (PDB) and mkdssp is the application that calculates the DSSP
       entries from PDB entries.  Please note that mkdssp does not predict secondary structure.

OPTIONS

       If you invoke mkdssp with only one parameter, it is interpreted as the PDB file to process and the
       output is sent to stdout. If a second parameter is specified, it is interpreted as the name of the
       DSSP file to create. Both the input and the output file names may have either .gz or .bz2 as
       extension, in which case the file is read or written using the corresponding compression.
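
       Downstream scripts that consume mkdssp output may want to follow the same extension convention.
       The following is a minimal Python sketch, not part of mkdssp; the helper name open_text is
       hypothetical and the choice of opener is based only on the file extension, as described above.

```python
import bz2
import gzip
from pathlib import Path


def open_text(path, mode="rt"):
    """Open a plain, gzip- or bzip2-compressed file in text mode.

    Hypothetical helper: the opener is picked from the file extension
    only (.gz or .bz2); the file contents are not inspected.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".gz":
        return gzip.open(path, mode)
    if suffix == ".bz2":
        return bz2.open(path, mode)
    return open(path, mode)
```

       With such a helper, a script can read a DSSP file the same way regardless of whether it was
       written with a plain, .gz or .bz2 file name.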

       -i, --input filename
              The file name of a PDB formatted file containing the protein structure data. This file may
              be compressed with gzip or bzip2.

       -o, --output filename
              The file name of a DSSP file to create. If the filename ends in .gz or .bz2 a compressed  file  is
              created.

       -v, --verbose
              Write out diagnostic information.

       --version
              Print the version number and exit.

       -h, --help
              Print the help message and exit.

THEORY

       The  DSSP  program  works  by  calculating  the  most  likely secondary structure assignment given the 3D
       structure of a protein. It does this by reading the position of the atoms in a protein (the ATOM  records
       in  a  PDB file) followed by calculation of the H-bond energy between all atoms. The best two H-bonds for
       each atom are then used to determine the most likely class of secondary structure for each residue in the
       protein.

       This means you do need to have a full and valid 3D structure for a protein to be able  to  calculate  the
       secondary  structure.   There's  no  magic in DSSP, so e.g. it cannot guess the secondary structure for a
       mutated protein for which you don't have the 3D structure.
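
       As an illustration of the H-bond energy step, here is a minimal Python sketch of the electrostatic
       model published by Kabsch and Sander. The constants (partial charges 0.42 and 0.20, factor 332,
       cutoff -0.5 kcal/mol) come from that publication, not from this manual page, and the function
       names are hypothetical. It is not the mkdssp source code, which may differ in details.

```python
import math

# Constants as given in the Kabsch and Sander publication (assumption:
# this page does not state them and mkdssp may handle edge cases differently).
Q1 = 0.42            # partial charge on C=O, in units of the electron charge
Q2 = 0.20            # partial charge on N-H, in units of the electron charge
F = 332.0            # dimensional factor, yields kcal/mol for distances in angstrom
HBOND_CUTOFF = -0.5  # kcal/mol; lower energies count as an H-bond


def hbond_energy(n, h, c, o):
    """Electrostatic energy between a donor N-H and an acceptor C=O group.

    Each argument is an (x, y, z) coordinate tuple in angstrom.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(energy):
    """True if the energy is below the cutoff, i.e. counts as an H-bond."""
    return energy < HBOND_CUTOFF
```

       In this model a donor and an acceptor group that point at each other at a typical H-bond distance
       give a clearly negative energy, while groups that are far apart give an energy close to zero.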

DSSP FILE FORMAT

       The header part of each DSSP file is self-explanatory: it contains some of the information copied
       over from the PDB file and some statistics gathered while calculating the secondary structure.

       The  second  half  of  the file contains the calculated secondary structure information per residue. What
       follows is a brief explanation for each column.
       Column Name                       Description
       ──────────────────────────────────────────────────────────────────────────────────────────────────────────
       #                                 The residue number as counted by mkdssp
       RESIDUE                           The residue number as specified by the PDB file followed
                                         by a chain identifier.
       AA                                The one letter code for the amino acid. If this letter
                                         is lower case, the residue is a cysteine that forms a
                                         disulfide bridge with the other residue in this column
                                         that has the same lower case letter.
       STRUCTURE                         This is a complex column containing multiple sub-columns.
                                         The first sub-column contains a letter indicating the
                                         secondary structure assigned to this residue. Valid
                                         values are:
                                                   Code                         Description
                                                    H                           Alpha Helix
                                                    B                           Beta Bridge
                                                    E                           Strand
                                                    G                           Helix-3
                                                    I                           Helix-5
                                                    T                           Turn
                                                    S                           Bend
                                         What follows are three sub-columns indicating, for each
                                         of the three helix types (3, 4 and 5), whether this
                                         residue is a candidate for forming this helix. A >
                                         character indicates it starts a helix, a number
                                         indicates it is inside such a helix and a < character
                                         means it ends the helix.
                                         The next sub-column contains an S character if this
                                         residue is a possible bend.
                                         Then there is a sub-column indicating the chirality,
                                         which can be either positive or negative (i.e. the alpha
                                         torsion is either positive or negative).
                                         The last two sub-columns contain beta bridge labels.
                                         Lower case means a parallel bridge and upper case means
                                         an anti-parallel bridge.
       BP1 and BP2                       The first and second bridge pair candidates, followed by
                                         a letter indicating the sheet.
       ACC                               The accessibility of this residue: the surface area,
                                         expressed in square ångströms, that can be accessed by a
                                         water molecule.
       N-H-->O..O-->H-N                  Four columns giving, for each residue, the H-bond energy
                                         with another residue where the current residue is either
                                         acceptor or donor. Each column contains two numbers: the
                                         first is an offset from the current residue to the
                                         partner residue in this H-bond (in DSSP numbering), the
                                         second is the calculated energy for this H-bond.
       TCO                               The  cosine  of  the  angle  between  C=O of the current
                                         residue and C=O of previous residue. For  alpha-helices,
                                         TCO is near +1, for beta-sheets TCO is near -1. Not used
                                         for structure definition.
       Kappa                             The virtual bond angle (bend angle) defined by the three
                                         C-alpha  atoms  of the residues current - 2, current and
                                         current + 2. Used to define bend (structure code 'S').
       PHI and PSI                       IUPAC peptide backbone torsion angles.
       X-CA, Y-CA and Z-CA               The C-alpha coordinates
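
       The TCO and Kappa columns are simple geometric quantities. The following minimal Python sketch
       shows how they could be computed from atom coordinates. The function names are hypothetical, the
       convention that a straight chain gives a Kappa of 0 degrees is an assumption, and the 70 degree
       bend threshold is taken from the Kabsch and Sander publication rather than from this page.

```python
import math

# Assumption: bend threshold from the Kabsch and Sander publication,
# not stated on this manual page.
BEND_THRESHOLD_DEG = 70.0


def _sub(a, b):
    """Component-wise difference a - b of two coordinate tuples."""
    return tuple(x - y for x, y in zip(a, b))


def _cos_angle(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def tco(c_prev, o_prev, c_cur, o_cur):
    """Cosine of the angle between C=O of the current and the previous residue."""
    return _cos_angle(_sub(o_cur, c_cur), _sub(o_prev, c_prev))


def kappa(ca_minus2, ca_cur, ca_plus2):
    """Bend angle in degrees defined by the C-alpha atoms i-2, i and i+2.

    Assumed convention: the angle between the direction CA(i-2) -> CA(i)
    and the direction CA(i) -> CA(i+2), so a straight chain gives 0.
    """
    cos_k = _cos_angle(_sub(ca_cur, ca_minus2), _sub(ca_plus2, ca_cur))
    cos_k = max(-1.0, min(1.0, cos_k))  # guard against rounding errors
    return math.degrees(math.acos(cos_k))


def is_bend(kappa_deg):
    """True if the bend angle exceeds the assumed threshold."""
    return kappa_deg > BEND_THRESHOLD_DEG
```

       Two parallel C=O groups give a TCO of +1 and two opposite ones give -1, matching the values
       described for alpha-helices and beta-sheets in the table above.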

HISTORY

       The original DSSP application was written by Wolfgang Kabsch and Chris Sander in Pascal. This version  is
       a  complete  rewrite  in  C++ based on the original source code. A few bugs have been fixed since and the
       algorithms have been tweaked here and there.

TODO

       The code desperately needs an update. The first thing that needs implementing is the improved
       recognition of pi-helices. A second improvement would be to use an angle-dependent H-bond energy
       calculation.

BUGS

       If you find any, please let me know.

AUTHOR

       Maarten L. Hekkelman (m.hekkelman (at) cmbi.ru.nl)

version 2.0.4                                      18-apr-2012                                         mkdssp(1)