Provided by: dssp_3.0.0-2_amd64 

NAME
mkdssp - Calculate secondary structure for proteins in a PDB file
SYNOPSIS
mkdssp [OPTION] pdbfile [dsspfile]
DESCRIPTION
The mkdssp program was originally designed by Wolfgang Kabsch and Chris Sander to standardize secondary
structure assignment. DSSP is a database of secondary structure assignments (and much more) for all
protein entries in the Protein Data Bank (PDB) and mkdssp is the application that calculates the DSSP
entries from PDB entries. Please note that mkdssp does not predict secondary structure.
OPTIONS
If you invoke mkdssp with only one parameter, it will be interpreted as the PDB file to process and
output will be sent to stdout. If a second parameter is specified this is interpreted as the name of the
DSSP file to create. Both the input and the output file names may have either .gz or .bz2 as extension
resulting in the proper compression.
-i, --input filename
The file name of a PDB formatted file containing the protein structure data. This file may be a
file compressed by gzip or bzip2.
-o, --output filename
The file name of a DSSP file to create. If the filename ends in .gz or .bz2 a compressed file is
created.
-v, --verbose
Write out diagnositic information.
--version
Print the version number and exit.
-h, --help
Print the help message and exit. The directory containing the parser scripts for mrs.
THEORY
The DSSP program works by calculating the most likely secondary structure assignment given the 3D
structure of a protein. It does this by reading the position of the atoms in a protein (the ATOM records
in a PDB file) followed by calculation of the H-bond energy between all atoms. The best two H-bonds for
each atom are then used to determine the most likely class of secondary structure for each residue in the
protein.
This means you do need to have a full and valid 3D structure for a protein to be able to calculate the
secondary structure. There's no magic in DSSP, so e.g. it cannot guess the secondary structure for a
mutated protein for which you don't have the 3D structure.
DSSP FILE FORMAT
The header part of each DSSP file is self explaining, it contains some of the information copied over
from the PDB file and there are some statistics gathered while calculating the secondary structure.
The second half of the file contains the calculated secondary structure information per residue. What
follows is a brief explanation for each column.
Column Name Description
──────────────────────────────────────────────────────────────────────────────────────────────────────────
# The residue number as counted by mkdssp
RESIDUE The residue number as specified by the PDB file followed
by a chain identifier.
AA The one letter code for the amino acid. If this letter
is lower case this means this is a cysteine that form a
sulfur bridge with the other amino acid in this column
with the same lower case letter.
This is a complex column containing multiple sub
columns. The first column contains a letter indicating
the secondary structure assigned to this residue. Valid
values are:
Code Description
H Alpha Helix
B Beta Bridge
E Strand
G Helix-3
I Helix-5
T Turn
S Bend
What follows are three column indicating for each of the
three helix types (3, 4 and 5) whether this residue is a
candidate in forming this helix. A > character indicates
it starts a helix, a number indicates it is inside such
a helix and a < character means it ends the helix.
The next column contains a S character if this residue
is a possible bend.
Then there's a column indicating the chirality and this
can either be positive or negative (i.e. the alpha
torsion is either positive or negative).
The last two columns contain beta bridge labels. Lower
case here means parallel bridge and thus upper case
means anti parallel.
STRUCTURE
BP1 and BP2 The first and second bridge pair candidate, this is
followed by a letter indicating the sheet.
ACC The accessibility of this residue, this is the surface
area expressed in square Ångstrom that can be accessed
by a water molecule.
N-H-->O..O-->H-N Four columns, they give for each residue the H-bond
energy with another residue where the current residue is
either acceptor or donor. Each column contains two
numbers, the first is an offset from the current residue
to the partner residue in this H-bond (in DSSP
numbering), the second number is the calculated energy
for this H-bond.
TCO The cosine of the angle between C=O of the current
residue and C=O of previous residue. For alpha-helices,
TCO is near +1, for beta-sheets TCO is near -1. Not used
for structure definition.
Kappa The virtual bond angle (bend angle) defined by the three
C-alpha atoms of the residues current - 2, current and
current + 2. Used to define bend (structure code 'S').
PHI and PSI IUPAC peptide backbone torsion angles.
X-CA, Y-CA and Z-CA The C-alpha coordinates
HISTORY
The original DSSP application was written by Wolfgang Kabsch and Chris Sander in Pascal. This version is
a complete rewrite in C++ based on the original source code. A few bugs have been fixed since and the
algorithms have been tweaked here and there.
TODO
The code desperately needs an update. The first thing that needs implementing is the improved recognition
of pi-helices. A second improvement would be to use angle dependent H-bond energy calculation.
BUGS
If you find any, please let me know.
AUTHOR
Maarten L. Hekkelman (m.hekkelman (at) cmbi.ru.nl)
version 2.0.4 18-apr-2012 mkdssp(1)