Ubuntu Manpage: RNApvmin - manual page for RNApvmin 2.4.17

Provided by: vienna-rna_2.4.17+dfsg-2build2_amd64

NAME

       RNApvmin - manual page for RNApvmin 2.4.17

SYNOPSIS

       RNApvmin [options] <file.shape>

DESCRIPTION

RNApvmin 2.4.17

Calculate a perturbation vector that minimizes discrepancies between predicted and
observed pairing probabilities

The program reads an RNA sequence from stdin and uses an iterative minimization process to
calculate a perturbation vector that minimizes the discrepancies between predicted pairing
probabilities and observed pairing probabilities (deduced from given SHAPE reactivities).
Experimental data is read from a given SHAPE file and normalized to pairing probabilities.
The experimental data has to be provided in a multiline plain text file where each line
has the format '[position] [nucleotide] [absolute shape reactivity]' (e.g. '3 A 0.7'). The
objective function used for the minimization may be weighted by choosing appropriate
values for sigma and tau.
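
The record format above can be illustrated with a small reader. This is a minimal sketch,
not part of RNApvmin: the function name parse_shape_lines is hypothetical, and accepting
two-field lines relies on the note in EXAMPLES that the nucleotide column is optional.

```python
from typing import Dict, Iterable, Optional, Tuple


def parse_shape_lines(lines: Iterable[str]) -> Dict[int, Tuple[Optional[str], float]]:
    """Map each 1-based position to (nucleotide or None, reactivity)."""
    records: Dict[int, Tuple[Optional[str], float]] = {}
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) == 3:
            position, nucleotide, reactivity = fields
        elif len(fields) == 2:
            # Assumption: the nucleotide column may be left out.
            position, reactivity = fields
            nucleotide = None
        else:
            raise ValueError(f"unexpected record: {raw!r}")
        records[int(position)] = (
            nucleotide.upper() if nucleotide else None,
            float(reactivity),
        )
    return records
```

For example, parse_shape_lines(["3 A 0.7"]) yields a single record for position 3.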

The minimization progress will be written to stderr. Once the minimization has terminated,
the obtained perturbation vector is written to stdout.

-h, --help
Print help and exit

--detailed-help
Print help, including all details and hidden options, and exit

--full-help
Print help, including hidden options, and exit

-V, --version
Print version and exit

General Options:
Below are command line options which alter the general behavior of this program

-j, --numThreads=INT
Set the number of threads used for calculations.

--shapeConversion=STRING
Specify the method used to convert SHAPE reactivities to pairing probabilities.
(default=`O')

The following methods can be used to convert SHAPE reactivities into the
probability for a certain nucleotide to be unpaired.

'M': Use linear mapping according to Zarringhalam et al. 2012

'C': Use a cutoff-approach to divide into paired and unpaired nucleotides (e.g.
"C0.25")

'S': Skip the normalizing step since the input data already represents
probabilities for being unpaired rather than raw reactivity values

'L': Use a linear model to convert the reactivity into a probability for being
unpaired (e.g. "Ls0.68i0.2" to use a slope of 0.68 and an intercept of 0.2)

'O': Use a linear model to convert the log of the reactivity into a probability for
being unpaired (e.g. "Os1.6i-2.29" to use a slope of 1.6 and an intercept of -2.29)
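
The method strings shown above combine a one-letter method with optional numeric
parameters. The following sketch shows how such a string might be decomposed. The grammar
is inferred only from the examples given here ("C0.25", "Ls0.68i0.2", "Os1.6i-2.29"), and
the function name parse_shape_conversion is hypothetical, so it may not match the parser
used by the program.

```python
import re
from typing import Dict

# A signed decimal number, optionally with an exponent.
_NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"


def parse_shape_conversion(spec: str) -> Dict[str, object]:
    """Split a conversion string into its method letter and parameters."""
    if not spec or spec[0] not in "MCSLO":
        raise ValueError(f"unknown conversion method: {spec!r}")
    method, rest = spec[0], spec[1:]
    params: Dict[str, float] = {}
    if method == "C" and rest:
        # Assumption: the text after 'C' is the cutoff value.
        params["cutoff"] = float(rest)
    elif method in ("L", "O"):
        # Assumption: 's' introduces the slope and 'i' the intercept.
        for key, value in re.findall(rf"([si])({_NUMBER})", rest):
            params["slope" if key == "s" else "intercept"] = float(value)
    return {"method": method, "params": params}
```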

--tauSigmaRatio=DOUBLE
Ratio of the weighting factors tau and sigma. (default=`1.0')

A high ratio will lead to a solution as close as possible to the experimental data,
while a low ratio will lead to results close to the thermodynamic prediction
without guiding pseudo energies.

--objectiveFunction=INT
The energies of the perturbation vector and the discrepancies between predicted and
observed pairing probabilities contribute to the objective function. This parameter
defines which function is used to process the contributions before summing them
up: 0 = square, 1 = absolute. (default=`0')
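
The structure of such an objective function can be sketched as follows. This is an
illustration under assumptions, not the program's code: weighting the perturbation term by
1/tau^2 and the discrepancy term by 1/sigma^2 follows the general form in Washietl et al.
(2012), and how RNApvmin applies the weights, in particular for the absolute variant, is
not stated in this manual. The function name objective is hypothetical.

```python
from typing import Sequence


def objective(
    epsilon: Sequence[float],
    p_predicted: Sequence[float],
    p_observed: Sequence[float],
    sigma: float = 1.0,
    tau: float = 1.0,
    mode: int = 0,
) -> float:
    """Sum the processed perturbation energies and probability discrepancies."""
    if mode == 0:
        process = lambda value: value * value  # 0 = square
    elif mode == 1:
        process = abs  # 1 = absolute
    else:
        raise ValueError("mode must be 0 (square) or 1 (absolute)")
    # Assumed weighting: perturbations by 1/tau^2, discrepancies by 1/sigma^2.
    perturbation_term = sum(process(e) for e in epsilon) / tau ** 2
    discrepancy_term = (
        sum(process(p - q) for p, q in zip(p_predicted, p_observed)) / sigma ** 2
    )
    return perturbation_term + discrepancy_term
```

Under this weighting, a larger tau relative to sigma makes perturbations cheaper, which is
consistent with the description of --tauSigmaRatio.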

--sampleSize=INT
The iterative minimization process requires evaluating the gradient of the
objective function. (default=`1000')

A sample size of 0 leads to an analytical evaluation which scales as O(N^4).
Choosing a sample size >0 estimates the gradient by sampling the given number of
structures from the ensemble, which is much faster.

-N, --nonRedundant
Enable non-redundant sampling strategy. (default=off)

--intermediatePath=STRING
Write an output file for each iteration of the minimization process.

Each file contains the used perturbation vector and the score of the objective
function. The number of the iteration will be appended to the given path.

--initialVector=DOUBLE
Specify the vector of initial perturbations. (default=`0')

Defines the initial perturbation vector which will be used as the starting vector for
the minimization process. The value 0 results in a null vector. Every other value x
will be used to populate the initial vector with random numbers from the interval
[-x,x].
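
The rule above can be sketched in a few lines. The uniform distribution, the optional
seed, and the function name initial_vector are assumptions made for illustration; the
manual only states that random numbers from [-x,x] are used.

```python
import random
from typing import List, Optional


def initial_vector(length: int, x: float, seed: Optional[int] = None) -> List[float]:
    """Return a null vector for x == 0, otherwise random values in [-x, x]."""
    if x == 0:
        return [0.0] * length
    rng = random.Random(seed)  # a seed makes the sketch reproducible
    bound = abs(x)
    # Assumption: values are drawn uniformly from the interval.
    return [rng.uniform(-bound, bound) for _ in range(length)]
```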

--minimizer=ENUM
Set the minimizing algorithm used for finding an appropriate perturbation vector.
(possible values="conjugate_fr", "conjugate_pr", "vector_bfgs", "vector_bfgs2",
"steepest_descent", "default" default=`default')

The default option uses a custom implementation of the gradient descent algorithm,
while all other options represent various algorithms implemented in the GNU
Scientific Library. When the GNU Scientific Library cannot be found, only the
default minimizer is available.

--initialStepSize=DOUBLE
The initial stepsize for the minimizer methods. (default=`0.01')

--minStepSize=DOUBLE
The minimal stepsize for the minimizer methods. (default=`1e-15')

--minImprovement=DOUBLE
The minimal improvement in the default minimizer method that has to be surpassed for
a new result to be considered a better one. (default=`1e-3')

--minimizerTolerance=DOUBLE
The tolerance to be used in the GSL minimizer methods. (default=`1e-3')
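
To illustrate how an initial step size, a minimal step size, and a minimal improvement can
interact, here is a generic gradient descent sketch. It is not the program's default
minimizer: halving the step on failure, measuring improvement relative to the current
score, the max_iterations cap, and all names are assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def gradient_descent(
    objective: Callable[[Vector], float],
    gradient: Callable[[Vector], Vector],
    start: Sequence[float],
    initial_step: float = 0.01,
    min_step: float = 1e-15,
    min_improvement: float = 1e-3,
    max_iterations: int = 1000,
) -> Tuple[Vector, float]:
    """Follow the negative gradient, shrinking the step until a move helps."""
    x = list(start)
    score = objective(x)
    for _ in range(max_iterations):
        grad = gradient(x)
        step = initial_step
        accepted = False
        while step >= min_step:
            candidate = [xi - step * gi for xi, gi in zip(x, grad)]
            candidate_score = objective(candidate)
            # Assumption: improvement is judged relative to the current score.
            if score - candidate_score > min_improvement * abs(score):
                x, score, accepted = candidate, candidate_score, True
                break
            step /= 2.0  # assumption: halve the step after a failed attempt
        if not accepted:
            break  # no step of at least min_step improved the score enough
    return x, score
```

With a simple quadratic such as f(x) = sum((x_i - 1)^2), the sketch converges towards the
vector of ones.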

Model Details:
-S, --pfScale=DOUBLE
Set scaling factor for Boltzmann factors to prevent under/overflows.

In the calculation of the partition function (pf), scale*mfe is used as an estimate
for the ensemble free energy (to avoid overflows). The default is 1.07; useful values
are 1.0 to 1.2. Occasionally needed for long sequences. You can also recompile the
program to use double precision (see the README file).

-T, --temp=DOUBLE
Rescale energy parameters to a temperature in degrees centigrade. (default=`37.0')

-4, --noTetra
Do not include special tabulated stabilizing energies for tri-, tetra- and hexaloop
hairpins. (default=off)

Mostly for testing.

-d, --dangles=INT
Specify "dangling end" model for bases adjacent to helices in free ends and
multi-loops. (default=`2')

With -d1, only unpaired bases can participate in dangling ends, and each in at most
one. With -d2 this check is ignored and dangling energies are added for the bases
adjacent to a helix on both sides in any case; this is the default for mfe and
partition function folding (-p). The option -d0 ignores dangling ends altogether
(mostly for debugging). With -d3, mfe folding will allow coaxial stacking of adjacent
helices in multi-loops. At the moment the implementation will not allow coaxial
stacking of the two interior pairs in a loop of degree 3 and works only for mfe
folding.

Note that with -d1 and -d3 only the MFE computations will use this setting, while the
partition function uses the -d2 setting, i.e. dangling ends will be treated
differently.

--noLP
Produce structures without lonely pairs (helices of length 1). (default=off)

For partition function folding this only disallows pairs that can only occur
isolated. Other pairs may still occasionally occur as helices of length 1.

--noGU
Do not allow GU pairs. (default=off)

--noClosingGU
Do not allow GU pairs at the end of helices. (default=off)

-P, --paramFile=paramfile
Read energy parameters from paramfile, instead of using the default parameter set.

Different sets of energy parameters for RNA and DNA should accompany your
distribution. See the RNAlib documentation for details on the file format. When
passing the placeholder file name "DNA", DNA parameters are loaded without the need
to actually specify any input file.

--nsp=STRING
Allow other pairs in addition to the usual AU, GC, and GU pairs.

Its argument is a comma-separated list of additionally allowed pairs. If the first
character is a "-" then AB will imply that AB and BA are allowed pairs, e.g.
--nsp=-GA will allow GA and AG pairs. Nonstandard pairs are given 0
stacking energy.

-e, --energyModel=INT
Set energy model.

Rarely used option to fold sequences from the artificial ABCD... alphabet, where A
pairs B, C-D etc. Use the energy parameters for GC (-e 1) or AU (-e 2) pairs.

--maxBPspan=INT
Set the maximum base pair span. (default=`-1')

REFERENCES

       If you use this program in your work you might want to cite:

       R. Lorenz, S.H. Bernhart, C. Hoener zu Siederdissen, H. Tafer, C. Flamm, P.F. Stadler and
       I.L. Hofacker (2011), "ViennaRNA Package 2.0", Algorithms for Molecular Biology: 6:26

       I.L. Hofacker, W. Fontana, P.F. Stadler, S. Bonhoeffer, M. Tacker, P. Schuster (1994),
       "Fast Folding and Comparison of RNA Secondary Structures", Monatshefte f. Chemie: 125, pp
       167-188

       R. Lorenz, I.L. Hofacker, P.F. Stadler (2016), "RNA folding with hard and soft
       constraints", Algorithms for Molecular Biology 11:1 pp 1-13

       S. Washietl, I.L. Hofacker, P.F. Stadler, M. Kellis (2012), "RNA folding with soft
       constraints: reconciliation of probing data and thermodynamic secondary structure
       prediction", Nucl Acids Res: 40(10), pp 4261-4272

       The energy parameters are taken from:

       D.H. Mathews, M.D. Disney, J.L. Childs, S.J. Schroeder, M. Zuker, D.H. Turner (2004),
       "Incorporating chemical modification constraints into a dynamic programming algorithm
       for prediction of RNA secondary structure", Proc. Natl. Acad. Sci. USA: 101, pp
       7287-7292

       D.H. Turner, D.H. Mathews (2009), "NNDB: The nearest neighbor parameter database for
       predicting stability of nucleic acid secondary structure", Nucleic Acids Research: 38, pp
       280-282

EXAMPLES

       RNApvmin accepts a SHAPE file and a corresponding nucleotide sequence, which is read from
       stdin.

         RNApvmin sequence.shape < sequence.fasta > sequence.pv

       The normalized SHAPE reactivity data has to be stored in a text file, where each line
       contains the position and the reactivity for a certain nucleotide ([position] [nucleotide]
       [SHAPE reactivity]).

         1 A 1.286
         2 U 0.383
         3 C 0.033
         4 C 0.017
         ...
         ...
         98 U 0.234
         99 G 0.885

       The nucleotide information in the SHAPE file is optional and will be used to cross-check
       the given input sequence if present. If SHAPE reactivities could not be determined for
       every nucleotide, missing values can simply be omitted.
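
       A file in this format could be produced with a short helper such as the one below.
       This is a minimal sketch: the function name format_shape_lines, the use of None to mark
       a missing reactivity, and the number formatting are assumptions for illustration.

```python
from typing import List, Optional, Sequence


def format_shape_lines(
    sequence: str, reactivities: Sequence[Optional[float]]
) -> List[str]:
    """Build '[position] [nucleotide] [reactivity]' lines, skipping missing values."""
    if len(sequence) != len(reactivities):
        raise ValueError("sequence and reactivities must have the same length")
    lines = []
    for position, (nucleotide, value) in enumerate(zip(sequence, reactivities), start=1):
        if value is None:
            continue  # undetermined reactivities are simply omitted
        lines.append(f"{position} {nucleotide} {value:g}")
    return lines
```

       Writing "\n".join(lines) to a file then gives an input suitable for the first example.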

       The progress of the minimization will be printed to stderr. Once a solution has been
       found, the calculated perturbation vector will be printed to stdout and can then be used
       to constrain RNAfold's MFE/partition function calculation by applying the perturbation
       energies as soft constraints.

         RNAfold --shape=sequence.pv --shapeMethod=W < sequence.fasta
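
       The two commands above can also be driven from a script. The following sketch builds
       the argument lists using only options documented in this manual and runs them with
       Python's subprocess module. The function names are hypothetical, and run_pipeline
       assumes that RNApvmin and RNAfold are installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def rnapvmin_command(
    shape_file: str,
    tau_sigma_ratio: Optional[float] = None,
    sample_size: Optional[int] = None,
) -> List[str]:
    """Assemble an RNApvmin argument list from documented options."""
    argv = ["RNApvmin"]
    if tau_sigma_ratio is not None:
        argv.append(f"--tauSigmaRatio={tau_sigma_ratio}")
    if sample_size is not None:
        argv.append(f"--sampleSize={sample_size}")
    argv.append(shape_file)
    return argv


def rnafold_command(pv_file: str) -> List[str]:
    """Assemble the RNAfold call that applies a perturbation vector."""
    return ["RNAfold", f"--shape={pv_file}", "--shapeMethod=W"]


def run_pipeline(fasta: str, shape_file: str, pv_file: str) -> None:
    """Compute the perturbation vector, then fold with it as soft constraints."""
    # The sequence is read from stdin; the vector is written to stdout.
    with open(fasta) as sequence, open(pv_file, "w") as vector:
        subprocess.run(
            rnapvmin_command(shape_file), stdin=sequence, stdout=vector, check=True
        )
    with open(fasta) as sequence:
        subprocess.run(rnafold_command(pv_file), stdin=sequence, check=True)
```

       Calling run_pipeline("sequence.fasta", "sequence.shape", "sequence.pv") corresponds to
       the two shell commands shown in this section.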

AUTHOR

       Dominik Luntzer, Ronny Lorenz

REPORTING BUGS

       If in doubt, our program is right, nature is at fault. Comments should be sent to
       rna@tbi.univie.ac.at.