Provided by: vienna-rna_2.5.1+dfsg-1_amd64

NAME

       RNAeval - manual page for RNAeval 2.5.1

SYNOPSIS

       RNAeval [OPTIONS] [<input0>] [<input1>]...

DESCRIPTION

       RNAeval 2.5.1

       Determine  the  free energy of a (consensus) secondary structure for (an alignment of) RNA
       sequence(s)

       Evaluates the free energy of a particular (consensus) secondary structure for (an
       alignment of) RNA molecule(s). The energy is reported in kcal/mol and, for multiple
       sequence alignments (--msa option) and their consensus structures, includes a covariance
       pseudo-energy term. The program will continue to read new sequences and structures
       until a line consisting of the single character  "@"  or  an  end  of  file  condition  is
       encountered.   If the input sequence or structure contains the separator character "&" the
       program calculates the energy of the co-folding of two RNA strands, where  the  "&"  marks
       the boundary between the two strands.
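
       The input handling described above can be sketched in Python. This is only an
       illustration, not RNAeval's own parser: it assumes each record is a sequence line
       followed by a structure line, and it simply skips blank lines and FASTA headers. The
       names read_records and split_strands are hypothetical.

```python
def read_records(lines):
    """Yield (sequence, structure) pairs until '@' or end of input."""
    pending = None
    for raw in lines:
        line = raw.strip()
        if line == "@":
            break  # a line holding only "@" ends the input
        if not line or line.startswith(">"):
            continue  # assumption: ignore blank lines and FASTA headers
        if pending is None:
            pending = line  # first line of a record: the sequence
        else:
            yield pending, line  # second line of a record: the structure
            pending = None


def split_strands(text):
    """Split a sequence or structure at the '&' strand separator."""
    return text.split("&")


records = list(read_records([
    ">example",
    "GGGAAACCC",
    "(((...)))",
    "GGGG&CCCC",
    "((((&))))",
    "@",
    "AAAA",
    "....",
]))
```

       In this sketch the record after the "@" line is never read, and split_strands applied
       to "GGGG&CCCC" gives the two strands "GGGG" and "CCCC".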

       -h, --help
              Print help and exit

       --detailed-help
              Print help, including all details and hidden options, and exit

       --full-help
              Print help, including hidden options, and exit

       -V, --version
              Print version and exit

   General Options:
              Below are command line options which alter the general behavior of this program

       --noconv
              Do not automatically substitute nucleotide "T" with "U"

              (default=off)

       -v, --verbose
              Print out energy contribution of each loop in the structure.

              (default=off)

       -j, --jobs[=number]
              Split batch input into jobs and start processing in parallel using multiple
              threads. A value of 0 means to use as many parallel threads as there are
              computation cores available.

              (default=`0')

              Default processing of input data is performed in a serial fashion, i.e. one
              sequence at a time. Using this switch, a user can instead start the computation for
              many sequences in the input in parallel. RNAeval will create as many parallel
              computation slots as specified and assign input sequences of the input file(s) to
              the available slots. Note that this increases memory consumption, since input
              alignments have to be kept in memory until an empty compute slot is available and
              each running job requires its own dynamic programming matrices.

       --unordered
              Do  not  try  to  keep  output  in order with input while parallel processing is in
              place.

              (default=off)

              When parallel input processing (--jobs flag) is enabled, the order in which input
              is processed depends on the host machine's job scheduler. Therefore, any output to
              stdout or files generated by this program will most likely not follow the order of
              the corresponding input data set. The default of RNAeval is to use a specialized
              data structure to still keep the results output in order with the input data.
              However, this comes with a trade-off in terms of memory consumption, since all
              output must be kept in memory for as long as no chunks of consecutive, ordered
              output are available. By setting this flag, RNAeval will not buffer individual
              results but print them as soon as they have been computed.
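
              The memory trade-off of ordered output can be illustrated with a small reorder
              buffer in Python. This is a sketch of the general technique only; the class
              OrderedWriter is hypothetical and is not the data structure RNAeval uses
              internally.

```python
class OrderedWriter:
    """Emit results in input order, buffering any that arrive early."""

    def __init__(self, emit):
        self.emit = emit      # callback that receives each result
        self.next_index = 0   # index of the next result allowed out
        self.pending = {}     # results that finished out of order

    def add(self, index, result):
        self.pending[index] = result
        # Flush the longest run of consecutive results now available.
        while self.next_index in self.pending:
            self.emit(self.pending.pop(self.next_index))
            self.next_index += 1


out = []
writer = OrderedWriter(out.append)
# Jobs finish in scheduler order, not in input order.
for index, result in [(2, "c"), (0, "a"), (1, "b"), (3, "d")]:
    writer.add(index, result)
```

              Result 2 has to wait in memory until results 0 and 1 have arrived. An unordered
              writer would call emit immediately and keep nothing buffered.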

       -i, --infile=<filename>
              Read a file instead of reading from stdin

              The default behavior of RNAeval is to read input from stdin  or  the  file(s)  that
              follow(s) the RNAeval command. Using this parameter, the user can specify input file
              names from which data is read. Note that any additional files supplied to RNAeval
              are still processed as well.

       -a, --msa
              Input is multiple sequence alignment in Stockholm 1.0 format

              (default=off)

              Using  this  flag  indicates  that the input is a multiple sequence alignment (MSA)
              instead of (a) single sequence(s). Note that only the Stockholm format allows one to
              specify a consensus structure. Therefore, this is the only supported MSA format for
              now!

       --auto-id
              Automatically generate an ID for each sequence.  (default=off)

              The default mode of RNAeval is to automatically determine  an  ID  from  the  input
              sequence data if the input file format allows it. Sequence IDs are usually
              given in the FASTA header of input sequences.  If  this  flag  is  active,  RNAeval
              ignores any IDs retrieved from the input and automatically generates an ID for each
              sequence. This ID consists of a prefix and an increasing number. This flag can also
              be used to add a FASTA header to the output even if the input has none.

       --id-prefix=prefix
              Prefix for automatically generated IDs (as used in output file names)

              (default=`sequence')

              If this parameter is set, each sequence ID will be prefixed with the provided string.
              Note: Setting this parameter implies --auto-id.

       --id-delim=delimiter
              Change the  delimiter  between  prefix  and  increasing  number  for  automatically
              generated IDs (as used in output file names)

              (default=`_')

              This parameter can be used to change the default delimiter "_" between the prefix
              string and the increasing number for automatically generated IDs.

       --id-digits=INT
              Specify  the  number  of digits of the counter in automatically generated alignment
              IDs.

              (default=`4')

              When alignment IDs are automatically generated, they receive an increasing number,
              starting with 1. This number will always be left-padded by leading zeros, such that
              the number takes up a certain width. Using this parameter, the width can be
              specified to the user's needs. We allow numbers in the range [1:18]. This option
              implies --auto-id.

       --id-start=LONG
              Specify the first number in automatically generated alignment IDs.

              (default=`1')

              When sequence IDs are automatically generated, they receive an  increasing  number,
              usually starting with 1. Using this parameter, the first number can be specified to
              the user's requirements. Note: Negative numbers are not allowed. Note: Setting this
              parameter  implies  to  ignore  any  IDs  retrieved  from  the  input data, i.e. it
              activates the --auto-id flag.
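
              How the four --id-* options combine can be sketched in Python. The function
              names auto_id and auto_ids are hypothetical; the defaults and the allowed ranges
              are the ones listed above, and the layout (prefix, delimiter, zero-padded
              number) follows the option descriptions.

```python
def auto_id(counter, prefix="sequence", delim="_", digits=4):
    """Build an ID from prefix, delimiter, and zero-padded counter."""
    if counter < 0:
        raise ValueError("negative numbers are not allowed")
    if not 1 <= digits <= 18:
        raise ValueError("digits must be in the range 1 to 18")
    return f"{prefix}{delim}{counter:0{digits}d}"


def auto_ids(count, start=1, **kwargs):
    """Generate `count` consecutive IDs, beginning at `start`."""
    return [auto_id(start + i, **kwargs) for i in range(count)]


default_ids = auto_ids(3)
custom_id = auto_id(42, prefix="aln", delim="-", digits=6)
```

              With the defaults this yields sequence_0001, sequence_0002, sequence_0003; the
              custom call corresponds to --id-prefix=aln --id-delim=- --id-digits=6 and gives
              aln-000042.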

   Model Details:
       -T, --temp=DOUBLE
              Rescale energy parameters to a temperature of temp C. Default is 37C.
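
              As an illustration of what rescaling means, the sketch below applies the
              standard thermodynamic relation dG(T) = dH - T * dS, with dS derived from the
              37C free energy and the enthalpy, and both assumed to be independent of
              temperature. The function rescale and the numeric values are hypothetical;
              this page does not state that every parameter is treated exactly this way.

```python
K0 = 273.15          # 0 degrees Celsius expressed in Kelvin
T_REF = 37.0 + K0    # reference temperature of the parameter set


def rescale(dg37, dh, temp_c):
    """Rescale a free energy (kcal/mol) measured at 37 C to temp_c.

    Uses dS = (dH - dG37) / T_REF, so dG(T) = dH - T * dS.
    """
    t = temp_c + K0
    return dh - (dh - dg37) * t / T_REF


# Hypothetical stacking parameter: dG37 = -3.3, dH = -13.3 kcal/mol.
at_37 = rescale(-3.3, -13.3, 37.0)
at_25 = rescale(-3.3, -13.3, 25.0)
```

              At 37C the value is unchanged; at 25C the same contribution becomes more
              stabilizing because the entropic penalty shrinks.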

       -4, --noTetra
              Do not include special tabulated stabilizing energies for tri-, tetra- and hexaloop
              hairpins. Mostly for testing.

              (default=off)

       -d, --dangles=INT
              How to treat "dangling end" energies for bases adjacent to helices in free ends and
              multi-loops

              (default=`2')

              With -d1, an unpaired base can participate in at most one dangling end. With -d2
              this check is ignored and dangling energies will be added for the bases adjacent to
              a helix on both sides in any case; this is the default for mfe and partition function
              folding.   The  option -d0 ignores dangling ends altogether (mostly for debugging).
              With  -d3  mfe  folding  will  allow  coaxial  stacking  of  adjacent  helices   in
              multi-loops.  At  the  moment the implementation will not allow coaxial stacking of
              the two interior pairs in a loop of degree 3.

       -e, --energyModel=INT
              Rarely used option to fold sequences from the artificial ABCD... alphabet, where  A
              pairs B, C-D etc.  Use the energy parameters for GC (-e 1) or AU (-e 2) pairs.

       -P, --paramFile=paramfile
              Read energy parameters from paramfile, instead of using the default parameter set.

              Different  sets  of  energy  parameters  for  RNA  and  DNA  should  accompany your
              distribution.  See the RNAlib documentation for details on the  file  format.  When
              passing the placeholder file name "DNA", DNA parameters are loaded without the need
              to actually specify any input file.

       --nsp=STRING
              Allow other pairs in addition to the usual AU, GC, and GU pairs.

              Its argument is a comma separated list of additionally allowed pairs. If the  first
              character  is  a  "-"  then  AB  will imply that AB and BA are allowed pairs.  e.g.
              RNAfold -nsp -GA  will allow GA  and  AG  pairs.  Nonstandard  pairs  are  given  0
              stacking energy.
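
              The argument syntax can be illustrated with a short Python sketch. The function
              parse_nsp is hypothetical, and it reads the description above literally: a
              leading "-" on the whole argument makes every listed pair symmetric.

```python
def parse_nsp(arg):
    """Return the extra base pairs named by an --nsp style argument."""
    symmetric = arg.startswith("-")
    if symmetric:
        arg = arg[1:]  # drop the leading "-" marker
    pairs = []
    for item in arg.split(","):
        item = item.strip()
        if len(item) != 2:
            raise ValueError(f"expected a two-letter pair, got {item!r}")
        pairs.append(item)
        # With "-", AB also allows BA (nothing to add for e.g. UU).
        if symmetric and item[0] != item[1]:
            pairs.append(item[::-1])
    return pairs


symmetric_pairs = parse_nsp("-GA")
plain_pairs = parse_nsp("GA,UU")
```

              Here "-GA" expands to the pairs GA and AG, while "GA,UU" allows only GA and UU.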

       -c, --circ
              Assume a circular (instead of linear) RNA molecule.

              (default=off)

       -g, --gquad
              Incorporate G-Quadruplex formation into the structure prediction algorithm

              (default=off)

       --logML
              Recalculate  energies  of  structures  using  a  logarithmic  energy  function  for
              multi-loops before output.

              (default=off)

              This option does not affect structure generation, only the energies that are
              printed out. Since logML lowers energies somewhat, some structures may be missing.

       --shape=SHAPE file
              Use SHAPE reactivity data in the folding recursions (does not work for PF yet)

       --shapeMethod=[D/Z/W] + [optional parameters]
              Specify the method used to convert SHAPE reactivity data to pseudo energy
              contributions

              (default=`D')

              The  following methods can be used to convert SHAPE reactivities into pseudo energy
              contributions.

              'D': Convert by using a linear  equation  according  to  Deigan  et  al  2009.  The
              calculated  pseudo  energies  will  be  applied  for every nucleotide involved in a
              stacked pair. This method is recognized by a capital 'D' in the provided parameter,
              i.e.: --shapeMethod="D" is the default setting. The slope 'm' and the intercept 'b'
              can be set to a non-default value if necessary,  otherwise  m=1.8  and  b=-0.6.  To
              alter  these  parameters,  e.g. m=1.9 and b=-0.7, use a parameter string like this:
              --shapeMethod="Dm1.9b-0.7". You may also provide only one  of  the  two  parameters
              like: --shapeMethod="Dm1.9" or --shapeMethod="Db-0.7".

              'Z':  Convert SHAPE reactivities to pseudo energies according to Zarringhalam et al
              2012. SHAPE reactivities will be converted to pairing probabilities by using linear
              mapping.  Aberration  from  the  observed  pairing  probabilities will be penalized
              during the folding recursion. The magnitude of the penalties can be affected by
              adjusting the factor beta (e.g. --shapeMethod="Zb0.8").

              'W':  Apply  a  given  vector  of  perturbation  energies  to  unpaired nucleotides
              according to Washietl et al 2012. Perturbation vectors can be calculated  by  using
              RNApvmin.
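
              For the 'D' method, the linear equation of Deigan et al. is commonly written as
              dG = m * ln(reactivity + 1) + b. The Python sketch below evaluates that equation
              and parses parameter strings of the form shown above. Both function names are
              hypothetical, the parser is not RNAeval's own, and the sketch assumes
              non-negative reactivities.

```python
import math
import re

# "D", optionally followed by m<number> and/or b<number>.
_DEIGAN = re.compile(r"D(?:m(-?\d+(?:\.\d+)?))?(?:b(-?\d+(?:\.\d+)?))?")


def parse_deigan(method="D"):
    """Parse a string such as "Dm1.9b-0.7" into (slope m, intercept b)."""
    match = _DEIGAN.fullmatch(method)
    if match is None:
        raise ValueError(f"not a Deigan method string: {method!r}")
    m = float(match.group(1)) if match.group(1) is not None else 1.8
    b = float(match.group(2)) if match.group(2) is not None else -0.6
    return m, b


def deigan_pseudo_energy(reactivity, m=1.8, b=-0.6):
    """Pseudo energy in kcal/mol: m * ln(reactivity + 1) + b."""
    return m * math.log(reactivity + 1.0) + b


m, b = parse_deigan("Dm1.9b-0.7")
unreactive = deigan_pseudo_energy(0.0)   # equals the intercept b
reactive = deigan_pseudo_energy(1.0)     # positive, i.e. a penalty
```

              With the default parameters, a reactivity of 0 yields a bonus equal to the
              intercept (-0.6 kcal/mol), while higher reactivities increasingly penalize
              stacked pairs involving that nucleotide.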

       --shapeConversion=M/C/S/L/O + [optional parameters]
              Specify the method used to convert SHAPE reactivities to pairing probabilities when
              using the SHAPE approach of Zarringhalam et al.

              (default=`O')

              The  following  methods  can  be  used  to  convert  SHAPE  reactivities  into  the
              probability for a certain nucleotide to be unpaired.

              'M': Use linear mapping according to Zarringhalam et al.

              'C': Use a cutoff-approach to divide into paired and unpaired nucleotides (e.g.
              "C0.25")

              'S': Skip the normalizing step since the input data already represents probabilities
              for being unpaired rather than raw reactivity values

              'L': Use a linear model to convert the reactivity into a probability for being
              unpaired (e.g. "Ls0.68i0.2" to use a slope of 0.68 and an intercept of 0.2)

              'O': Use a linear model to convert the log of the reactivity into a probability for
              being unpaired (e.g. "Os1.6i-2.29" to use a slope of 1.6 and an intercept of -2.29)

       --mis  Output  "most informative sequence" instead of simple consensus: For each column of
              the alignment output the set of nucleotides with frequency greater than average  in
              IUPAC notation.

              (default=off)

       --cfactor=DOUBLE
              Set the weight of the covariance term in the energy function

              (default=`1.0')

       --nfactor=DOUBLE
              Set  the  penalty for non-compatible sequences in the covariance term of the energy
              function

              (default=`1.0')

       -R, --ribosum_file=ribosumfile
              Use the specified Ribosum matrix instead of the normal energy model. Matrices to use
              should be 6x6 matrices; the order of the terms is AU, CG, GC, GU, UA, UG.

       -r, --ribosum_scoring
              use ribosum scoring matrix. The matrix is  chosen  according  to  the  minimal  and
              maximal pairwise identities of the sequences in the file.

              (default=off)

       --old  use old energy evaluation, treating gaps as characters.

              (default=off)

REFERENCES

       If you use this program in your work you might want to cite:

       R.  Lorenz, S.H. Bernhart, C. Hoener zu Siederdissen, H. Tafer, C. Flamm, P.F. Stadler and
       I.L. Hofacker (2011), "ViennaRNA Package 2.0", Algorithms for Molecular Biology: 6:26

       I.L. Hofacker, W. Fontana, P.F. Stadler, S. Bonhoeffer, M.  Tacker,  P.  Schuster  (1994),
       "Fast  Folding and Comparison of RNA Secondary Structures", Monatshefte f. Chemie: 125, pp
       167-188

       R.  Lorenz,  I.L.  Hofacker,  P.F.  Stadler  (2016),  "RNA  folding  with  hard  and  soft
       constraints", Algorithms for Molecular Biology 11:1 pp 1-13

       The energy parameters are taken from:

       D.H. Mathews, M.D. Disney, J.L. Childs, S.J. Schroeder, M. Zuker, D.H. Turner (2004),
       "Incorporating chemical modification constraints into a dynamic programming algorithm
       for prediction of RNA secondary structure", Proc. Natl. Acad. Sci. USA: 101, pp
       7287-7292

       D.H. Turner, D.H. Mathews (2009), "NNDB: The nearest neighbor parameter database for
       predicting  stability of nucleic acid secondary structure", Nucleic Acids Research: 38, pp
       280-282

AUTHOR

       Ivo L Hofacker, Peter F Stadler, Ronny Lorenz

REPORTING BUGS

       If in doubt our program is right,  nature  is  at  fault.   Comments  should  be  sent  to
       rna@tbi.univie.ac.at.