lunar (1) RNALfold.1.gz

Provided by: vienna-rna_2.5.1+dfsg-1_amd64 bug

NAME

       RNALfold - manual page for RNALfold 2.5.1

SYNOPSIS

       RNALfold [OPTION]...

DESCRIPTION

       RNALfold 2.5.1

       calculate locally stable secondary structures of RNAs

       Compute  locally  stable  RNA  secondary  structure  with a maximal base pair span.  For a
       sequence of length n and a base pair span of L the algorithm uses only O(n+L*L) memory and
       O(n*L*L)  CPU  time.  Thus  it  is  practical  to  "scan" very large genomes for short RNA
       structures.  Output consists of a list of secondary structure components of size <= L, one
       entry  per  line.  Each  output  line contains the predicted local structure its energy in
       kcal/mol and the starting position of the local structure.

       -h, --help
              Print help and exit

       --detailed-help
              Print help, including all details and hidden options, and exit

       --full-help
              Print help, including hidden options, and exit

       -V, --version
              Print version and exit

   General Options:
              Below are command line options which alter the general behavior of this program

       -v, --verbose
              Be verbose

              (default=off)

       -L, --span=INT
              Set the maximum distance between any two pairing nucleotides.

              (default=`150')

              This option specifies the window length L and therefore the  upper  limit  for  the
              distance between the bases i and j of any pair (i, j), i.e. (j - i + 1) <= L.

       --noconv
              Do not automatically substitude nucleotide "T" with "U"

              (default=off)

       -o, --outfile[=<filename>]
              Print output to file instead of stdout

              This option may be used to write all output to output files rather than printing to
              stdout. The number of output files created for  batch  input  (multiple  sequences)
              depends on three conditions: (i) In case an optional filename is given as parameter
              argument, a single file with the specified filename will be written  into.  If  the
              optional  argument  is omitted, (ii) FASTA input or an active --auto-id switch will
              write to multiple  files  that  follow  the  naming  scheme  "prefix.lfold".  Here,
              "prefix"  is  taken  from the sequence id as specified in the FASTA header. Lastly,
              (iii) single-line sequence input without FASTA header will be written to  a  single
              file  "RNALfold_output.lfold". In case an output file already exists, any output of
              the program will be appended to it.  Since the filename argument  is  optional,  it
              must  immediately  follow the short option flag to not be mistaken as new parameter
              to the program. For instance \'-ornafold.out\' will write to a file  "rnafold.out".
              Note:  Any  special  characters  in  the  filename will be replaced by the filename
              delimiter, hence there is no way to pass an  entire  directory  path  through  this
              option yet. (See also the "--filename-delim" parameter)

       -i, --infile=<filename>
              Read a file instead of reading from stdin

              The  default behavior of RNALfold is to read input from stdin. Using this parameter
              the user can specify an input file name where data is read from.

       --auto-id
              Automatically generate an ID for each sequence.  (default=off)

              The default mode of RNALfold is to automatically determine an  ID  from  the  input
              sequence  data if the input file format allows to do that. Sequence IDs are usually
              given in the FASTA header of input sequences. If  this  flag  is  active,  RNALfold
              ignores any IDs retrieved from the input and automatically generates an ID for each
              sequence. This ID consists of a prefix and an increasing number. This flag can also
              be used to add a FASTA header to the output even if the input has none.

       --id-prefix=prefix
              Set prefix for automatically generated IDs (default=`sequence')

              If  this parameter is set, each sequence will be prefixed with the provided string.
              Hence, the output files will obey the following naming scheme:  "prefix_xxxx.lfold"
              where xxxx is the sequence number. Note: Setting this parameter implies --auto-id.

       --id-delim=delimiter
              Change prefix delimiter for automatically generated ids.

              (default=`_')

              This parameter can be used to change the default delimiter "_" between

              the prefix string and the increasing number for automatically generated IDs

       --id-digits=INT
              Specify  the  number  of digits of the counter in automatically generated alignment
              IDs.

              (default=`4')

              When alignments IDs are automatically generated, they receive an increasing number,
              starting with 1. This number will always be left-padded by leading zeros, such that
              the number takes up a certain  width.  Using  this  parameter,  the  width  can  be
              specified  to  the  users  need.  We allow numbers in the range [1:18]. This option
              implies --auto-id.

       --id-start=LONG
              Specify the first number in automatically generated alignment IDs.

              (default=`1')

              When sequence IDs are automatically generated, they receive an  increasing  number,
              usually starting with 1. Using this parameter, the first number can be specified to
              the users requirements. Note: negative numbers are not allowed.  Note: Setting this
              parameter  implies  to  ignore  any  IDs  retrieved  from  the  input data, i.e. it
              activates the --auto-id flag.

       --filename-delim=delimiter
              Change the delimiting character that is used

              for sanitized filenames

              (default=`ID-delimiter')

              This parameter can be used to change the delimiting character used while sanitizing
              filenames,  i.e.  replacing  invalid  characters.  Note, that the default delimiter
              ALWAYS is the first character  of  the  "ID  delimiter"  as  supplied  through  the
              --id-delim  option.  If  the  delimiter is a whitespace character or empty, invalid
              characters will be simply removed rather than substituted. Currently, we regard the
              following  characters  as  illegal  for use in filenames: backslash '\', slash '/',
              question mark '?', percent sign '%', asterisk '*',  colon  ':',  pipe  symbol  '|',
              double quote '"', triangular brackets '<' and '>'.

       --filename-full
              Use full FASTA header to create filenames

              (default=off)

              This  parameter  can  be used to deactivate the default behavior of limiting output
              filenames to the first word of the sequence ID. Consider the following example:  An
              input  with  FASTA header ">NM_0001 Homo Sapiens some gene" usually produces output
              files with the prefix "NM_0001" without the additional data available in the  FASTA
              header,  e.g.  "NM_0001.lfold".  With  this  flag  set, no truncation of the output
              filenames is performed, i.e.  output filenames receive the full FASTA  header  data
              as  prefixes.  Note,  however, that invalid characters (such as whitespace) will be
              substituted by a delimiting character or simply removed, (see  also  the  parameter
              option --filename-delim).

       --commands=<filename>
              Read additional commands from file

              Commands  include  hard  and soft constraints, but also structure motifs in hairpin
              and interior loops that need to be treeted differently. Furthermore,  commands  can
              be set for unstructured and structured domains.

   Algorithms:
              Select  additional  algorithms  which  should be included in the calculations.  The
              Minimum free energy (MFE) and a structure  representative  are  calculated  in  any
              case.

       -z, --zscore[=DOUBLE]
              Limit the output to predictions with a Z-score below a threshold

              (default=`-2')

              This  option  activates  z-score  regression  using  a  trained  SVM. Any predicted
              structure that exceeds the specified threshold will be  ommited  from  the  output.
              Since  the  Z-score  threshold  is  given as a negative number, it must immediately
              preceed the short option to not be mistaken as a  separate  argument,  e.g.  -z-2.9
              sets the threshold to a value of -2.9

       --zscore-pre-filter
              Apply the z-score filtering in the forward recursions

              (default=off)

              The  default  mode  of  z-score  filtering  considers the entire structure space to
              decide whether or not a locally optimal structure at any position i is reported  or
              not.  When  using  this  post-filtering  step, however, alternative locally optimal
              structures

              starting at i with higher energy but lower z-score can be easily missed. The

              pre-filter

              option restricts the structure space already in the forward recursions, such

              that

              only optimal solution among those candidates that satisfy the z-score

              threshold  are  considered.  Therefore,  good  results  according  to  the  z-score
              threshold  criterion are less likely to be superseded by results with better energy
              but worse z-score. Note, that activating this switch results in higher  computation
              time which scales linear in the window length.

       --zscore-report-subsumed
              Report  subsumed  structures  if  their  z-score is less than that of the enclosing
              structure

              (default=off)

              In default mode, RNALfold only reports locally optimal structures if  they  are  no
              constituents  of  another, larger structure with less free energy. In z-score mode,
              however, such a larger structure may have  a  higher  z-score,  thus  may  be  less
              informative  than  the  smaller substructure. Using this switch activates reporting
              both, the smaller and the larger structure if the z-score of the smaller  is  lower
              than that of the larger.

       -b, --backtrack-global
              Backtrack a global MFE structure.  (default=off)

              Instead  of  just  reporting  the  locally  stable secondary structure a global MFE
              structure can be constructed that only consists of locally  optimal  substructures.
              This  switch  activates  a  post-processing  step  that  takes  the locally optimal
              structures to generate the global MFE structure which  constitutes  the  MFE  value
              reported  in  the  last  line.  The respective global MFE structure is printed just
              after the inut sequence part on the last line,  preceding  the  global  MFE  score.
              Note,  that  this  option implies -o/--outfile since the locally optimal structures
              must be read after the regular prediction step! Also note,  that  using  this  this
              option  in  combination  with  -z/--zscore  implies  --zscore-hard-filter to ensure
              proper construction of the global MFE structure!

       -g, --gquad
              Incoorporate G-Quadruplex formation into the structure prediction algorithm

              (default=off)

       --shape=<filename>
              Use SHAPE reactivity data to guide structure predictions.

       --shapeMethod=D/Z/W
              Include SHAPE reactivity data according to a particular method.

              (default=`D')

              The following methods can be used to convert SHAPE reactivities into pseudo  energy
              contributions.

              'D':  Convert  by  using  a  linear  equation  according  to Deigan et al 2009. The
              calculated pseudo energies will be applied  for  every  nucleotide  involved  in  a
              stacked pair. This method is recognized by a capital 'D' in the provided parameter,
              i.e.: --shapeMethod="D" is the default setting. The slope 'm' and the intercept 'b'
              can  be  set  to  a  non-default value if necessary, otherwise m=1.8 and b=-0.6. To
              alter these parameters, e.g. m=1.9 and b=-0.7, use a parameter  string  like  this:
              --shapeMethod="Dm1.9b-0.7".  You  may  also  provide only one of the two parameters
              like: --shapeMethod="Dm1.9" or --shapeMethod="Db-0.7".

              'Z': Convert SHAPE reactivities to pseudo energies according to Zarringhalam et  al
              2012. SHAPE reactivities will be converted to pairing probabilities by using linear
              mapping. Aberration from the  observed  pairing  probabilities  will  be  penalized
              during  the  folding  recursion.  The  magnitude  of  the penalties can affected by
              adjusting the factor beta (e.g. --shapeMethod="Zb0.8").

              'W': Apply  a  given  vector  of  perturbation  energies  to  unpaired  nucleotides
              according  to  Washietl et al 2012. Perturbation vectors can be calculated by using
              RNApvmin.

       --shapeConversion=type
              Convert SHAPE reactivity according to a particular model.

              (default=`O')

              This method allows one to specify  the  method  or  model  used  to  convert  SHAPE
              reactivities  to  pairing (or unpaired) probabilities when using the SHAPE approach
              of Zarringhalam et al. 2012. The following single letter types are recognized:

              'M': Use linear mapping according to Zarringhalam et al. 2012.

              'C': Use a cutoff-approach to divide into paired  and  unpaired  nucleotides  (e.g.
              "C0.25")

              'S':   Skip   the   normalizing  step  since  the  input  data  already  represents
              probabilities for being unpaired rather than raw reactivity values

              'L': Use a linear model to convert the reactivity  into  a  probability  for  being
              unpaired (e.g. "Ls0.68i0.2" to use a slope of 0.68 and an intercept of 0.2)

              'O': Use a linear model to convert the log of the reactivity into a probability for
              being unpaired (e.g. "Os1.6i-2.29" to use a slope of 1.6 and an intercept of -2.29)

   Model Details:
              You may tweak the energy model and pairing rules additionally using  the  following
              parameters

       -T, --temp=DOUBLE
              Rescale energy parameters to a temperature of temp C. Default is 37C.

       -4, --noTetra
              Do not include special tabulated stabilizing energies for tri-, tetra- and hexaloop
              hairpins.

              (default=off)

       -d, --dangles=INT
              Change the dangling end model (default=`2')

              This option allows one to change the model  "dangling  end"  energy  contributions,
              i.e. those additional contributions from bases adjacent to helices in free ends and
              multi-loops With -d1 only unpaired bases can participate in at  most  one  dangling
              end.  With -d2 this check is ignored, dangling energies will be added for the bases
              adjacent to a helix on both sides in any case; this is  the  default  for  mfe  and
              partition  function  folding (-p).  The option -d0 ignores dangling ends altogether
              (mostly  for debugging).  With -d3 mfe  folding  will  allow  coaxial  stacking  of
              adjacent  helices  in  multi-loops. At the moment the implementation will not allow
              coaxial stacking of the two interior pairs in a loop of degree 3 and works only for
              mfe folding.

              Note  that  with  -d1  and -d3 only the MFE computations will be using this setting
              while partition function uses -d2 setting,  i.e.  dangling  ends  will  be  treated
              differently.

       --noLP Produce structures without lonely pairs (helices of length 1).

              (default=off)

              For  partition  function  folding  this  only  disallows  pairs that can only occur
              isolated. Other pairs may still occasionally occur as helices of length 1.

       --noGU Do not allow GU pairs

              (default=off)

       --noClosingGU
              Do not allow GU pairs at the end of helices

              (default=off)

       -P, --paramFile=paramfile
              Read energy parameters from paramfile, instead of using the default parameter set.

              Different sets  of  energy  parameters  for  RNA  and  DNA  should  accompany  your
              distribution.   See  the  RNAlib documentation for details on the file format. When
              passing the placeholder file name "DNA", DNA parameters are loaded without the need
              to actually specify any input file.

       --nsp=STRING
              Allow other pairs in addition to the usual AU,GC,and GU pairs.

              Its  argument is a comma separated list of additionally allowed pairs. If the first
              character is a "-" then AB will imply that AB  and  BA  are  allowed  pairs.   e.g.
              RNALfold  -nsp  -GA   will  allow  GA  and  AG pairs. Nonstandard pairs are given 0
              stacking energy.

       -e, --energyModel=INT
              Rarely used option to fold sequences from the artificial ABCD... alphabet, where  A
              pairs B, C-D etc.  Use the energy parameters for GC (-e 1) or AU (-e 2) pairs.

REFERENCES

       If you use this program in your work you might want to cite:

       R.  Lorenz, S.H. Bernhart, C. Hoener zu Siederdissen, H. Tafer, C. Flamm, P.F. Stadler and
       I.L. Hofacker (2011), "ViennaRNA Package 2.0", Algorithms for Molecular Biology: 6:26

       I.L. Hofacker, W. Fontana, P.F. Stadler, S. Bonhoeffer, M.  Tacker,  P.  Schuster  (1994),
       "Fast  Folding and Comparison of RNA Secondary Structures", Monatshefte f. Chemie: 125, pp
       167-188

       R.  Lorenz,  I.L.  Hofacker,  P.F.  Stadler  (2016),  "RNA  folding  with  hard  and  soft
       constraints", Algorithms for Molecular Biology 11:1 pp 1-13

       I.L.  Hofacker,  B.  Priwitzer, and P.F. Stadler (2004), "Prediction of Locally Stable RNA
       Secondary Structures for Genome-Wide Surveys", Bioinformatics: 20, pp 186-190

       The energy parameters are taken from:

       D.H. Mathews, M.D. Disney, D. Matthew, J.L. Childs, S.J. Schroeder, J.  Susan,  M.  Zuker,
       D.H.  Turner  (2004),  "Incorporating  chemical  modification  constraints  into a dynamic
       programming algorithm for prediction of RNA secondary structure", Proc. Natl.  Acad.  Sci.
       USA: 101, pp 7287-7292

       D.H  Turner,  D.H.  Mathews  (2009),  "NNDB:  The  nearest neighbor parameter database for
       predicting stability of nucleic acid secondary structure", Nucleic Acids Research: 38,  pp
       280-282

AUTHOR

       Ivo L Hofacker, Peter F Stadler, Ronny Lorenz

REPORTING BUGS

       If  in  doubt  our  program  is  right,  nature  is  at fault.  Comments should be sent to
       rna@tbi.univie.ac.at.

SEE ALSO

       RNAplfold(1) RNALalifold(1)