Provided by: libchemistry-mol-perl_0.39-1_all

NAME

       Chemistry::File::Formula - Molecular formula reader/formatter

SYNOPSIS

           use Chemistry::File::Formula;

           my $mol = Chemistry::Mol->parse("H2O", format => 'formula');
           print $mol->print(format => 'formula');
           print $mol->formula;    # this is a shorthand for the above
           print $mol->print(format => 'formula',
               formula_format => "%s%d{<sub>%d</sub>}");

DESCRIPTION

       This module converts a molecule object to a string with the formula and back.  It
       registers the 'formula' format with Chemistry::Mol.  Besides its obvious use, it is
       included in the Chemistry::Mol distribution because it is a very simple example of a
       Chemistry::File derived I/O module.

   Writing formulas
       The format can be specified as a printf-like string with the following control sequences,
       which are specified with the formula_format parameter to $mol->print or $mol->write.

       %s          symbol
       %D          number of atoms
       %d          number of atoms, included only when it is greater than one
       %d{substr}  substr is included only when the number of atoms is greater than one
       %j{substr}  substr is inserted between the formatted strings for each element (the
                   'j' stands for 'joiner'); the format should have only one joiner, but
                   its location in the format string doesn't matter
       %%          a percent sign

       If no format is specified, the default is "%s%d". Some examples follow. Let's assume that
       the formula is C2H6O, as it would be formatted by default.

       "%s%D"
           Like the default, but include explicit indices for all atoms.  The formula would be
           formatted as "C2H6O1"

       "%s%d{<sub>%d</sub>}"
           HTML format. The output would be "C<sub>2</sub>H<sub>6</sub>O".

       "%D %s%j{, }"
           Use a comma followed by a space as a joiner. The output would be "2 C, 6 H, 1 O".
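
       The three formats above can be tried directly. The following is a minimal sketch that
       assumes Chemistry::Mol and this module are installed; the variable names are
       illustrative, and the outputs in the comments are the ones stated in the descriptions
       above.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::Formula;

# Build a molecule object for ethanol from its formula.
my $mol = Chemistry::Mol->parse("C2H6O", format => 'formula');

# Explicit counts for every element.
my $explicit = $mol->print(format => 'formula', formula_format => '%s%D');

# HTML subscripts, emitted only when the count is greater than one.
my $html = $mol->print(format => 'formula',
    formula_format => '%s%d{<sub>%d</sub>}');

# Count first, then symbol, joined with a comma and a space.
my $listed = $mol->print(format => 'formula', formula_format => '%D %s%j{, }');

print "$explicit\n";   # C2H6O1
print "$html\n";       # C<sub>2</sub>H<sub>6</sub>O
print "$listed\n";     # 2 C, 6 H, 1 O
```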

       Symbol Sort Order

       The elements in the formula are sorted by default in the "Hill order", which means that:

       1) if the formula contains carbon, C goes first, followed by H, and the rest of the
       symbols in alphabetical order. For example, "CH2BrF".

       2) if there is no carbon, all the symbols (including H) are listed alphabetically.  For
       example, "BrH".
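
       The two rules can be written as a short stand-alone subroutine. This is only a sketch
       of the ordering itself, not the module's own code; the name hill_order is hypothetical,
       and the input is assumed to be a hash reference mapping element symbols to atom counts.

```perl
use strict;
use warnings;

# Return the element symbols of a formula in Hill order.
# $counts is a hash reference mapping element symbol => number of atoms.
sub hill_order {
    my ($counts) = @_;
    my @symbols = sort keys %$counts;    # plain alphabetical order

    # Rule 2: without carbon, everything (including H) stays alphabetical.
    return @symbols unless exists $counts->{C};

    # Rule 1: C first, then H if present, then the rest alphabetically.
    my @rest = grep { $_ ne 'C' && $_ ne 'H' } @symbols;
    return ('C', (exists $counts->{H} ? 'H' : ()), @rest);
}

# Symbols only, without the atom counts.
print join('', hill_order({ C => 1, H => 2, Br => 1, F => 1 })), "\n";   # CHBrF
print join('', hill_order({ H => 1, Br => 1 })), "\n";                    # BrH
```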

       It is possible to supply a custom sorting subroutine with the 'formula_sort' option. It
       expects a subroutine reference that takes a hash reference describing the formula (similar
       to what is returned by parse_formula, discussed below), and that returns a list of symbols
       in the desired order.

       For example, this will sort the symbols in reverse asciibetical order:

           my $formula = $mol->print(
               format          => 'formula',
               formula_sort    => sub {
                   my $formula_hash = shift;
                   return reverse sort keys %$formula_hash;
               }
           );
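
       Because the sorting subroutine receives the atom counts as well as the symbols, it can
       order elements by abundance. The sketch below is a hypothetical sorter, shown on its own
       with a hand-written hash so that it runs without the module; it assumes the hash maps
       each symbol to its number of atoms, as described above.

```perl
use strict;
use warnings;

# Hypothetical sorter: most abundant element first, ties broken alphabetically.
my $by_count = sub {
    my $formula_hash = shift;    # symbol => number of atoms
    return sort {
        $formula_hash->{$b} <=> $formula_hash->{$a} or $a cmp $b
    } keys %$formula_hash;
};

my @order = $by_count->({ C => 2, H => 6, O => 1 });
print "@order\n";    # H C O
```

       A subroutine reference like $by_count could then be passed as the value of the
       'formula_sort' option in place of the anonymous subroutine in the previous example.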

   Parsing Formulas
       Formulas can also be parsed back into Chemistry::Mol objects.  The formula may have
       parentheses, square brackets, or angle brackets (< >), and it may have the following
       abbreviations:

           Me => '(CH3)',
           Et => '(CH3CH2)',
           Bu => '(C4H9)',
           Bn => '(C6H5CH2)',
           Cp => '(C5H5)',
           Ph => '(C6H5)',
           Bz => '(C6H5CO)',

       The formula may also be preceded by a number, which multiplies the whole formula. Some
       examples of valid formulas:

           Formula              Equivalent to
           --------------------------------------------------------------
           CH3(CH2)3CH3         C5H12
           C6H3Me3              C9H12
           2Cu[NH3]4(NO3)2      Cu2H24N12O12
           2C(C[C<C>5]4)3       C152
           2C(C(C(C)5)4)3       C152
           C 1 0 H 2 2          C10H22 (whitespace is completely ignored)

       When a formula is parsed, a molecule object is created that consists of the atoms in
       the formula (no bonds or coordinates, of course).  The atoms are created in
       alphabetical order by symbol, so the molecule object for C2H5Br would have the atoms in
       the following sequence: Br, C, C, H, H, H, H, H.
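
       As a brief illustration, the sketch below parses two formulas from the table and then
       lists the atoms of C2H5Br in creation order. It assumes Chemistry::Mol and this module
       are installed; the variable names are illustrative, and the expected values in the
       comments are taken from the text above.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::Formula;

# A parenthesized group and an abbreviation (Me) are both expanded.
my $pentane    = Chemistry::Mol->parse("CH3(CH2)3CH3", format => 'formula');
my $mesitylene = Chemistry::Mol->parse("C6H3Me3",      format => 'formula');

my $pentane_formula    = $pentane->formula;
my $mesitylene_formula = $mesitylene->formula;
print "$pentane_formula\n";       # C5H12
print "$mesitylene_formula\n";    # C9H12

# Atoms are created in alphabetical order of their symbols.
my $mol     = Chemistry::Mol->parse("C2H5Br", format => 'formula');
my @symbols = map { $_->symbol } $mol->atoms;
print "@symbols\n";               # Br C C H H H H H
```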

       If you don't want to create a molecule object, but would rather have a simple hash with
       the number of atoms for each element, use the "parse_formula" method:

           my %formula = Chemistry::File::Formula->parse_formula("C2H6O");
           use Data::Dumper;
           print Dumper \%formula;

       which prints something like

           $VAR1 = {
                     'H' => 6,
                     'O' => 1,
                     'C' => 2
                   };

       The "parse_formula" method is called internally by the "parse_string" method.

       Non-integer numbers in formulas

       The "parse_formula" method can also accept formulas that contain floating-point numbers,
       such as H1.5N0.5. The numbers must be positive, and numbers smaller than one should
       include a leading zero (e.g., 0.9, not .9).

       When formulas with non-integer numbers of atoms are turned into molecule objects as
       described in the previous section, the number of atoms is always rounded up. For example,
       H1.5N0.5 will produce a molecule object with two hydrogen atoms and one nitrogen atom.
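
       The rounding rule can be reproduced on a plain hash. This sketch does not call the
       module; the %fractional hash is written by hand to stand in for the counts that
       "parse_formula" is described as accepting for H1.5N0.5, and POSIX::ceil is used here
       only to illustrate rounding up.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hand-written stand-in for the fractional counts of "H1.5N0.5".
my %fractional = (H => 1.5, N => 0.5);

# Round every count up to the next whole atom.
my %whole = map { $_ => ceil($fractional{$_}) } keys %fractional;

print "$_ => $whole{$_}\n" for sort keys %whole;
# H => 2
# N => 1
```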

       There is currently no way of producing formulas with non-integer numbers; perhaps a future
       version will include an "occupancy" property for atoms that will result in non-integer
       formulas.

SOURCE CODE REPOSITORY

       <https://github.com/perlmol/Chemistry-Mol>

SEE ALSO

       Chemistry::Mol, Chemistry::File

       For discussion of Hill order, search the web for the phrase "Hill order" together with
       the word "formula". The original reference is J. Am. Chem. Soc. 1900, 22, 478-494,
       <http://dx.doi.org/10.1021/ja02046a005>.

AUTHOR

       Ivan Tubert-Brohman <itub@cpan.org>.

       Formula parsing code contributed by Brent Gregersen.

       Patch for non-integer formulas by Daniel Scott.

COPYRIGHT

       Copyright (c) 2005 Ivan Tubert-Brohman. All rights reserved. This program is free
       software; you can redistribute it and/or modify it under the same terms as Perl itself.