Provided by: libchemistry-formula-perl_3.0.1-1.2_all

NAME

       Chemistry::Formula - Enumerate elements in a chemical formula

SYNOPSIS

          use Chemistry::Formula qw(parse_formula);
          parse_formula('Pb (H (TiO3)2 )2 U [(H2O)3]2', \%count);

       That is obviously not a real compound, but it demonstrates the capabilities of the
       routine.  The call fills %count as follows:

         %count = (
                   'O' => 18,
                   'H' => 14,
                   'Ti' => 4,
                   'U' => 1,
                   'Pb' => 1
                  );

DESCRIPTION

       This module provides a function which parses a string containing a chemical formula and
       returns the number of each element in the string.  It can handle nested parentheses and
       square brackets and correctly computes stoichiometry given numbers outside the (possibly
       nested) parentheses.
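
       The way multipliers propagate out of nested groups can be sketched in a few lines of
       plain Perl.  This is only an illustration and not the module's own parser:
       count_elements is a hypothetical helper, it assumes whitespace has already been
       removed, it accepts only simple decimal multipliers, and it neither validates element
       symbols nor checks that a '(' is closed by ')' rather than ']'.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Chemistry::Formula.
# Returns a hash reference mapping element symbols to their total counts.
sub count_elements {
    my ($formula) = @_;
    my @stack = ({});                         # one hash of counts per open group
    my $num   = qr/(\d+(?:\.\d*)?|\.\d+)?/;   # optional multiplier (captured)
    pos($formula) = 0;
    while (pos($formula) < length $formula) {
        if ($formula =~ /\G[(\[]/gc) {
            push @stack, {};                  # a new group starts
        }
        elsif ($formula =~ /\G[)\]]$num/gc) {
            my $mult = defined $1 ? $1 : 1;
            die "unbalanced group\n" if @stack < 2;
            my $group = pop @stack;
            # Fold the finished group into its parent, scaled by the multiplier.
            $stack[-1]{$_} += $group->{$_} * $mult for keys %$group;
        }
        elsif ($formula =~ /\G([A-Z][a-z]?)$num/gc) {
            $stack[-1]{$1} += defined $2 ? $2 : 1;
        }
        else {
            die "unexpected character at position " . pos($formula) . "\n";
        }
    }
    die "unbalanced group\n" if @stack != 1;
    return $stack[0];
}

my $c = count_elements('Pb(H(TiO3)2)2U[(H2O)3]2');
print join(' ', map { "$_=$c->{$_}" } sort keys %$c), "\n";
# prints "H=14 O=18 Pb=1 Ti=4 U=1"
```

       Each closing bracket multiplies the counts of the group it closes and adds them to the
       enclosing group, so a number outside two levels of nesting is applied twice over.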

       No effort is made to evaluate the chemical plausibility of the formula.  The example above
       parses just fine using this module, even though it is clearly not a viable compound.
       Charge balancing, bond valence, and so on are beyond the scope of this module.

       Only one function is exported, "parse_formula".  This takes a string and a hash reference
       as its arguments and returns 0 or 1.

           $ok = parse_formula('PbTiO3', \%count);

       If the formula was parsed without trouble, "parse_formula" returns 1. If there was any
       problem, it returns 0 and $count{error} is filled with a string describing the problem.
       Parsing stops at the first error encountered; the rest of the string is not tested.

       If the formula was parsed correctly, the %count hash contains element symbols as its keys
       and the number of each element as its values.
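
       A minimal sketch of checking the return value follows.  It relies only on the behavior
       described above (the 0 or 1 return value and $count{error}); describe_formula is a
       hypothetical wrapper introduced for illustration, and the exact wording of the error
       string is whatever the module supplies.

```perl
use strict;
use warnings;
use Chemistry::Formula qw(parse_formula);

# Hypothetical wrapper: return the element counts as text, or the
# parser's error message if the formula could not be parsed.
sub describe_formula {
    my ($formula) = @_;
    my %count;
    if (parse_formula($formula, \%count)) {
        return join ' ', map { "$_=$count{$_}" } sort keys %count;
    }
    return "error: $count{error}";
}

# The second formula has an unclosed parenthesis and should be rejected.
print describe_formula($_), "\n" for ('PbTiO3', 'Pb(TiO3');
```

       Using a fresh hash for each call keeps the counts from one formula separate from the
       error message of another.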

       Here is an example of a program that reads a string from the command line and, for the
       formula unit described in the string, writes the weight in amu and the absorption
       cross section in barns.

           use strict;
           use warnings;
           use Data::Dumper;
           use Xray::Absorption;
           use Chemistry::Formula qw(parse_formula);

           # Parse the formula given on the command line; stop on failure.
           my %count;
           parse_formula($ARGV[0], \%count)
             or die "Could not parse formula: $count{error}\n";

           print  Data::Dumper->Dump([\%count], [qw(*count)]);
           my ($weight, $barns) = (0,0);
           # Sum the per-element contributions, weighted by the element counts.
           foreach my $k (keys %count) {
             $weight +=
               Xray::Absorption -> get_atomic_weight($k) * $count{$k};
             $barns  +=
               Xray::Absorption -> cross_section($k, 9000) * $count{$k};
           }
           printf "This weighs %.3f amu and absorbs %.3f barns at 9 keV.\n",
             $weight, $barns;

       Pretty simple.

       The parser is not brilliant.  Here are the ground rules:

       1.  The first letter of each element symbol must be capitalized.

       2.  Whitespace is unimportant -- it will be removed from the string.  So will dollar
           signs, underscores, and curly braces (in an attempt to handle TeX).  Also a sequence
           like this: '/sub 3/' will be converted to '3' (in an attempt to handle INSPEC).

       3.  Numbers can be integers or floating point numbers.  Things like 5, 0.5, 12.87, and .5
           are all acceptable, as is exponential notation like 1e-2.  Note that exponential
           notation must use a leading number to avoid confusion with element symbols.  That is,
           1e-2 is ok, but e-2 is not.

       4.  Uncapitalized symbols or unrecognized symbols will flag an error.

       5.  An error will be flagged if the number of open parens is different from the number of
           close parens.

       6.  An error will be flagged if any unusual symbols are found in the string.
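
       Rules 2, 3, 5, and 6 can be approximated in plain Perl as shown below.  This is a
       sketch under stated assumptions and not the module's own code: normalize_formula,
       precheck, and $number_re are hypothetical names, and the set of characters treated as
       "unusual" is a guess for illustration.

```perl
use strict;
use warnings;

# Rule 3 (assumed form): integers, decimals, and exponential notation with
# a leading number, so '1e-2' matches but 'e-2' does not.
my $number_re = qr/\A(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?\z/;

# Rule 2: drop whitespace and TeX markup, and unwrap INSPEC subscripts.
sub normalize_formula {
    my ($s) = @_;
    $s =~ s{/sub\s*([0-9.]+)\s*/}{$1}g;   # '/sub 3/' becomes '3'
    $s =~ s/[\s\$\{\}_]+//g;              # whitespace, '$', '{', '}', '_'
    return $s;
}

# Rules 5 and 6: return a description of the first problem found, or an
# empty string if the formula passes these checks.
sub precheck {
    my ($s) = @_;
    return "unusual character '$1'" if $s =~ /([^A-Za-z0-9.()\[\]+-])/;
    my $open  = () = $s =~ /[(\[]/g;      # count opening brackets
    my $close = () = $s =~ /[)\]]/g;      # count closing brackets
    return "$open opening vs. $close closing brackets" if $open != $close;
    return '';
}

my $clean   = normalize_formula('Pb Ti O$_3$');
my $problem = precheck($clean);
print $problem ? "problem: $problem\n" : "$clean looks parseable\n";
# prints "PbTiO3 looks parseable"
```

       Counting brackets only detects a mismatch in number, as rule 5 describes; it does not
       by itself detect brackets that are closed in the wrong order.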

ACKNOWLEDGMENTS

       This was written at the suggestion of Matt Newville, who tested early versions.

       The routine "matchingbrace" was swiped from the C::Scan module, which can be found on
       CPAN.  C::Scan is maintained by Hugo van der Sanden.

AUTHOR

       Bruce Ravel <bravel AT bnl DOT gov>

       http://cars9.uchicago.edu/~ravel/software/

       SVN repository: http://cars9.uchicago.edu/svn/libperlxray/