Provided by: libchemistry-opensmiles-perl_0.8.5-1_all

NAME

       Chemistry::OpenSMILES - OpenSMILES format reader and writer

SYNOPSIS

           use Chemistry::OpenSMILES::Parser;

           my $parser = Chemistry::OpenSMILES::Parser->new;
           my @moieties = $parser->parse( 'C#C.c1ccccc1' );

           $\ = "\n";
           for my $moiety (@moieties) {
               #  $moiety is a Graph::Undirected object
               print scalar $moiety->vertices;
               print scalar $moiety->edges;
           }

           use Chemistry::OpenSMILES::Writer qw(write_SMILES);

           print write_SMILES( \@moieties );

DESCRIPTION

       Chemistry::OpenSMILES provides support for SMILES chemical identifiers conforming to the
       OpenSMILES v1.0 specification (<http://opensmiles.org/opensmiles.html>).

       Chemistry::OpenSMILES::Parser parses SMILES strings and returns arrays of
       Graph::Undirected objects. Each atom is represented by a hash.

       Chemistry::OpenSMILES::Writer performs the inverse operation. Generated SMILES strings are
       by no means optimal.

   Molecular graph
       Disconnected parts of a compound are represented as separate Graph::Undirected objects.
       Atoms are represented as vertices, and bonds are represented as edges.

       Atoms

       Atoms, or vertices of a molecular graph, are represented as hash references:

           {
               "symbol"    => "C",
               "isotope"   => 13,
               "chirality" => "@@",
               "hcount"    => 3,
               "charge"    => 1,
               "class"     => 0,
               "number"    => 0,
           }

       Except for "symbol", "class" and "number", all keys of the hash are optional. Per the
       OpenSMILES specification, the default values of "hcount" and "class" are 0.

       For a chiral atom, the order of its neighbours in the input is preserved in an array stored
       under the "chirality_neighbours" key of the atom hash.
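
       Because most keys are optional, code that consumes atom hashes has to supply defaults
       itself. The sketch below uses a hypothetical helper, describe_atom, which is not part of
       Chemistry::OpenSMILES. It assumes that a missing "charge" means a neutral atom, which the
       text above does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Chemistry::OpenSMILES): summarise an
# atom hash, filling in defaults for optional keys that are absent.
sub describe_atom {
    my ($atom) = @_;
    my $hcount = $atom->{hcount} // 0;    # default stated in the text above
    my $charge = $atom->{charge} // 0;    # assumption: absent means neutral
    my $label  = ( $atom->{isotope} // '' ) . $atom->{symbol};
    return sprintf '%s (atom %d): H count %d, charge %+d, class %d',
        $label, $atom->{number}, $hcount, $charge, $atom->{class};
}

# The atom hash shown above, followed by one with only the mandatory keys.
print describe_atom( {
    symbol => 'C', isotope => 13, chirality => '@@',
    hcount => 3,   charge  => 1,  class     => 0, number => 0,
} ), "\n";
print describe_atom( { symbol => 'O', class => 0, number => 1 } ), "\n";
```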

       Bonds

       Bonds, or edges of a molecular graph, rely completely on the internal representation of
       Graph::Undirected. Bond orders other than single ("-", which is also the default) are
       stored as values of the edge attribute "bond". These values correspond to the symbols used
       in the OpenSMILES specification.
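
       As an illustration, the bond symbol of an edge can be read with the generic edge
       attribute methods of Graph. This is a minimal sketch: bond_symbol is a hypothetical
       helper, and falling back to "-" when the attribute is absent is an assumption based on
       the description above.

```perl
use strict;
use warnings;
use Chemistry::OpenSMILES::Parser;

# Hypothetical helper: return the bond symbol of the edge between two atoms,
# assuming that an edge without a "bond" attribute is a single bond.
sub bond_symbol {
    my ( $graph, $atom1, $atom2 ) = @_;
    return $graph->has_edge_attribute( $atom1, $atom2, 'bond' )
         ? $graph->get_edge_attribute( $atom1, $atom2, 'bond' )
         : '-';
}

my ($acetylene) = Chemistry::OpenSMILES::Parser->new->parse( 'C#C' );

# In list context, edges() returns one [ $atom1, $atom2 ] pair per bond.
for my $edge ( $acetylene->edges ) {
    my ( $atom1, $atom2 ) = @$edge;
    print join( ' ', $atom1->{symbol},
                     bond_symbol( $acetylene, $atom1, $atom2 ),
                     $atom2->{symbol} ), "\n";
}
```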

   Options
       "parse" accepts the following options as key-value pairs in an anonymous hash passed as
       its second parameter:

       "max_hydrogen_count_digits"
           The OpenSMILES specification limits the number of attached hydrogen atoms for atoms in
           square brackets to 9. IUPAC SMILES+ has increased this limit to 99. Setting
           "max_hydrogen_count_digits" instructs the parser to accept attached hydrogen counts
           with a number of digits other than 1.

       "raw"
           With "raw" set to anything evaluating to true, the parser converts neither implicit
           nor explicit hydrogen atoms in square brackets to atom hashes of their own. Moreover,
           it does not attempt to unify the representations of chirality. Note, though, that many
           subroutines of Chemistry::OpenSMILES expect non-raw data structures, so processing
           raw output may produce distorted results.
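
       The options above could be passed as follows. This is a sketch rather than a reference:
       the input strings are arbitrary illustrations ("[CH12]" is not a real species), and a
       fresh parser is created for each call only as a precaution.

```perl
use strict;
use warnings;
use Chemistry::OpenSMILES::Parser;

# Accept two-digit hydrogen counts in bracket atoms.
my @wide = Chemistry::OpenSMILES::Parser->new->parse(
    '[CH12]', { max_hydrogen_count_digits => 2, raw => 1 } );

# Default mode: hydrogens in brackets become atom hashes of their own.
my @cooked = Chemistry::OpenSMILES::Parser->new->parse( '[NH4+]' );

# Raw mode: hydrogens stay recorded on the heavy atom.
my @raw = Chemistry::OpenSMILES::Parser->new->parse( '[NH4+]', { raw => 1 } );

print 'default: ', scalar( $cooked[0]->vertices ), " vertices\n";
print 'raw:     ', scalar( $raw[0]->vertices ),    " vertices\n";
```

       Comparing the two vertex counts is a quick way to check which representation a given
       graph uses before handing it to other subroutines.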

CAVEATS

       Element symbols in square brackets are not limited to the ones known to chemistry.
       Currently any one- or two-letter symbol is allowed.

       Deprecated charge notations ("--" and "++") are supported.

       The OpenSMILES specification mandates a strict order of ring bonds and branches:

           branched_atom ::= atom ringbond* branch*

       Chemistry::OpenSMILES::Parser supports both the mandated structure and the inverted one,
       in which ring bonds follow branch descriptions.
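
       For illustration, methylcyclohexane can be written in both orders. The sketch below
       parses each form and prints the size of the resulting graph. graph_size is a
       hypothetical helper, and equal sizes are what the paragraph above leads one to expect,
       not a documented guarantee.

```perl
use strict;
use warnings;
use Chemistry::OpenSMILES::Parser;

# Hypothetical helper: numbers of vertices and edges of the first moiety.
sub graph_size {
    my ($smiles) = @_;
    my ($moiety) = Chemistry::OpenSMILES::Parser->new->parse( $smiles );
    return ( scalar( $moiety->vertices ), scalar( $moiety->edges ) );
}

# Ring bond before branch (mandated), then branch before ring bond (inverted).
for my $smiles ( 'C1(C)CCCCC1', 'C(C)1CCCCC1' ) {
    printf "%-12s %d atoms, %d bonds\n", $smiles, graph_size( $smiles );
}
```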

       Whitespace is not supported yet. SMILES descriptors must be stripped of whitespace before
       they are passed to Chemistry::OpenSMILES::Parser.
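
       A minimal way to do this in plain Perl is sketched below. strip_whitespace is a
       hypothetical helper, and it assumes that the string holds nothing but the SMILES
       descriptor: a trailing title or comment separated by whitespace would be merged into
       the descriptor and must be cut off beforehand.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every whitespace character from a descriptor.
sub strip_whitespace {
    my ($smiles) = @_;
    $smiles =~ s/\s+//g;
    return $smiles;
}

my $clean = strip_whitespace( " C#C . c1ccccc1\n" );
print "$clean\n";    # prints C#C.c1ccccc1
```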

       The derivation of implicit hydrogen counts for aromatic atoms is not unambiguously defined
       in the OpenSMILES specification. Thus implicit hydrogen atoms are derived only for
       aromatic carbon, which is treated as having a valence of 3.

       Chiral atoms with three neighbours are interpreted as having a lone pair of electrons as
       the fourth chiral neighbour. The lone pair is always taken to be the second in the order
       of neighbour enumeration, except when the atom with the lone pair starts a chain. In that
       case the lone pair is the first.

SEE ALSO

       perl(1)

AUTHORS

       Andrius Merkys, <merkys@cpan.org>