Provided by: libset-intspan-perl_1.19-3_all bug

NAME

       Set::IntSpan - Manages sets of integers

SYNOPSIS

         # BEGIN { $Set::IntSpan::integer = 1 }
         use Set::IntSpan qw(grep_set map_set grep_spans map_spans);

         # $Set::IntSpan::Empty_String = '-';   # or '';

         $set    = new   Set::IntSpan $set_spec;
         $set    = new   Set::IntSpan @set_specs;
         $valid  = valid Set::IntSpan $run_list;
         $set    = copy  $set $set_spec;

         $run_list = run_list $set;
         @elements = elements $set;
         @sets     = sets     $set;
         @spans    = spans    $set;

         $u_set = union      $set $set_spec;
         $i_set = intersect  $set $set_spec;
         $x_set = xor        $set $set_spec;
         $d_set = diff       $set $set_spec;
         $c_set = complement $set;

         $set->U($set_spec);   # Union
         $set->I($set_spec);   # Intersect
         $set->X($set_spec);   # Xor
         $set->D($set_spec);   # Diff
         $set->C;              # Complement

         equal      $set $set_spec
         equivalent $set $set_spec
         superset   $set $set_spec
         subset     $set $set_spec

         $n = cardinality $set;
         $n = size        $set;

         empty      $set
         finite     $set
         neg_inf    $set
         pos_inf    $set
         infinite   $set
         universal  $set

         member     $set $n;
         insert     $set $n;
         remove     $set $n;

         $min = min $set;
         $max = max $set;

         $holes   = holes $set;
         $cover   = cover $set;
         $inset   = inset $set $n;
         $smaller = trim  $set $n;
         $bigger  = pad   $set $n;

         $subset  = grep_set   { ... } $set;
         $mapset  = map_set    { ... } $set;

         $subset  = grep_spans { ... } $set;
         $mapset  = map_spans  { ... } $set;

         for ($element=$set->first; defined $element; $element=$set->next) { ... }
         for ($element=$set->last ; defined $element; $element=$set->prev) { ... }

         $element = $set->start($n);
         $element = $set->current;

         $n       = $set->at($i);
         $slice   = $set->slice($from, $to);
         $i       = $set->ord($n);
         $i       = $set->span_ord($n);

   Operator overloads
         $u_set =  $set + $set_spec;   # union
         $i_set =  $set * $set_spec;   # intersect
         $x_set =  $set ^ $set_spec;   # xor
         $d_set =  $set - $set_spec;   # diff
         $c_set = ~$set;               # complement

         $set += $set_spec;            # union
         $set *= $set_spec;            # intersect
         $set ^= $set_spec;            # xor
         $set -= $set_spec;            # diff

         $set eq $set_spec             # equal
         $set ne $set_spec             # not equal
         $set le $set_spec             # subset
         $set lt $set_spec             # proper subset
         $set ge $set_spec             # superset
         $set gt $set_spec             # proper superset

         # compare sets by cardinality
         $set1 ==  $set2
         $set1 !=  $set2
         $set1 <=  $set2
         $set1 <   $set2
         $set1 >=  $set2
         $set1 >   $set2
         $set1 <=> $set2

         # compare cardinality of set to an integer
         $set1 ==  $n
         $set1 !=  $n
         $set1 <=  $n
         $set1 <   $n
         $set1 >=  $n
         $set1 >   $n
         $set1 <=> $n

         @sorted = sort @sets;         # sort sets by cardinality

         if ($set) { ... }             # true if $set is not empty

         print "$set\n";               # stringizes to the run list

EXPORTS

   @EXPORT
       Nothing

   @EXPORT_OK
       "grep_set", "map_set", "grep_spans", "map_spans"

DESCRIPTION

       "Set::IntSpan" manages sets of integers.  It is optimized for sets that have long runs of
       consecutive integers.  These arise, for example, in .newsrc files, which maintain lists of
       articles:

         alt.foo: 1-21,28,31
         alt.bar: 1-14192,14194,14196-14221

       A run of consecutive integers is also called a span.

       Sets are stored internally in a run-length coded form.  This provides for both compact
       storage and efficient computation.  In particular, set operations can be performed
       directly on the encoded representation.

       "Set::IntSpan" is designed to manage finite sets.  However, it can also represent some
       simple infinite sets, such as { x | x>n }.  This allows operations involving complements
       to be carried out consistently, without having to worry about the actual value of INT_MAX
       on your machine.

SPANS

       A span is a run of consecutive integers.  A span may be represented by an array reference,
       in any of 5 forms:

   Finite forms
           Span                Set
         [ $n,    $n    ]      { n }
         [ $a,    $b    ]      { x | a<=x && x<=b}

   Infinite forms
           Span                Set
         [ undef, $b    ]      { x | x<=b }
         [ $a   , undef ]      { x | x>=a }
         [ undef, undef ]      The set of all integers

       Some methods operate directly on spans.

SET SPECIFICATIONS

       Many of the methods take a set specification.  There are four kinds of set specifications.

   Empty
       If a set specification is omitted, then the empty set is assumed.  Thus,

         $set = new Set::IntSpan;

       creates a new, empty set.  Similarly,

         copy $set;

       removes all elements from $set.

   Object reference
       If an object reference is given, it is taken to be a "Set::IntSpan" object.

   Run list
       If a string is given, it is taken to be a run list.  A run list specifies a set using a
       syntax similar to that in newsrc files.

       A run list is a comma-separated list of runs.  Each run specifies a set of consecutive
       integers.  The set is the union of all the runs.

       Runs may be written in any of 5 forms.

       Finite forms

       n       { n }

       a-b     { x | a<=x && x<=b }

       Infinite forms

       (-n     { x | x<=n }

       n-)     { x | x>=n }

       (-)     The set of all integers

       Empty forms

       The empty set is consistently written as '' (the null string).  It is also denoted by the
       special form '-' (a single dash).

       Restrictions

       The runs in a run list must be disjoint, and must be listed in increasing order.

       Valid characters in a run list are 0-9, '(', ')', '-' and ','.  White space and underscore
       (_) are ignored.  Other characters are not allowed.

       Examples

         Run list          Set
         "-"               { }
         "1"               { 1 }
         "1-2"             { 1, 2 }
         "-5--1"           { -5, -4, -3, -2, -1 }
         "(-)"             the integers
         "(--1"            the negative integers
         "1-3, 4, 18-21"   { 1, 2, 3, 4, 18, 19, 20, 21 }

   Array reference
       If an array reference is given, then the elements of the array specify the elements of the
       set.  The array may contain

       •   integers

       •   spans

       The set is the union of all the integers and spans in the array.  The integers and spans
       need not be disjoint.  The integers and spans may be in any order.

       Examples

         Array ref                         Set
         [ ]                               { }
         [ 1, 1 ]                          { 1 }
         [ 1, 3, 2 ]                       { 1, 2, 3 }
         [ 1, [ 5, 8 ], 5, [ 7, 9 ], 2 ]   { 1, 2, 5, 6, 7, 8, 9 }
         [ undef, undef ]                  the integers
         [ undef, -1 ]                     the negative integers

ITERATORS

       Each set has a single iterator, which is shared by all calls to "first", "last", "start",
       "next", "prev", and "current".  At all times, the iterator is either an element of the
       set, or "undef".

       "first", "last", and "start" set the iterator; "next", and "prev" move it; and "current"
       returns it.  Calls to these methods may be freely intermixed.

       Using "next" and "prev", a single loop can move both forwards and backwards through a set.
       Using "start", a loop can iterate over portions of an infinite set.

METHODS

   Creation
       $set = "new" "Set::IntSpan" $set_spec
       $set = "new" "Set::IntSpan" @set_specs
           Creates and returns a "Set::IntSpan" object.

           The initial contents of the set are given by $set_spec, or by the union of all the
           @set_specs.

       $ok = "valid" "Set::IntSpan" $run_list
           Returns true if $run_list is a valid run list.  Otherwise, returns false and leaves an
           error message in $@.

       $set = "copy" $set $set_spec
           Copies $set_spec into $set.  The previous contents of $set are lost.  For convenience,
           "copy" returns $set.

       $run_list = "run_list" $set
           Returns a run list that represents $set.  The run list will not contain white space.
           $set is not affected.

           By default, the empty set is formatted as '-'; a different string may be specified in
           $Set::IntSpan::Empty_String.

       @elements = "elements" $set
           Returns an array containing the elements of $set.  The elements will be sorted in
           numerical order.  In scalar context, returns an array reference.  $set is not
           affected.

       @sets = "sets" $set
           Returns the runs in $set, as a list of "Set::IntSpan" objects.  The sets in the list
           are in order.

       @spans = "spans" $set
           Returns the runs in $set, as a list of the form

             ([$a1, $b1],
              [$a2, $b2],
              ...
              [$aN, $bN])

           If a run contains only a single integer, then the upper and lower bounds of the
           corresponding span will be equal.

           If the set has no lower bound, then $a1 will be "undef".  Similarly, if the set has no
           upper bound, then $bN will be "undef".

           The runs in the list are in order.

   Set operations
       For these operations, a new "Set::IntSpan" object is created and returned.  The operands
       are not affected.

       $u_set = "union" $set $set_spec
           Returns the set of integers in either $set or $set_spec.

       $i_set = "intersect" $set $set_spec
           Returns the set of integers in both $set and $set_spec.

       $x_set = "xor" $set $set_spec
           Returns the set of integers in $set or $set_spec, but not both.

       $d_set = "diff" $set $set_spec
           Returns the set of integers in $set but not in $set_spec.

       $c_set = "complement" $set
           Returns the set of integers that are not in $set.

   Mutators
       By popular demand, "Set::IntSpan" now has mutating forms of the binary set operations.
       These methods alter the object on which they are called.

       $set->"U"($set_spec)
           Makes $set the union of $set and $set_spec.  Returns $set.

       $set->"I"($set_spec)
           Makes $set the intersection of $set and $set_spec.  Returns $set.

       $set->"X"($set_spec)
           Makes $set the symmetric difference of $set and $set_spec.  Returns $set.

       $set->"D"($set_spec)
           Makes $set the difference of $set and $set_spec.  Returns $set.

       $set->"C"
           Converts $set to its own complement.  Returns $set.

   Comparison
       "equal" $set $set_spec
           Returns true iff $set and $set_spec contain the same elements.

       "equivalent" $set $set_spec
           Returns true iff $set and $set_spec contain the same number of elements.  All infinite
           sets are equivalent.

       "superset" $set $set_spec
           Returns true iff $set is a superset of $set_spec.

       "subset" $set $set_spec
           Returns true iff $set is a subset of $set_spec.

   Cardinality
       $n = "cardinality" $set
       $n = "size" $set
           Returns the number of elements in $set.  Returns -1 for infinite sets.  "size" is
           provided as an alias for "cardinality".

       "empty" $set
           Returns true iff $set is empty.

       "finite" $set
           Returns true iff $set is finite.

       "neg_inf" $set
           Returns true iff $set contains {x | x<n} for some n.

       "pos_inf" $set
           Returns true iff $set contains {x | x>n} for some n.

       "infinite" $set
           Returns true iff $set is infinite.

       "universal" $set
           Returns true iff $set contains all integers.

   Membership
       "member" $set $n
           Returns true iff the integer $n is a member of $set.

       "insert" $set $n
           Inserts the integer $n into $set.  Does nothing if $n is already a member of $set.

       "remove" $set $n
           Removes the integer $n from $set.  Does nothing if $n is not a member of $set.

   Extrema
       "min" $set
           Returns the smallest element of $set, or "undef" if there is none.

       "max" $set
           Returns the largest element of $set, or "undef" if there is none.

   Spans
       $holes = "holes" $set
           Returns a set containing all the holes in $set, that is, all the integers that are in-
           between spans of $set.

           "holes" is always a finite set.

       $cover = "cover" $set
           Returns a set consisting of a single span from $set->"min" to $set->"max". This is the
           same as

             union $set $set->holes

       $inset = "inset" $set $n
       $smaller = "trim" $set $n
       $bigger = "pad" $set $n
           "inset" returns a set constructed by removing $n integers from each end of each span
           of $set. If $n is negative, then -$n integers are added to each end of each span.

           In the first case, spans may vanish from the set; in the second case, holes may
           vanish.

           "trim" is provided as a synonym for "inset".

           "pad" $set $n is the same as "inset" $set -$n.

   Iterators
       $set->"first"
           Sets the iterator for $set to the smallest element of $set.  If there is no smallest
           element, sets the iterator to "undef".  Returns the iterator.

       $set->"last"
           Sets the iterator for $set to the largest element of $set.  If there is no largest
           element, sets the iterator to "undef".  Returns the iterator.

       $set->"start"($n)
           Sets the iterator for $set to $n.  If $n is not an element of $set, sets the iterator
           to "undef".  Returns the iterator.

       $set->"next"
           Sets the iterator for $set to the next element of $set.  If there is no next element,
           sets the iterator to "undef".  Returns the iterator.

           "next" will return "undef" only once; the next call to "next" will reset the iterator
           to the smallest element of $set.

       $set->"prev"
           Sets the iterator for $set to the previous element of $set.  If there is no previous
           element, sets the iterator to "undef".  Returns the iterator.

           "prev" will return "undef" only once; the next call to "prev" will reset the iterator
           to the largest element of $set.

       $set->"current"
           Returns the iterator for $set.

   Indexing
       The elements of a set are kept in numerical order.  These methods index into the set based
       on this ordering.

       $n = $set->"at"($i)
           Returns the $ith element of $set, or "undef" if there is no $ith element.  Negative
           indices count backwards from the end of the set.

           Dies if

           •   $i is non-negative and $set is "neg_inf"

           •   $i is negative and $set is "pos_inf"

       $slice = $set->"slice"($from, $to)
           Returns a "Set::IntSpan" object containing the elements of $set at indices $from..$to.
           Negative indices count backwards from the end of the set.

           Dies if

           •   $from is non-negative and $set is "neg_inf"

           •   $from is negative and $set is "pos_inf"

       $i = $set->"ord"($n)
           The inverse of "at".

           Returns the index $i of the integer $n in $set, or "undef" if $n if not an element of
           $set.

           Dies if $set is "neg_inf".

       $i = $set->"span_ord"($n)
           Returns the index $i of the span containing the integer $n, or "undef" if $n if not an
           element of $set.

           To recover the span containing $n, write

             ($set->spans)[$i]

OPERATOR OVERLOADS

       For convenience, some operators are overloaded on "Set::IntSpan" objects.

   set operations
       One operand must be a "Set::IntSpan" object.  The other operand may be a "Set::IntSpan"
       object or a set specification.

         $u_set =  $set + $set_spec;   # union
         $i_set =  $set * $set_spec;   # intersect
         $x_set =  $set ^ $set_spec;   # xor
         $d_set =  $set - $set_spec;   # diff
         $c_set = ~$set;               # complement

         $set += $set_spec;            # union
         $set *= $set_spec;            # intersect
         $set ^= $set_spec;            # xor
         $set -= $set_spec;            # diff

   equality
       The string comparison operations are overloaded to compare sets for equality and
       containment.  One operand must be a "Set::IntSpan" object.  The other operand may be a
       "Set::IntSpan" object or a set specification.

         $set eq $set_spec             # equal
         $set ne $set_spec             # not equal
         $set le $set_spec             # subset
         $set lt $set_spec             # proper subset
         $set ge $set_spec             # superset
         $set gt $set_spec             # proper superset

   equivalence
       The numerical comparison operations are overloaded to compare sets by cardinality.  One
       operand must be a "Set::IntSpan" object.  The other operand may be a "Set::IntSpan" object
       or an integer.

         $set1 ==  $set2
         $set1 !=  $set2
         $set1 <=  $set2
         $set1 <   $set2
         $set1 >=  $set2
         $set1 >   $set2
         $set1 <=> $set2
         $set1 cmp $set2

         $set1 ==  $n
         $set1 !=  $n
         $set1 <=  $n
         $set1 <   $n
         $set1 >=  $n
         $set1 >   $n
         $set1 <=> $n
         $set1 cmp $n

       N.B. The "cmp" operator is overloaded to compare sets by cardinality, not containment.
       This is done so that

         sort @sets

       will sort a list of sets by cardinality.

   conversion
       In boolean context, a "Set::IntSpan" object evaluates to true if it is not empty.

       A "Set::IntSpan" object stringizes to its run list.

FUNCTIONS

       $sub_set = "grep_set" { ... } $set
           Evaluates the BLOCK for each integer in $set (locally setting $_ to each integer) and
           returns a "Set::IntSpan" object containing those integers for which the BLOCK returns
           TRUE.

           Returns "undef" if $set is infinite.

       $map_set = "map_set" { ... } $set
           Evaluates the BLOCK for each integer in $set (locally setting $_ to each integer) and
           returns a "Set::IntSpan" object containing all the integers returned as results of all
           those evaluations.

           Evaluates the BLOCK in list context, so each element of $set may produce zero, one, or
           more elements in the returned set.  The elements may be returned in any order, and
           need not be disjoint.

           Returns "undef" if $set is infinite.

       $sub_set = "grep_spans" { ... } $set
           Evaluates the BLOCK for each span in $set and returns a "Set::IntSpan" object
           containing those spans for which the BLOCK returns TRUE.

           Within BLOCK, $_ is locally set to an array ref of the form

             [ $lower, $upper ]

           where $lower and $upper are the bounds of the span.  If the span contains only one
           integer, then $lower and $upper will be equal.  If the span is unbounded, then the
           corresponding element(s) of the array will be "undef".

       $map_set = "map_spans" { ... } $set
           Evaluates the BLOCK for each span in $set, and returns a "Set::IntSpan" object
           consisting of the union of all the spans returned as results of all those evaluations.

           Within BLOCK, $_ is locally set to an array ref of the form

             [ $lower, $upper ]

           as described above for "grep_spans".  Each evaluation of BLOCK must return a list of
           spans.  Each returned list may contain zero, one, or more spans.  Spans may be
           returned in any order, and need not be disjoint.  However, for each bounded span, the
           constraint

             $lower <= $upper

           must hold.

CLASS VARIABLES

       $Set::IntSpan::Empty_String
           $Set::IntSpan::Empty_String contains the string that is returned when "run_list" is
           called on the empty set.  $Empty_String is initially '-'; alternatively, it may be set
           to ''.  Other values should be avoided, to ensure that "run_list" always returns a
           valid run list.

           "run_list" accesses $Empty_String through a reference stored in
           $set->{"empty_string"}.  Subclasses that wish to override the value of $Empty_String
           can reassign this reference.

       $Set::IntSpan::integer
           Up until version 1.16, "Set::IntSpan" specified "use integer", because they were sets
           of...you know...integers.  As of 2012, users are reporting newsgroups with article
           numbers above 0x7fffffff, which break "Set::IntSpan" on 32-bit processors.

           Version 1.17 removes "use integer" by default.  This extends the usable range of
           "Set::IntSpan" to the number of bits in the mantissa of your floating point
           representation.  For IEEE 754 doubles, this is 53 bits, or around 9e15.

           I benchmarked "Set::IntSpan" on a Pentium 4, and it looks like "use integer" provides
           a 2% to 4% speedup, depending on the application.

           If you want "use integer" back, either for performance, or because you are somehow
           dependent on its semantics, write

             BEGIN { $Set::IntSpan::integer = 1 }
             use Set::IntSpan;

DIAGNOSTICS

       Any method (except "valid") will "die" if it is passed an invalid run list.

       "Set::IntSpan::_copy_run_list: Bad syntax:" $runList
           (F) $run_list has bad syntax

       "Set::IntSpan::_copy_run_list: Bad order:" $runList
           (F) $run_list has overlapping runs or runs that are out of order.

       "Set::IntSpan::elements: infinite set"
           (F) An infinite set was passed to "elements".

       "Set::IntSpan::at: negative infinite set"
           (F) "at" was called with a non-negative index on a negative infinite set.

       "Set::IntSpan::at: positive infinite set"
           (F) "at" was called with a negative index on a positive infinite set.

       "Set::IntSpan::slice: negative infinite set"
           (F) "slice" was called with $from non-negative on a negative infinite set.

       "Set::IntSpan::slice: positive infinite set"
           (F) "slice" was called with $from negative on a positive infinite set.

       "Set::IntSpan::ord: negative infinite set"
           (F) "ord" was called on a negative infinite set.

       Out of memory!
           (X) "elements" $set can generate an "Out of memory!"  message on sufficiently large
           finite sets.

NOTES

   Traps
       Beware of forms like

         union $set [1..5];

       This passes an element of @set to union, which is probably not what you want.  To force
       interpretation of $set and [1..5] as separate arguments, use forms like

           union $set +[1..5];

       or

           $set->union([1..5]);

   grep_set and map_set
       "grep_set" and "map_set" make it easy to construct sets for which the internal
       representation used by "Set::IntSpan" is not small. Consider:

         $billion = new Set::IntSpan '0-1_000_000_000';   # OK
         $odd     = grep_set { $_ & 1 } $billion;         # trouble
         $even    = map_set  { $_ * 2 } $billion;         # double trouble

   Error handling
       There are two common approaches to error handling: exceptions and return codes.  There
       seems to be some religion on the topic, so "Set::IntSpan" provides support for both.

       To catch exceptions, protect method calls with an eval:

           $run_list = <STDIN>;
           eval { $set = new Set::IntSpan $run_list };
           $@ and print "$@: try again\n";

       To check return codes, use an appropriate method call to validate arguments:

           $run_list = <STDIN>;
           if (valid Set::IntSpan $run_list)
              { $set = new Set::IntSpan $run_list }
           else
              { print "$@ try again\n" }

       Similarly, use "finite" to protect calls to "elements":

           finite $set and @elements = elements $set;

       Calling "elements" on a large, finite set can generate an "Out of memory!" message, which
       cannot (easily) be trapped.  Applications that must retain control after an error can use
       "intersect" to protect calls to "elements":

           @elements = elements { intersect $set "-1_000_000 - 1_000_000" };

       or check the size of $set first:

           finite $set and cardinality $set < 2_000_000 and @elements = elements $set;

   Limitations
       Although "Set::IntSpan" can represent some infinite sets, it does not perform infinite-
       precision arithmetic.  Therefore, finite elements are restricted to the range of integers
       on your machine.

   Extensions
       Users report that you can construct Set::IntSpan objects on anything that behaves like an
       integer. For example:

           $x   = new Math::BigInt ...;
           $set = new Set::Intspan [ [ $x, $x+5 ] ];

       I'm not documenting this as supported behavior, because I don't have the resources to test
       it, but I'll try not to break it.  If anyone finds problems with it, let me know.

   Roots
       The sets implemented here are based on a Macintosh data structure called a region. See
       Inside Macintosh for more information.

       "Set::IntSpan" was originally written to manage run lists for the "News::Newsrc" module.

AUTHOR

       Steven McDougall <swmcd@world.std.com>

ACKNOWLEDGMENTS

       •   Malcolm Cook <mec@stowers-institute.org>

       •   David Hawthorne <dsrthorne@hotmail.com>

       •   Martin Krzywinski <martink@bcgsc.ca>

       •   Marc Lehmann <schmorp@schmorp.de>

       •   Andrew Olson <aolson@me.com>

       •   Nicholas Redgrave <baron@bologrew.net>

COPYRIGHT

       Copyright (c) 1996-2013 by Steven McDougall. This module is free software; you can
       redistribute it and/or modify it under the same terms as Perl itself.