Provided by: tcllib_1.19-dfsg-2_all

NAME

       struct::list - Procedures for manipulating lists

SYNOPSIS

       package require Tcl  8.4

       package require struct::list  ?1.8.3?

       ::struct::list longestCommonSubsequence sequence1 sequence2 ?maxOccurs?

       ::struct::list longestCommonSubsequence2 sequence1 sequence2 ?maxOccurs?

       ::struct::list lcsInvert lcsData len1 len2

       ::struct::list lcsInvert2 lcs1 lcs2 len1 len2

       ::struct::list lcsInvertMerge lcsData len1 len2

       ::struct::list lcsInvertMerge2 lcs1 lcs2 len1 len2

       ::struct::list reverse sequence

       ::struct::list shuffle list

       ::struct::list assign sequence varname ?varname?...

       ::struct::list flatten ?-full? ?--? sequence

       ::struct::list map sequence cmdprefix

       ::struct::list mapfor var sequence script

       ::struct::list filter sequence cmdprefix

       ::struct::list filterfor var sequence expr

       ::struct::list split sequence cmdprefix ?passVar failVar?

       ::struct::list fold sequence initialvalue cmdprefix

       ::struct::list shift listvar

       ::struct::list iota n

       ::struct::list equal a b

       ::struct::list repeat size element1 ?element2 element3...?

       ::struct::list repeatn value size...

       ::struct::list dbJoin ?-inner|-left|-right|-full? ?-keys varname? {keycol table}...

       ::struct::list dbJoinKeyed ?-inner|-left|-right|-full? ?-keys varname? table...

       ::struct::list swap listvar i j

       ::struct::list firstperm list

       ::struct::list nextperm perm

       ::struct::list permutations list

       ::struct::list foreachperm var list body

________________________________________________________________________________________________________________

DESCRIPTION

       The  ::struct::list  namespace  contains  several  useful  commands  for  processing Tcl lists. Generally
       speaking, they implement algorithms more complex or specialized than the ones provided by Tcl itself.

       It exports only a single command, struct::list. All functionality provided here can be reached through  a
       subcommand of this command.

COMMANDS

       ::struct::list longestCommonSubsequence sequence1 sequence2 ?maxOccurs?
              Returns  the  longest  common subsequence of elements in the two lists sequence1 and sequence2. If
              the maxOccurs parameter is provided, the common subsequence is restricted to elements  that  occur
              no more than maxOccurs times in sequence2.

              The  return  value  is  a  list of two lists of equal length. The first sublist is of indices into
              sequence1, and the second sublist is of  indices  into  sequence2.   Each  corresponding  pair  of
              indices  corresponds  to  equal  elements  in  the sequences; the sequence returned is the longest
              possible.
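
              As an illustration of the result format, the following minimal sketch pairs up the two index
              lists to recover the common elements. The variable names s1, s2, idx1, idx2 and common are
              chosen for this example only.

```tcl
package require struct::list

set s1 {a b c d}
set s2 {b c e}

# The result is a pair of index lists of equal length.
::struct::list assign [::struct::list longestCommonSubsequence $s1 $s2] idx1 idx2

# Walk both index lists in parallel to recover the common elements.
set common {}
foreach i $idx1 j $idx2 {
    lappend common [lindex $s1 $i]
}

puts "indices into s1: $idx1"    ;# 1 2
puts "indices into s2: $idx2"    ;# 0 1
puts "common elements: $common"  ;# b c
```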

       ::struct::list longestCommonSubsequence2 sequence1 sequence2 ?maxOccurs?
              Returns an approximation to the longest common subsequence of elements in the two lists sequence1
              and sequence2.  If the maxOccurs parameter is omitted, the subsequence computed is exactly the
              longest common subsequence; otherwise, the longest common subsequence is approximated by first
              determining the longest common subsequence of only those elements that occur no more than
              maxOccurs times in sequence2, and then using that result to align the two lists, determining the
              longest common subsequences of the sublists between each pair of adjacent aligned elements.

              As  with  longestCommonSubsequence,  the return value is a list of two lists of equal length.  The
              first sublist is of indices into sequence1, and the second sublist is of indices  into  sequence2.
              Each  corresponding  pair of indices corresponds to equal elements in the sequences.  The sequence
              approximates the longest common subsequence.

       ::struct::list lcsInvert lcsData len1 len2
              This command takes a description of a longest common subsequence (lcsData), inverts it, and
              returns the result. Inversion here means that, where the input describes which parts of the two
              sequences are identical, the output describes the differences instead.

              For the inversion to be fully defined the lengths of the two sequences must be known; they are
              specified through len1 and len2.

              The  result  is  a  list where each element describes one chunk of the differences between the two
              sequences. This description is a list containing three elements, a type and two pairs  of  indices
              into sequence1 and sequence2 respectively, in this order.  The type can be one of three values:

              added  Describes an addition. I.e. items which are missing in sequence1 can be found in sequence2.
                     The pair of indices into sequence1 describes where the added range had been expected to  be
                     in  sequence1.  The  first  index  refers  to the item just before the added range, and the
                     second index refers to the item just after the added  range.   The  pair  of  indices  into
                     sequence2  describes  the range of items which has been added to it. The first index refers
                     to the first item in the range, and the second index refers to the last item in the range.

              deleted
                     Describes a deletion. I.e. items which are in sequence1 are missing  from  sequence2.   The
                     pair  of  indices  into  sequence1 describes the range of items which has been deleted. The
                     first index refers to the first item in the range, and the second index refers to the  last
                     item  in  the  range.  The pair of indices into sequence2 describes where the deleted range
                     had been expected to be in sequence2. The first index refers to the item  just  before  the
                     deleted range, and the second index refers to the item just after the deleted range.

              changed
                     Describes  a  general  change.  I.e.  a range of items in sequence1 has been replaced by a
                     different range of items in sequence2.  The pair of indices into  sequence1  describes  the
                     range  of  items  which  has been replaced. The first index refers to the first item in the
                     range, and the second index refers to the last item in the range.  The pair of indices into
                     sequence2  describes the range of items replacing the original range. Again the first index
                     refers to the first item in the range, and the second index refers to the last item in  the
                     range.

                  sequence 1 = {a b r a c a d a b r a}
                  lcs 1      =   {1 2   4 5     8 9 10}
                  lcs 2      =   {0 1   3 4     5 6 7}
                  sequence 2 =   {b r i c a     b r a c}

                  Inversion  = {{deleted  {0  0} {-1 0}}
                                {changed  {3  3}  {2 2}}
                                {deleted  {6  7}  {4 5}}
                                {added   {10 11}  {8 8}}}

              Notes:

              •      An  index  of  -1  in a deleted chunk refers to just before the first element of the second
                     sequence.

              •      Also an index equal to the length of the first sequence in an added chunk  refers  to  just
                     behind the end of the sequence.
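
              The call producing such an inversion can be sketched as follows. The LCS data and the lengths 11
              and 9 are taken from the example above; the variable names are chosen for this sketch only.

```tcl
package require struct::list

# LCS data and sequence lengths taken from the example above.
set lcsData {{1 2 4 5 8 9 10} {0 1 3 4 5 6 7}}

set types {}
foreach chunk [::struct::list lcsInvert $lcsData 11 9] {
    # Each chunk is a three-element list: type, range in sequence1, range in sequence2.
    ::struct::list assign $chunk type range1 range2
    lappend types $type
    puts [format "%-8s seq1 {%s}  seq2 {%s}" $type $range1 $range2]
}
```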

       ::struct::list lcsInvert2 lcs1 lcs2 len1 len2
              Similar  to lcsInvert. Instead of directly taking the result of a call to longestCommonSubsequence
              this subcommand expects the indices for the two sequences in two separate lists.

       ::struct::list lcsInvertMerge lcsData len1 len2
              Similar to lcsInvert. It returns essentially the same structure as that command,  except  that  it
              may contain chunks of type unchanged too.

              These new chunks describe the parts which are unchanged between the two sequences. This means that
              the result of this command describes both the changed and unchanged parts of the two sequences  in
              one structure.

                  sequence 1 = {a b r a c a d a b r a}
                  lcs 1      =   {1 2   4 5     8 9 10}
                  lcs 2      =   {0 1   3 4     5 6 7}
                  sequence 2 =   {b r i c a     b r a c}

                  Inversion/Merge  = {{deleted   {0  0} {-1 0}}
                                      {unchanged {1  2}  {0 1}}
                                      {changed   {3  3}  {2 2}}
                                      {unchanged {4  5}  {3 4}}
                                      {deleted   {6  7}  {4 5}}
                                      {unchanged {8 10}  {5 7}}
                                      {added    {10 11}  {8 8}}}
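
              Because the merged result covers both changed and unchanged parts, it can drive a simple
              diff-like listing. The following is a minimal sketch, not a replacement for diff; the procedure
              name diffListing and its output prefixes are hypothetical and not part of the package.

```tcl
package require struct::list

# Hypothetical helper: render the differences between two lists of lines.
# Unchanged lines get a "  " prefix, removed lines "- ", added lines "+ ".
proc diffListing {lines1 lines2} {
    set lcs    [::struct::list longestCommonSubsequence $lines1 $lines2]
    set chunks [::struct::list lcsInvertMerge $lcs \
                    [llength $lines1] [llength $lines2]]
    set out {}
    foreach chunk $chunks {
        ::struct::list assign $chunk type r1 r2
        ::struct::list assign $r1 from1 to1
        ::struct::list assign $r2 from2 to2
        switch -- $type {
            unchanged {
                foreach line [lrange $lines1 $from1 $to1] {lappend out "  $line"}
            }
            deleted {
                # Only the sequence1 range names real items here.
                foreach line [lrange $lines1 $from1 $to1] {lappend out "- $line"}
            }
            added {
                # Only the sequence2 range names real items here.
                foreach line [lrange $lines2 $from2 $to2] {lappend out "+ $line"}
            }
            changed {
                foreach line [lrange $lines1 $from1 $to1] {lappend out "- $line"}
                foreach line [lrange $lines2 $from2 $to2] {lappend out "+ $line"}
            }
        }
    }
    return $out
}

puts [join [diffListing {a b c d} {b c e}] \n]
```

              For these inputs the listing marks a and d as removed, e as added, and b and c as unchanged.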

       ::struct::list lcsInvertMerge2 lcs1 lcs2 len1 len2
              Similar   to   lcsInvertMerge.   Instead   of   directly   taking   the   result   of  a  call  to
              longestCommonSubsequence this subcommand expects the indices for the two sequences in two separate
              lists.

       ::struct::list reverse sequence
              The  subcommand  takes  a  single  sequence  as argument and returns a new sequence containing the
              elements of the input sequence in reverse order.

       ::struct::list shuffle list
              The subcommand takes a list and returns a copy of that list  with  the  elements  it  contains  in
              random  order.  Every possible ordering of elements is equally likely to be generated. The Fisher-
              Yates shuffling algorithm is used internally.
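
              A short sketch of reverse and shuffle side by side. Since the order produced by shuffle is
              random, the example only checks that the shuffled list holds the same elements as the input.
              The variable names are chosen for this example only.

```tcl
package require struct::list

set deck {ace king queen jack}
puts [::struct::list reverse $deck]   ;# jack queen king ace

# The shuffled order is random; sorting both lists must give the same result.
set shuffled [::struct::list shuffle $deck]
puts [expr {[lsort $shuffled] eq [lsort $deck]}]   ;# 1
```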

       ::struct::list assign sequence varname ?varname?...
              The subcommand assigns the first n elements of the input sequence to the  one  or  more  variables
              whose names were listed after the sequence, where n is the number of specified variables.

              If  there  are  more  variables specified than there are elements in the sequence the empty string
              will be assigned to the superfluous variables.

              If there are more elements in the sequence than variable names specified the subcommand returns  a
              list containing the unassigned elements. Else an empty list is returned.

                  tclsh> ::struct::list assign {a b c d e} foo bar
                  c d e
                  tclsh> set foo
                  a
                  tclsh> set bar
                  b

       ::struct::list flatten ?-full? ?--? sequence
              The  subcommand  takes a single sequence and returns a new sequence where one level of nesting was
              removed from the input sequence. In other words, the sublists in the input sequence  are  replaced
              by their elements.

              The subcommand will remove any nesting it finds if the option -full is specified.

                  tclsh> ::struct::list flatten {1 2 3 {4 5} {6 7} {{8 9}} 10}
                  1 2 3 4 5 6 7 {8 9} 10
                  tclsh> ::struct::list flatten -full {1 2 3 {4 5} {6 7} {{8 9}} 10}
                  1 2 3 4 5 6 7 8 9 10

       ::struct::list map sequence cmdprefix
              The  subcommand  takes  a  sequence  to  operate on and a command prefix (cmdprefix) specifying an
              operation, applies the command prefix to each element of  the  sequence  and  returns  a  sequence
              consisting of the results of that application.

              The command prefix will be evaluated with a single word appended to it. The evaluation takes place
              in the context of the caller of the subcommand.

                  tclsh> # squaring all elements in a list

                  tclsh> proc sqr {x} {expr {$x*$x}}
                  tclsh> ::struct::list map {1 2 3 4 5} sqr
                  1 4 9 16 25

                  tclsh> # Retrieving the second column from a matrix
                  tclsh> # given as list of lists.

                  tclsh> proc projection {n list} {::lindex $list $n}
                  tclsh> ::struct::list map {{a b c} {1 2 3} {d f g}} {projection 1}
                  b 2 f

       ::struct::list mapfor var sequence script
              The subcommand takes a sequence to operate on and a Tcl script, applies the script to each element
              of the sequence and returns a sequence consisting of the results of that application.

              The  script  will  be  evaluated  as  is,  and  has access to the current list element through the
              specified iteration variable var. The evaluation takes place in the context of the caller  of  the
              subcommand.

                  tclsh> # squaring all elements in a list

                  tclsh> ::struct::list mapfor x {1 2 3 4 5} {
                      expr {$x * $x}
                  }
                  1 4 9 16 25

                  tclsh> # Retrieving the second column from a matrix
                  tclsh> # given as list of lists.

                  tclsh> ::struct::list mapfor x {{a b c} {1 2 3} {d f g}} {
                      lindex $x 1
                  }
                  b 2 f

       ::struct::list filter sequence cmdprefix
              The  subcommand  takes  a  sequence  to  operate on and a command prefix (cmdprefix) specifying an
              operation, applies the command prefix to each element of  the  sequence  and  returns  a  sequence
              consisting  of  all elements of the sequence for which the command prefix returned true.  In other
              words, this command filters out all elements of  the  input  sequence  which  fail  the  test  the
              cmdprefix represents, and returns the remaining elements.

              The command prefix will be evaluated with a single word appended to it. The evaluation takes place
              in the context of the caller of the subcommand.

                  tclsh> # removing all odd numbers from the input

                  tclsh> proc even {x} {expr {($x % 2) == 0}}
                  tclsh> ::struct::list filter {1 2 3 4 5} even
                  2 4

       Note: The filter is a specialized application of fold where the result is extended with the current  item
       or not, depending on the result of the test.
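
       The relationship between the two subcommands can be sketched as follows. The procedure keepEven is a
       hypothetical fold step written for this example; it receives the accumulated list and the current item,
       in this order.

```tcl
package require struct::list

# Hypothetical fold step: extend the accumulated list with x only if x is even.
proc keepEven {acc x} {
    if {($x % 2) == 0} {lappend acc $x}
    return $acc
}

proc even {x} {expr {($x % 2) == 0}}

set viaFold   [::struct::list fold   {1 2 3 4 5} {} keepEven]
set viaFilter [::struct::list filter {1 2 3 4 5} even]

puts $viaFold     ;# 2 4
puts $viaFilter   ;# 2 4
```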

       ::struct::list filterfor var sequence expr
              The  subcommand takes a sequence to operate on and a Tcl expression (expr) specifying a condition,
              applies the condition to each element of the sequence and returns a  sequence  consisting  of  all
              elements  of  the  sequence  for which the expression returned true.  In other words, this command
              filters out all elements of the input sequence which fail the test the condition expr  represents,
              and returns the remaining elements.

              The  expression  will  be  evaluated as is, and has access to the current list element through the
              specified iteration variable var. The evaluation takes place in the context of the caller  of  the
              subcommand.

                  tclsh> # removing all odd numbers from the input

                  tclsh> ::struct::list filterfor x {1 2 3 4 5} {($x % 2) == 0}
                  2 4

       ::struct::list split sequence cmdprefix ?passVar failVar?
              This  is a variant of method filter, see above. Instead of returning just the elements passing the
              test we get lists of both passing and failing elements.

              If no variable names are specified then the result of the command will be a  list  containing  the
              list  of passing elements, and the list of failing elements, in this order. Otherwise the lists of
              passing and failing elements are stored into the two specified variables, and the result will be a
              list  containing  two numbers, the number of elements passing the test, and the number of elements
              failing, in this order.

              The interface to the test is the same as used by filter.
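
              Both calling conventions are sketched below, reusing the test from the filter example. The
              variable names both, counts, pass and fail are chosen for this example only.

```tcl
package require struct::list

proc even {x} {expr {($x % 2) == 0}}

# Without variable names: a list holding the passing and the failing elements.
set both [::struct::list split {1 2 3 4 5} even]
puts $both                     ;# {2 4} {1 3 5}

# With variable names: the lists are stored in the variables, the counts are returned.
set counts [::struct::list split {1 2 3 4 5} even pass fail]
puts $counts                   ;# 2 3
puts "pass=$pass fail=$fail"   ;# pass=2 4 fail=1 3 5
```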

       ::struct::list fold sequence initialvalue cmdprefix
              The subcommand takes a sequence to operate on, an arbitrary string initial  value  and  a  command
              prefix (cmdprefix) specifying an operation.

              The command prefix will be evaluated with two words appended to it. The second of these words will
              always be an element of the sequence. The evaluation takes place in the context of the  caller  of
              the subcommand.

              It  then  reduces  the  sequence  into  a single value through repeated application of the command
              prefix and returns that value. This reduction is done by

              1      Application of the command to the initial value and the first element of the list.

              2      Application of the command to the result of the last call and the  second  element  of  the
                     list.

              ...

              i      Application of the command to the result of the last call and the i'th element of the list.

              ...

              end    Application of the command to the result of the last call and the last element of the list.
                     The result of this call is returned as the result of the subcommand.

                  tclsh> # summing the elements in a list.
                  tclsh> proc + {a b} {expr {$a + $b}}
                  tclsh> ::struct::list fold {1 2 3 4 5} 0 +
                  15

       ::struct::list shift listvar
              The subcommand takes the list contained in the variable named by listvar and shifts  it  down  one
              element.  After the call listvar will contain a list containing the second to last elements of the
              input list. The first element of the list is returned as the result of the command.  Shifting  the
              empty list does nothing.
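
              A minimal sketch of using shift to consume a list as a queue. The variable names queue, first
              and rest are chosen for this example only.

```tcl
package require struct::list

set queue {a b c}
set first [::struct::list shift queue]
puts $first   ;# a
puts $queue   ;# b c

# Drain the remaining elements one at a time.
set rest {}
while {[llength $queue]} {
    lappend rest [::struct::list shift queue]
}
puts $rest    ;# b c
```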

       ::struct::list iota n
              The  subcommand  returns  a list containing the integer numbers in the range [0,n). The element at
              index i of the list contains the number i.

              For "n == 0" an empty list will be returned.
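
              A short sketch showing iota on its own and as a source of indices for mapfor. The variable
              names are chosen for this example only.

```tcl
package require struct::list

set indices [::struct::list iota 5]
puts $indices   ;# 0 1 2 3 4

# Squares of the numbers 0..4, using iota to generate the input.
set squares [::struct::list mapfor i $indices {expr {$i * $i}}]
puts $squares   ;# 0 1 4 9 16
```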

       ::struct::list equal a b
              The subcommand compares the two lists a and b for equality. In other words, they have to be of the
              same  length  and have to contain the same elements in the same order. If an element is a list the
              same definition of equality applies recursively.

              A boolean value will be returned as the result of the command.  This value will be true if the two
              lists are equal, and false otherwise.
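
              The difference between list equality and plain string equality can be sketched as follows. The
              two values below are the same nested list written with different whitespace; the variable
              names are chosen for this example only.

```tcl
package require struct::list

set a {x {y z} w}
set b {x {y   z} w}   ;# same nested list, extra whitespace inside the sublist

puts [string equal $a $b]   ;# 0, the strings differ

if {[::struct::list equal $a $b]} {
    puts "equal as lists"
}
if {![::struct::list equal {x y z} {x y}]} {
    puts "lists of different length are not equal"
}
```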

       ::struct::list repeat size element1 ?element2 element3...?
              The  subcommand  creates  a list of length "size * number of elements" by repeating size times the
              sequence of elements element1 element2 ....  size must be a positive integer, elementn can be  any
              Tcl  value.   Note that repeat 1 arg ...  is identical to list arg ..., though the arg is required
              with repeat.

              Examples:

                  tclsh> ::struct::list repeat 3 a
                  a a a
                  tclsh> ::struct::list repeat 3 [::struct::list repeat 3 0]
                  {0 0 0} {0 0 0} {0 0 0}
                  tclsh> ::struct::list repeat 3 a b c
                  a b c a b c a b c
                  tclsh> ::struct::list repeat 3 [::struct::list repeat 2 a] b c
                  {a a} b c {a a} b c {a a} b c

       ::struct::list repeatn value size...
              The subcommand creates a (nested) list containing the value in all positions. The exact  size  and
              degree  of  nesting  is  determined by the size arguments, all of which have to be integer numbers
              greater than or equal to zero.

              A single argument size which is a list of more than one element will be treated as  if  more  than
              one argument size was specified.

              If  only  one  argument  size  is present the returned list will not be nested, of length size and
              contain value in all positions.  If more than one size argument is present the returned list  will
              be nested, and of the length specified by the last size argument given to it. The elements of that
              list are defined as the result of repeatn for the same arguments, but with the  last  size  value
              removed.

              An empty list will be returned if no size arguments are present.

                  tclsh> ::struct::list repeatn  0 3 4
                  {0 0 0} {0 0 0} {0 0 0} {0 0 0}
                  tclsh> ::struct::list repeatn  0 {3 4}
                  {0 0 0} {0 0 0} {0 0 0} {0 0 0}
                  tclsh> ::struct::list repeatn  0 {3 4 5}
                  {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}}

       ::struct::list dbJoin ?-inner|-left|-right|-full? ?-keys varname? {keycol table}...
              The  method  performs  a  table  join according to relational algebra. The execution of any of the
              possible outer join operations is triggered by the presence of one of the options -left, -right, or
              -full.  If  none of these options is present a regular inner join will be performed. This can also
              be triggered by specifying -inner. The various possible join operations are explained in detail in
              section TABLE JOIN.

              If the option -keys is present its argument is the name of a variable to store the full list of
              found keys into. Depending on the exact nature of the input table and the join mode the output
              table may not contain all the keys by default. In such a case the caller can declare a variable
              for this information and then insert it into the output table on their own, as the caller will
              have more information about the placement than this command.

              What is left to explain is the format of the arguments.

              The  keycol arguments are the indices of the columns in the tables which contain the key values to
              use for the joining. Each argument applies to  the  table  following  immediately  after  it.  The
              columns  are  counted  from  0,  which  references the first column. The table associated with the
              column index has to have at least keycol+1 columns. An error will be thrown if there are fewer.

              The table arguments represent a table or matrix of rows and columns of values.  We  use  the  same
              representation  as  generated and consumed by the methods get rect and set rect of matrix objects.
              In other words, each argument is a list, representing the whole matrix.  Its  elements  are  lists
              too,  each  representing a single row of the matrix. The elements of the row-lists are the column
              values.

              The table resulting from the join operation is returned as the result of the command. We  use  the
              same representation as described above for the input tables.
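
              The argument format can be sketched with the two small tables used in section TABLE JOIN, where
              column 0 holds the key in both tables. The variable names are chosen for this example only; the
              order of the rows in the result is not relied upon here.

```tcl
package require struct::list

set left  {{0 foo} {1 snarf} {2 blue}}
set right {{0 bagel} {1 snatz} {3 driver}}

# Each table is preceded by the index of its key column.
set inner  [::struct::list dbJoin -inner 0 $left 0 $right]
set louter [::struct::list dbJoin -left  0 $left 0 $right]

foreach row $inner  {puts "inner: $row"}
foreach row $louter {puts "left:  $row"}
```

              The inner join keeps the rows for the keys 0 and 1 only; the left outer join additionally keeps
              the row for key 2, padded with empty values for the columns of the right table.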

       ::struct::list dbJoinKeyed ?-inner|-left|-right|-full? ?-keys varname? table...
              The  operations  performed  by  this  method  are the same as described above for dbJoin. The only
              difference is in the specification of the keys to use. Instead of using  column  indices  separate
              from  the table here the keys are provided within the table itself. The row elements in each table
              are not the lists of column values, but a two-element list where the second element is the regular
              list of column values and the first element is the key to use.
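
              A minimal sketch of the keyed input format, assuming tables whose rows consist of a key and a
              one-column list of values. The table contents and variable names are made up for this example,
              and no particular row order is assumed for the result.

```tcl
package require struct::list

# Each row is a two-element list: the key, then the list of column values.
set left  {{0 {foo}} {1 {snarf}} {2 {blue}}}
set right {{0 {bagel}} {1 {snatz}} {3 {driver}}}

set joined [::struct::list dbJoinKeyed -inner $left $right]

# Only the keys 0 and 1 occur in both tables.
foreach row $joined {puts $row}
```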

       ::struct::list swap listvar i j
              The  subcommand  exchanges  the elements at the indices i and j in the list stored in the variable
              named by listvar. The list is  modified  in  place,  and  also  returned  as  the  result  of  the
              subcommand.
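
              A short sketch showing that the variable is modified in place and that the new list is also
              returned. The variable names are chosen for this example only.

```tcl
package require struct::list

set letters {a b c d}
set result [::struct::list swap letters 0 3]

puts $result    ;# d b c a
puts $letters   ;# d b c a
```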

       ::struct::list firstperm list
              This subcommand returns the lexicographically first permutation of the input list.

       ::struct::list nextperm perm
              This subcommand accepts a permutation of a set of elements (provided by perm) and returns the next
              permutation in lexicographic sequence.

              The algorithm used here is by Donald E. Knuth, see section REFERENCES for details.

       ::struct::list permutations list
              This subcommand returns a list containing all permutations of  the  input  list  in  lexicographic
              order.

       ::struct::list foreachperm var list body
              This  subcommand  executes  the  script  body once for each permutation of the specified list. The
              permutations are visited in lexicographic order, and the variable var is set  to  the  permutation
              for which body is currently executed. The result of the loop command is the empty string.
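
              The four permutation subcommands can be sketched together on a three-element list. The variable
              names are chosen for this example only.

```tcl
package require struct::list

set first [::struct::list firstperm {c a b}]
puts $first   ;# a b c

set second [::struct::list nextperm $first]
puts $second  ;# a c b

set all [::struct::list permutations {a b c}]
puts $all     ;# {a b c} {a c b} {b a c} {b c a} {c a b} {c b a}

# Visit each permutation in turn and collect it as a word.
set words {}
::struct::list foreachperm p {a b c} {
    lappend words [join $p ""]
}
puts $words   ;# abc acb bac bca cab cba
```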

LONGEST COMMON SUBSEQUENCE AND FILE COMPARISON

       The  longestCommonSubsequence  subcommand  forms  the  core  of  a flexible system for doing differential
       comparisons of files, similar to the capability offered by the Unix command diff.  While  this  procedure
       is quite rapid for many tasks of file comparison, its performance degrades severely if sequence2 contains
       many equal elements (as, for instance, when using this procedure to compare two files, a quarter of whose
       lines are blank).  This drawback is intrinsic to the algorithm used (see section REFERENCES for details).

       One  approach  to  dealing  with  the  performance  problem  that  is  sometimes effective in practice is
       arbitrarily to exclude elements that appear more than a certain number of times.  This number is provided
       as  the  maxOccurs parameter.  If frequent lines are excluded in this manner, they will not appear in the
       common subsequence that is computed; the result will be the  longest  common  subsequence  of  infrequent
       elements.   The procedure longestCommonSubsequence2 implements this heuristic.  It functions as a wrapper
       around longestCommonSubsequence; it computes the longest common subsequence of infrequent  elements,  and
       then  subdivides  the  subsequences  that  lie between the matches to approximate the true longest common
       subsequence.
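
       The heuristic can be tried out as sketched below. The two lists stand in for the lines of two files in
       which the blank line is frequent; their contents, the variable names and the choice of 1 for maxOccurs
       are assumptions made for this example. Since the result is an approximation, the sketch does not rely on
       a particular alignment and only uses the documented guarantee that paired indices refer to equal
       elements.

```tcl
package require struct::list

# Two small "files" as lists of lines; the blank line occurs often.
set old {intro {} alpha {} beta  {} gamma}
set new {intro {} alpha {} delta {} gamma}

# In the first pass, ignore lines that occur more than once in $new.
::struct::list assign \
    [::struct::list longestCommonSubsequence2 $old $new 1] idx1 idx2

# Every index pair points at equal lines, whatever alignment was chosen.
foreach i $idx1 j $idx2 {
    puts [format "old %d = new %d: '%s'" $i $j [lindex $old $i]]
}
```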

TABLE JOIN

       This is an operation from relational algebra for relational databases.

       The easiest way to understand the regular inner join is that it creates the cartesian product of all  the
       tables  involved  first and then keeps only all those rows in the resulting table for which the values in
       the specified key columns are equal to each other.

       Implementing this description naively, i.e. exactly as described above, will generate a huge intermediate
       result. To avoid this the cartesian product and the filtering of rows are done at the same time. What is
       required is a fast way to determine if a key is present in a table. In a true database this is done
       through indices. Here we use arrays internally.

       An  outer  join is an extension of the inner join for two tables. There are three variants of outer joins,
       called left, right, and full outer joins. Their result always contains all rows from an  inner  join  and
       then some additional rows.

       [1]    For the left outer join the additional rows are all rows from the left table for which there is no
              key in the right table. They are joined to an empty row of the right table to fit  them  into  the
              result.

       [2]    For  the right outer join the additional rows are all rows from the right table for which there is
              no key in the left table. They are joined to an empty row of the left table to fit them  into  the
              result.

       [3]    The  full  outer join combines both left and right outer join. In other words, the additional rows
              are as defined for left outer join, and right outer join, combined.

       We extend all the joins from two to n tables (n > 2) by executing

                  (...((table1 join table2) join table3) ...) join tableN

       Examples for all the joins:

                  Inner Join

                  {0 foo}              {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} inner join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}             {3 driver}

                  Left Outer Join

                  {0 foo}                   {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} left outer join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}                  {3 driver}   {2 blue  {} {}}

                  Right Outer Join

                  {0 foo}                    {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} right outer join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}                   {3 driver}   {{} {}   3 driver}

                  Full Outer Join

                  {0 foo}                   {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} full outer join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}                  {3 driver}   {2 blue  {} {}}
                                                         {{} {}   3 driver}

REFERENCES

       [1]    J. W. Hunt and M. D. McIlroy, "An algorithm for differential file comparison,"  Comp.  Sci.  Tech.
              Rep. #41, Bell Telephone Laboratories (1976). Available on the Web at the second author's personal
              site: http://www.cs.dartmouth.edu/~doug/

       [2]    Donald E. Knuth, "Fascicle 2b of 'The Art of Computer Programming' volume 4". Available on the Web
              at the author's personal site: http://www-cs-faculty.stanford.edu/~knuth/fasc2b.ps.gz.

BUGS, IDEAS, FEEDBACK

       This  document,  and  the package it describes, will undoubtedly contain bugs and other problems.  Please
       report such in the category struct :: list of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].
       Please also report any ideas for enhancements you may have for either package and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

       Note  further  that  attachments  are strongly preferred over inlined patches. Attachments can be made by
       going to the Edit form of the ticket immediately after its creation, and then using the left-most  button
       in the secondary navigation bar.

KEYWORDS

       Fisher-Yates, assign, common, comparison, diff, differential, equal, equality, filter, first permutation,
       flatten, folding, full outer join, generate permutations,  inner  join,  join,  left  outer  join,  list,
       longest   common  subsequence,  map,  next  permutation,  outer  join,  permutation,  reduce,  repeating,
       repetition, reshuffle, reverse, right outer join, shuffle, subsequence, swapping

CATEGORY

       Data structures

COPYRIGHT

       Copyright (c) 2003-2005 by Kevin B. Kenny. All rights reserved
       Copyright (c) 2003-2012 Andreas Kupries <andreas_kupries@users.sourceforge.net>