Provided by: tcllib_1.17-dfsg-1_all

NAME

       struct::list - Procedures for manipulating lists

SYNOPSIS

       package require Tcl  8.4

       package require struct::list  ?1.8.3?

       ::struct::list longestCommonSubsequence sequence1 sequence2 ?maxOccurs?

       ::struct::list longestCommonSubsequence2 sequence1 sequence2 ?maxOccurs?

       ::struct::list lcsInvert lcsData len1 len2

       ::struct::list lcsInvert2 lcs1 lcs2 len1 len2

       ::struct::list lcsInvertMerge lcsData len1 len2

       ::struct::list lcsInvertMerge2 lcs1 lcs2 len1 len2

       ::struct::list reverse sequence

       ::struct::list shuffle list

       ::struct::list assign sequence varname ?varname?...

       ::struct::list flatten ?-full? ?--? sequence

       ::struct::list map sequence cmdprefix

       ::struct::list mapfor var sequence script

       ::struct::list filter sequence cmdprefix

       ::struct::list filterfor var sequence expr

       ::struct::list split sequence cmdprefix ?passVar failVar?

       ::struct::list fold sequence initialvalue cmdprefix

       ::struct::list shift listvar

       ::struct::list iota n

       ::struct::list equal a b

       ::struct::list repeat size element1 ?element2 element3...?

       ::struct::list repeatn value size...

       ::struct::list dbJoin ?-inner|-left|-right|-full? ?-keys varname? {keycol table}...

       ::struct::list dbJoinKeyed ?-inner|-left|-right|-full? ?-keys varname? table...

       ::struct::list swap listvar i j

       ::struct::list firstperm list

       ::struct::list nextperm perm

       ::struct::list permutations list

       ::struct::list foreachperm var list body

_________________________________________________________________________________________________

DESCRIPTION

       The  ::struct::list  namespace  contains several useful commands for processing Tcl lists.
       Generally speaking, they implement algorithms more complex or specialized  than  the  ones
       provided by Tcl itself.

       It  exports  only  a  single command, struct::list. All functionality provided here can be
       reached through a subcommand of this command.

COMMANDS

       ::struct::list longestCommonSubsequence sequence1 sequence2 ?maxOccurs?
              Returns the longest common subsequence of elements in the two lists  sequence1  and
              sequence2.  If  the  maxOccurs  parameter  is  provided,  the common subsequence is
              restricted to elements that occur no more than maxOccurs times in sequence2.

              The return value is a list of two lists of equal length. The first  sublist  is  of
              indices  into sequence1, and the second sublist is of indices into sequence2.  Each
              corresponding pair of indices corresponds to equal elements in the  sequences;  the
              sequence returned is the longest possible.
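
              A minimal sketch of consuming this result, assuming tcllib is installed. The
              variable names seq1, seq2, idx1, idx2 and common are illustrative only. It walks
              the two index lists in parallel and collects the elements they select:

```tcl
package require struct::list

set seq1 {a b r a c a d a b r a}
set seq2 {b r i c a b r a c}

# The result is a pair of index lists, one per input sequence.
foreach {idx1 idx2} [::struct::list longestCommonSubsequence $seq1 $seq2] break

# Walk the index pairs in parallel; each pair must select equal elements.
set common {}
foreach i $idx1 j $idx2 {
    if {![string equal [lindex $seq1 $i] [lindex $seq2 $j]]} {
        error "index pair ($i,$j) does not select equal elements"
    }
    lappend common [lindex $seq1 $i]
}
puts "common subsequence: $common"
```

              Several subsequences of maximal length may exist, so the sketch checks the
              documented properties rather than one particular set of indices.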

       ::struct::list longestCommonSubsequence2 sequence1 sequence2 ?maxOccurs?
              Returns an approximation to the longest common subsequence of elements in the two
              lists sequence1 and sequence2. If the maxOccurs parameter is omitted, the
              subsequence computed is exactly the longest common subsequence; otherwise, the
              longest common subsequence is approximated by first determining the longest common
              subsequence of only those elements that occur no more than maxOccurs times in
              sequence2, and then using that result to align the two lists, determining the
              longest common subsequences of the sublists between each pair of aligned elements.

              As  with longestCommonSubsequence, the return value is a list of two lists of equal
              length.  The first sublist is of indices into sequence1, and the second sublist  is
              of indices into sequence2.  Each corresponding pair of indices corresponds to equal
              elements  in  the  sequences.   The  sequence  approximates  the   longest   common
              subsequence.

       ::struct::list lcsInvert lcsData len1 len2
              This command takes a description of a longest common subsequence (lcsData), inverts
              it, and returns the result. Inversion here means that, whereas the input describes
              which parts of the two sequences are identical, the output describes the
              differences instead.

              For the result to be fully defined the lengths of the two sequences have to be
              known. They are specified through len1 and len2.

              The  result  is  a  list  where each element describes one chunk of the differences
              between the two sequences. This description is a list containing three elements,  a
              type  and  two  pairs of indices into sequence1 and sequence2 respectively, in this
              order.  The type can be one of three values:

              added  Describes an addition. I.e. items which are  missing  in  sequence1  can  be
                     found  in sequence2.  The pair of indices into sequence1 describes where the
                     added range had been expected to be in sequence1. The first index refers  to
                     the  item  just  before  the added range, and the second index refers to the
                     item just after the  added  range.   The  pair  of  indices  into  sequence2
                     describes  the  range  of  items which has been added to it. The first index
                     refers to the first item in the range, and the second index  refers  to  the
                     last item in the range.

              deleted
                     Describes  a  deletion.  I.e.  items which are in sequence1 are missing from
                     sequence2.  The pair of indices into sequence1 describes the range of  items
                     which  has  been  deleted.  The  first index refers to the first item in the
                     range, and the second index refers to the last item in the range.  The  pair
                     of  indices  into  sequence2  describes  where  the  deleted  range had been
                     expected to be in sequence2. The first index refers to the item just  before
                     the  deleted  range,  and the second index refers to the item just after the
                     deleted range.

              changed
                     Describes a general change. I.e. a range of items in sequence1 has been
                     replaced  by  a  different range of items in sequence2.  The pair of indices
                     into sequence1 describes the range of items which  has  been  replaced.  The
                     first  index  refers  to  the  first item in the range, and the second index
                     refers to the last item in the range.  The pair of  indices  into  sequence2
                     describes  the  range of items replacing the original range. Again the first
                     index refers to the first item in the range, and the second index refers  to
                     the last item in the range.

                  sequence 1 = {a b r a c a d a b r a}
                  lcs 1      =   {1 2   4 5     8 9 10}
                  lcs 2      =   {0 1   3 4     5 6 7}
                  sequence 2 =   {b r i c a     b r a c}

                  Inversion  = {{deleted  {0  0} {-1 0}}
                                {changed  {3  3}  {2 2}}
                                {deleted  {6  7}  {4 5}}
                                {added   {10 11}  {8 8}}}

              Notes:

              •      An index of -1 in a deleted chunk refers to just before the first element of
                     the second sequence.

              •      Also an index equal to the length of the first sequence in  an  added  chunk
                     refers to just behind the end of the sequence.
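
              The inversion shown above can be reproduced with a short script. This is a sketch
              that assumes tcllib is installed; the variable names lcsData and chunks and the
              output layout are illustrative only:

```tcl
package require struct::list

# Index lists copied from the example above (lcs 1 and lcs 2).
set lcsData {{1 2 4 5 8 9 10} {0 1 3 4 5 6 7}}

# 11 and 9 are the lengths of sequence 1 and sequence 2.
set chunks [::struct::list lcsInvert $lcsData 11 9]

# Each chunk is a three-element list: type, range in sequence 1,
# range in sequence 2.
foreach chunk $chunks {
    foreach {type range1 range2} $chunk break
    puts [format "%-8s seq1 {%s}  seq2 {%s}" $type $range1 $range2]
}
```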

       ::struct::list lcsInvert2 lcs1 lcs2 len1 len2
              Similar  to  lcsInvert.  Instead  of  directly  taking  the  result  of  a  call to
              longestCommonSubsequence this subcommand expects the indices for the two  sequences
              in two separate lists.

       ::struct::list lcsInvertMerge lcsData len1 len2
              Similar  to  lcsInvert.  It returns essentially the same structure as that command,
              except that it may contain chunks of type unchanged too.

              These new chunks describe the parts which are unchanged between the two  sequences.
              This means that the result of this command describes both the changed and unchanged
              parts of the two sequences in one structure.

                  sequence 1 = {a b r a c a d a b r a}
                  lcs 1      =   {1 2   4 5     8 9 10}
                  lcs 2      =   {0 1   3 4     5 6 7}
                  sequence 2 =   {b r i c a     b r a c}

                  Inversion/Merge  = {{deleted   {0  0} {-1 0}}
                                      {unchanged {1  2}  {0 1}}
                                      {changed   {3  3}  {2 2}}
                                      {unchanged {4  5}  {3 4}}
                                      {deleted   {6  7}  {4 5}}
                                      {unchanged {8 10}  {5 7}}
                                      {added    {10 11}  {8 8}}}

       ::struct::list lcsInvertMerge2 lcs1 lcs2 len1 len2
              Similar to lcsInvertMerge. Instead of directly taking  the  result  of  a  call  to
              longestCommonSubsequence  this subcommand expects the indices for the two sequences
              in two separate lists.

       ::struct::list reverse sequence
              The subcommand takes a single sequence as  argument  and  returns  a  new  sequence
              containing the elements of the input sequence in reverse order.
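
              A brief illustration, assuming tcllib is installed (the variable names are
              illustrative). Only the top level is reversed; nested lists keep their own order:

```tcl
package require struct::list

set seq {a b {c d} e}
set rev [::struct::list reverse $seq]
puts $rev   ;# e {c d} b a
```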

       ::struct::list shuffle list
              The  subcommand  takes  a list and returns a copy of that list with the elements it
              contains in random order. Every possible ordering of elements is equally likely  to
              be generated. The Fisher-Yates shuffling algorithm is used internally.
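
              Because the order is random, a sketch can only demonstrate the invariants: the
              result holds the same elements as the input, and the input itself is not modified.
              The variable names deck and mixed are illustrative:

```tcl
package require struct::list

set deck  {1 2 3 4 5 6 7 8}
set mixed [::struct::list shuffle $deck]

puts "shuffled: $mixed"                    ;# order differs from run to run
puts "sorted:   [lsort -integer $mixed]"   ;# same elements as the input
puts "input:    $deck"                     ;# unchanged
```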

       ::struct::list assign sequence varname ?varname?...
              The  subcommand  assigns  the  first n elements of the input sequence to the one or
              more variables whose names were listed after the sequence, where n is the number of
              specified variables.

              If  there  are more variables specified than there are elements in the sequence the
              empty string will be assigned to the superfluous variables.

              If there are more elements in the sequence than variable names specified the
              subcommand returns a list containing the unassigned elements. Otherwise an empty
              list is returned.

                  tclsh> ::struct::list assign {a b c d e} foo bar
                  c d e
                  tclsh> set foo
                  a
                  tclsh> set bar
                  b

       ::struct::list flatten ?-full? ?--? sequence
              The subcommand takes a single sequence and returns a new sequence where  one  level
              of nesting was removed from the input sequence. In other words, the sublists in the
              input sequence are replaced by their elements.

              The subcommand will remove any nesting it finds if the option -full is specified.

                  tclsh> ::struct::list flatten {1 2 3 {4 5} {6 7} {{8 9}} 10}
                  1 2 3 4 5 6 7 {8 9} 10
                  tclsh> ::struct::list flatten -full {1 2 3 {4 5} {6 7} {{8 9}} 10}
                  1 2 3 4 5 6 7 8 9 10

       ::struct::list map sequence cmdprefix
              The subcommand takes a sequence to operate on  and  a  command  prefix  (cmdprefix)
              specifying an operation, applies the command prefix to each element of the sequence
              and returns a sequence consisting of the results of that application.

              The command prefix will be evaluated  with  a  single  word  appended  to  it.  The
              evaluation takes place in the context of the caller of the subcommand.

                  tclsh> # squaring all elements in a list

                  tclsh> proc sqr {x} {expr {$x*$x}}
                  tclsh> ::struct::list map {1 2 3 4 5} sqr
                  1 4 9 16 25

                  tclsh> # Retrieving the second column from a matrix
                  tclsh> # given as list of lists.

                  tclsh> proc projection {n list} {::lindex $list $n}
                  tclsh> ::struct::list map {{a b c} {1 2 3} {d f g}} {projection 1}
                  b 2 f

       ::struct::list mapfor var sequence script
              The  subcommand takes a sequence to operate on and a tcl script, applies the script
              to each element of the sequence and returns a sequence consisting of the results of
              that application.

              The  script  will  be  evaluated  as is, and has access to the current list element
              through the specified iteration variable var. The evaluation  takes  place  in  the
              context of the caller of the subcommand.

                  tclsh> # squaring all elements in a list

                  tclsh> ::struct::list mapfor x {1 2 3 4 5} {
                      expr {$x * $x}
                  }
                  1 4 9 16 25

                  tclsh> # Retrieving the second column from a matrix
                  tclsh> # given as list of lists.

                  tclsh> ::struct::list mapfor x {{a b c} {1 2 3} {d f g}} {
                      lindex $x 1
                  }
                  b 2 f

       ::struct::list filter sequence cmdprefix
              The  subcommand  takes  a  sequence  to operate on and a command prefix (cmdprefix)
              specifying an operation, applies the command prefix to each element of the sequence
              and  returns  a  sequence  consisting of all elements of the sequence for which the
              command prefix returned true.   In  other  words,  this  command  filters  out  all
              elements  of  the  input sequence which fail the test the cmdprefix represents, and
              returns the remaining elements.

              The command prefix will be evaluated  with  a  single  word  appended  to  it.  The
              evaluation takes place in the context of the caller of the subcommand.

                  tclsh> # removing all odd numbers from the input

                  tclsh> proc even {x} {expr {($x % 2) == 0}}
                  tclsh> ::struct::list filter {1 2 3 4 5} even
                  2 4

       Note: The filter is a specialized application of fold where the result is extended with
       the current item or not, depending on the result of the test.
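
       The relationship can be sketched as follows, assuming tcllib is installed. The helper
       keepEven is hypothetical and not part of the package; it receives the accumulated result
       and the current item, as fold (described further below) passes them:

```tcl
package require struct::list

# Hypothetical helper: extend the accumulator only with even items.
proc keepEven {acc x} {
    if {($x % 2) == 0} {lappend acc $x}
    return $acc
}

# Start from an empty list and fold the test over the sequence.
set evens [::struct::list fold {1 2 3 4 5} {} keepEven]
puts $evens   ;# 2 4
```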

       ::struct::list filterfor var sequence expr
              The subcommand takes  a  sequence  to  operate  on  and  a  tcl  expression  (expr)
              specifying a condition, applies the condition to each element of the sequence and
              returns a sequence consisting of  all  elements  of  the  sequence  for  which  the
              expression returned true.  In other words, this command filters out all elements of
              the input sequence which fail the test the condition expr represents,  and  returns
              the remaining elements.

              The  expression will be evaluated as is, and has access to the current list element
              through the specified iteration variable var. The evaluation  takes  place  in  the
              context of the caller of the subcommand.

                  tclsh> # removing all odd numbers from the input

                  tclsh> ::struct::list filterfor x {1 2 3 4 5} {($x % 2) == 0}
                  2 4

       ::struct::list split sequence cmdprefix ?passVar failVar?
              This  is  a  variant  of  method  filter,  see above. Instead of returning just the
              elements passing the test we get lists of both passing and failing elements.

              If no variable names are specified then the result of the command will  be  a  list
              containing  the list of passing elements, and the list of failing elements, in this
              order. Otherwise the lists of passing and failing elements are stored into the  two
              specified  variables,  and  the  result  will be a list containing two numbers, the
              number of elements passing the test, and the number of elements  failing,  in  this
              order.

              The interface to the test is the same as used by filter.
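
              Both calling conventions can be sketched as follows, assuming tcllib is installed.
              The test procedure even is the one from the filter example; the variable names
              pass and fail are illustrative:

```tcl
package require struct::list

proc even {x} {expr {($x % 2) == 0}}

# Without variable names the result is {passing failing}.
set both [::struct::list split {1 2 3 4 5} even]
puts $both                 ;# {2 4} {1 3 5}

# With variable names the lists are stored and the counts are returned.
set counts [::struct::list split {1 2 3 4 5} even pass fail]
puts $counts               ;# 2 3
puts "$pass | $fail"       ;# 2 4 | 1 3 5
```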

       ::struct::list fold sequence initialvalue cmdprefix
              The  subcommand  takes  a sequence to operate on, an arbitrary string initial value
              and a command prefix (cmdprefix) specifying an operation.

              The command prefix will be evaluated with two words appended to it. The  second  of
              these  words  will always be an element of the sequence. The evaluation takes place
              in the context of the caller of the subcommand.

              It then reduces the sequence into a single value through  repeated  application  of
              the command prefix and returns that value. This reduction is done by

              1      Application of the command to the initial value and the first element of the
                     list.

              2      Application of the command to the result of the last  call  and  the  second
                     element of the list.

              ...

              i      Application  of  the  command  to  the  result of the last call and the i'th
                     element of the list.

              ...

              end    Application of the command to the result of  the  last  call  and  the  last
                     element  of  the  list. The result of this call is returned as the result of
                     the subcommand.

                  tclsh> # summing the elements in a list.
                  tclsh> proc + {a b} {expr {$a + $b}}
                  tclsh> ::struct::list fold {1 2 3 4 5} 0 +
                  15

       ::struct::list shift listvar
              The subcommand takes the list contained in the variable named by listvar and shifts
              it  down  one  element.   After the call listvar will contain a list containing the
              second to last elements of the input list. The first element of the list is returned
              as the result of the command. Shifting the empty list does nothing.
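
              A short sketch of using shift to consume a list from the front, assuming tcllib is
              installed (the variable names queue and head are illustrative):

```tcl
package require struct::list

set queue {a b c}
set head  [::struct::list shift queue]

puts "head: $head"    ;# head: a
puts "rest: $queue"   ;# rest: b c
```

              Note that the subcommand takes the name of the variable, not its value.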

       ::struct::list iota n
              The  subcommand  returns  a list containing the integer numbers in the range [0,n).
              The element at index i of the list contains the number i.

              For "n == 0" an empty list will be returned.
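
              For illustration, assuming tcllib is installed: iota is convenient for iterating
              over a list together with its indices. The variable names are illustrative:

```tcl
package require struct::list

set indices [::struct::list iota 5]
puts $indices   ;# 0 1 2 3 4

# Pair every element with its index.
set names {x y z}
foreach i [::struct::list iota [llength $names]] name $names {
    puts "$i: $name"
}
```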

       ::struct::list equal a b
              The subcommand compares the two lists a and b for equality. In  other  words,  they
              have  to  be  of  the same length and have to contain the same elements in the same
              order. If an element is a list the same definition of equality applies recursively.

              A boolean value will be returned as the result of the command. This value will be
              true if the two lists are equal, and false otherwise.
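
              A few cases, sketched under the assumption that tcllib is installed. The variable
              names same, order and length are illustrative:

```tcl
package require struct::list

# Same length, same elements, nested lists compared recursively: true.
set same   [::struct::list equal {a b {c d}} {a b {c d}}]

# Same elements in a different order: false.
set order  [::struct::list equal {a b} {b a}]

# Different lengths: false.
set length [::struct::list equal {a b} {a b c}]

puts "$same $order $length"
```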

       ::struct::list repeat size element1 ?element2 element3...?
              The  subcommand  creates  a list of length "size * number of elements" by repeating
              size times the sequence of elements element1 element2 ....  size must be a positive
              integer,  elementn  can be any Tcl value.  Note that repeat 1 arg ...  is identical
              to list arg ..., though the arg is required with repeat.

              Examples:

                  tclsh> ::struct::list repeat 3 a
                  a a a
                  tclsh> ::struct::list repeat 3 [::struct::list repeat 3 0]
                  {0 0 0} {0 0 0} {0 0 0}
                  tclsh> ::struct::list repeat 3 a b c
                  a b c a b c a b c
                  tclsh> ::struct::list repeat 3 [::struct::list repeat 2 a] b c
                  {a a} b c {a a} b c {a a} b c

       ::struct::list repeatn value size...
              The subcommand creates a (nested) list containing the value in all  positions.  The
              exact  size and degree of nesting is determined by the size arguments, all of which
              have to be integer numbers greater than or equal to zero.

              A single argument size which is a list of more than one element will be treated as
              if more than one size argument was specified.

              If  only  one  argument  size  is  present the returned list will not be nested, of
              length size and contain value in all positions.  If more than one size argument  is
              present  the  returned list will be nested, and of the length specified by the last
              size argument given to it. The elements of that list are defined as the  result  of
              repeatn for the same arguments, but with the last size value removed.

              An empty list will be returned if no size arguments are present.

                  tclsh> ::struct::list repeatn  0 3 4
                  {0 0 0} {0 0 0} {0 0 0} {0 0 0}
                  tclsh> ::struct::list repeatn  0 {3 4}
                  {0 0 0} {0 0 0} {0 0 0} {0 0 0}
                  tclsh> ::struct::list repeatn  0 {3 4 5}
                  {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}} {{0 0 0} {0 0 0} {0 0 0} {0 0 0}}

       ::struct::list dbJoin ?-inner|-left|-right|-full? ?-keys varname? {keycol table}...
              The  method performs a table join according to relational algebra. The execution of
              any of the possible outer join operations is triggered by the presence of either
              option -left, -right, or -full. If none of these options is present a regular inner
              join will be performed. This can  also  be  triggered  by  specifying  -inner.  The
              various possible join operations are explained in detail in section TABLE JOIN.

              If the option -keys is present its argument is the name of a variable to store the full
              list of found keys into. Depending on the exact nature of the input table  and  the
              join  mode the output table may not contain all the keys by default. In such a case
              the caller can declare a variable for this information and then insert it into  the
              output table on its own, as it will have more information about the placement than
              this command.

              What is left to explain is the format of the arguments.

              The keycol arguments are the indices of the columns in the tables which contain the
              key  values  to  use  for the joining. Each argument applies to the table following
              immediately after it. The columns are counted from 0, which  references  the  first
              column.  The  table  associated with the column index has to have at least keycol+1
              columns. An error will be thrown if there are fewer.

              The table arguments represent a table or matrix of rows and columns of  values.  We
              use  the  same representation as generated and consumed by the methods get rect and
              set rect of matrix objects. In other words, each argument is a  list,  representing
              the whole matrix. Its elements are lists too, each representing a single row of
              the matrix. The elements of the row-lists are the column values.

              The table resulting from the join operation  is  returned  as  the  result  of  the
              command. We use the same representation as described above for the input tables.

       ::struct::list dbJoinKeyed ?-inner|-left|-right|-full? ?-keys varname? table...
              The operations performed by this method are the same as described above for dbJoin.
              The only difference is in the specification of the keys to use.  Instead  of  using
              column  indices separate from the table here the keys are provided within the table
              itself. The row elements in each table are not the lists of column  values,  but  a
              two-element  list where the second element is the regular list of column values and
              the first element is the key to use.

       ::struct::list swap listvar i j
              The subcommand exchanges the elements at the indices i and j in the list stored  in
              the  variable named by listvar. The list is modified in place, and also returned as
              the result of the subcommand.
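
              A brief sketch, assuming tcllib is installed (the variable names letters and
              result are illustrative). As with shift, the variable name is passed, not the
              list value:

```tcl
package require struct::list

set letters {a b c d}
set result  [::struct::list swap letters 0 3]

puts $result    ;# d b c a
puts $letters   ;# d b c a (the variable was modified in place)
```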

       ::struct::list firstperm list
              This subcommand returns the lexicographically first permutation of the input list.

       ::struct::list nextperm perm
              This subcommand accepts a permutation of a set of elements (provided by  perm)  and
              returns the next permutation in lexicographic sequence.

              The algorithm used here is by Donald E. Knuth, see section REFERENCES for details.

       ::struct::list permutations list
              This  subcommand  returns  a  list containing all permutations of the input list in
              lexicographic order.

       ::struct::list foreachperm var list body
              This subcommand executes the script body once for each permutation of the specified
              list.  The permutations are visited in lexicographic order, and the variable var is
              set to the permutation for which body is currently executed. The result of the loop
              command is the empty string.
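
              The four permutation subcommands can be tried together as sketched below, assuming
              tcllib is installed. The variable names first, next, all, count and p are
              illustrative:

```tcl
package require struct::list

set first [::struct::list firstperm {c a b}]
puts $first   ;# a b c

set next [::struct::list nextperm {a b c}]
puts $next    ;# a c b

set all [::struct::list permutations {a b c}]
puts $all     ;# {a b c} {a c b} {b a c} {b c a} {c a b} {c b a}

# Visit each permutation without building the whole list first.
set count 0
::struct::list foreachperm p {a b c} {
    incr count
}
puts $count   ;# 6
```

              For longer lists foreachperm avoids holding all n! permutations in memory at once.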

LONGEST COMMON SUBSEQUENCE AND FILE COMPARISON

       The  longestCommonSubsequence  subcommand  forms  the  core of a flexible system for doing
       differential comparisons of files, similar to the capability offered by the  Unix  command
       diff.   While  this  procedure  is  quite  rapid  for  many  tasks of file comparison, its
       performance degrades severely if sequence2 contains many equal elements (as, for instance,
       when using this procedure to compare two files, a quarter of whose lines are blank). This
       drawback is intrinsic to the algorithm used (see section REFERENCES for details).

       One approach to dealing with the  performance  problem  that  is  sometimes  effective  in
       practice  is  arbitrarily  to  exclude  elements that appear more than a certain number of
       times.  This number is provided  as  the  maxOccurs  parameter.   If  frequent  lines  are
       excluded  in this manner, they will not appear in the common subsequence that is computed;
       the result will be the longest common subsequence of infrequent elements.   The  procedure
       longestCommonSubsequence2  implements  this  heuristic.   It functions as a wrapper around
       longestCommonSubsequence;  it  computes  the  longest  common  subsequence  of  infrequent
       elements, and then subdivides the subsequences that lie between the matches to approximate
       the true longest common subsequence.
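
       A minimal sketch of such a comparison, assuming tcllib is installed. The two "files" are
       given directly as lists of lines, with {} standing for a blank line; the variable names
       and the maxOccurs value of 2 are arbitrary choices for illustration:

```tcl
package require struct::list

set old {alpha {} beta {} gamma {} delta}
set new {alpha {} {} gamma {} epsilon delta}

# Blank lines occur three times in $new, more often than maxOccurs, so the
# first pass aligns the two lists on the infrequent lines only.
set lcs [::struct::list longestCommonSubsequence2 $old $new 2]
foreach {idx1 idx2} $lcs break

# Every index pair must select equal lines.
foreach i $idx1 j $idx2 {
    if {![string equal [lindex $old $i] [lindex $new $j]]} {
        error "index pair ($i,$j) does not select equal lines"
    }
}

# Turn the matches into diff-like chunks.
set chunks [::struct::list lcsInvert $lcs [llength $old] [llength $new]]
foreach chunk $chunks {
    foreach {type range1 range2} $chunk break
    puts "$type: old {$range1}, new {$range2}"
}
```

       The chunk types and index ranges printed here follow the conventions described for
       lcsInvert.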

TABLE JOIN

       This is an operation from relational algebra for relational databases.

       The easiest way to understand the regular inner join is  that  it  creates  the  cartesian
       product  of  all  the  tables  involved  first  and  then keeps only all those rows in the
       resulting table for which the values in the specified key columns are equal to each other.

       Implementing this description naively, i.e.  as  described  above  will  generate  a  huge
       intermediate result. To avoid this the cartesian product and the filtering of rows are done
       at the same time. What is required is a fast way to determine if a key  is  present  in  a
       table. In a true database this is done through indices. Here we use arrays internally.

       An  outer  join is an extension of the inner join for two tables. There are three variants
       of outer joins, called left, right, and full outer joins. Their result always contains all
       rows from an inner join and then some additional rows.

       [1]    For  the  left  outer join the additional rows are all rows from the left table for
              which there is no key in the right table. They are joined to an empty  row  of  the
              right table to fit them into the result.

       [2]    For  the right outer join the additional rows are all rows from the right table for
              which there is no key in the left table. They are joined to an  empty  row  of  the
              left table to fit them into the result.

       [3]    The  full  outer  join combines both left and right outer join. In other words, the
              additional rows are as defined for left outer join, and right outer join, combined.

       We extend all the joins from two to n tables (n > 2) by executing

                  (...((table1 join table2) join table3) ...) join tableN

       Examples for all the joins:

                  Inner Join

                  {0 foo}              {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} inner join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}             {3 driver}

                  Left Outer Join

                  {0 foo}                   {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} left outer join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}                  {3 driver}   {2 blue  {} {}}

                  Right Outer Join

                  {0 foo}                    {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} right outer join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}                   {3 driver}   {{} {}   3 driver}

                  Full Outer Join

                  {0 foo}                   {0 bagel}    {0 foo   0 bagel}
                  {1 snarf} full outer join {1 snatz}  = {1 snarf 1 snatz}
                  {2 blue}                  {3 driver}   {2 blue  {} {}}
                                                         {{} {}   3 driver}
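
       The tables above can be fed to dbJoin as sketched below, assuming tcllib is installed.
       The variable names left, right, inner and louter are illustrative; column 0 is the key
       column in both tables:

```tcl
package require struct::list

set left  {{0 foo} {1 snarf} {2 blue}}
set right {{0 bagel} {1 snatz} {3 driver}}

# Each table argument is preceded by the index of its key column.
set inner  [::struct::list dbJoin -inner 0 $left 0 $right]
set louter [::struct::list dbJoin -left  0 $left 0 $right]

puts "inner join:"
foreach row $inner  {puts "  $row"}

puts "left outer join:"
foreach row $louter {puts "  $row"}
```

       The right and full outer joins are requested in the same way through the options -right
       and -full.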

REFERENCES

       [1]    J. W. Hunt and M. D. McIlroy, "An  algorithm  for  differential  file  comparison,"
              Comp. Sci. Tech. Rep. #41, Bell Telephone Laboratories (1976). Available on the Web
              at the second author's personal site: http://www.cs.dartmouth.edu/~doug/

       [2]    Donald E. Knuth, "Fascicle 2b of 'The Art of Computer Programming' volume 4".
              Available on the Web at the author's personal site:
              http://www-cs-faculty.stanford.edu/~knuth/fasc2b.ps.gz.

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes,  will  undoubtedly  contain  bugs  and  other
       problems.   Please  report  such  in  the  category  struct :: list of the Tcllib Trackers
       [http://core.tcl.tk/tcllib/reportlist].  Please also report any ideas for enhancements you
       may have for either package and/or documentation.

KEYWORDS

       Fisher-Yates,  assign,  common,  comparison,  diff, differential, equal, equality, filter,
       first permutation, flatten, folding, full outer join, generate permutations,  inner  join,
       join,  left  outer  join,  list,  longest common subsequence, map, next permutation, outer
       join, permutation, reduce, repeating, repetition, reshuffle, reverse,  right  outer  join,
       shuffle, subsequence, swapping

CATEGORY

       Data structures

COPYRIGHT

       Copyright (c) 2003-2005 by Kevin B. Kenny. All rights reserved
       Copyright (c) 2003-2012 Andreas Kupries <andreas_kupries@users.sourceforge.net>