lunar (1) comm.1posix.gz

Provided by: manpages-posix_2017a-2_all bug

PROLOG

       This  manual  page  is part of the POSIX Programmer's Manual.  The Linux implementation of
       this interface may differ (consult the corresponding Linux  manual  page  for  details  of
       Linux behavior), or the interface may not be implemented on Linux.

NAME

       comm — select or reject lines common to two files

SYNOPSIS

       comm [-123] file1 file2

DESCRIPTION

       The  comm  utility  shall  read  file1  and  file2, which should be ordered in the current
       collating sequence, and produce three text columns as output: lines only in  file1,  lines
       only in file2, and lines in both files.

       If  the  lines  in  both  files are not ordered according to the collating sequence of the
       current locale, the results are unspecified.

       If the collating sequence of the current locale does not have  a  total  ordering  of  all
       characters  (see  the  Base Definitions volume of POSIX.1‐2017, Section 7.3.2, LC_COLLATE)
       and any lines from the input files collate equally but  are  not  identical,  comm  should
       treat  them  as different lines but may treat them as being the same. If it treats them as
       different, comm should expect them to be  ordered  according  to  a  further  byte-by-byte
       comparison  using  the collating sequence for the POSIX locale and if they are not ordered
       in this way, the output of comm can identify such lines as being both unique to file1  and
       unique to file2 instead of being in both files.

OPTIONS

       The  comm  utility  shall  conform to the Base Definitions volume of POSIX.1‐2017, Section
       12.2, Utility Syntax Guidelines.

       The following options shall be supported:

       -1        Suppress the output column of lines unique to file1.

       -2        Suppress the output column of lines unique to file2.

       -3        Suppress the output column of lines duplicated in file1 and file2.

OPERANDS

       The following operands shall be supported:

       file1     A pathname of the first file to be compared. If file1 is '-', the standard input
                 shall be used.

       file2     A  pathname  of  the  second  file to be compared. If file2 is '-', the standard
                 input shall be used.

       If both file1 and file2 refer to standard  input  or  to  the  same  FIFO  special,  block
       special, or character special file, the results are undefined.

STDIN

       The  standard  input  shall  be  used only if one of the file1 or file2 operands refers to
       standard input. See the INPUT FILES section.

INPUT FILES

       The input files shall be text files.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of comm:

       LANG      Provide a default value for the internationalization variables that are unset or
                 null.   (See   the   Base  Definitions  volume  of  POSIX.1‐2017,  Section  8.2,
                 Internationalization  Variables  for  the  precedence  of   internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If  set  to  a  non-empty  string  value,  override  the values of all the other
                 internationalization variables.

       LC_COLLATE
                 Determine the locale for the collating sequence comm expects to have  been  used
                 when the input files were sorted.

       LC_CTYPE  Determine  the  locale for the interpretation of sequences of bytes of text data
                 as characters (for example, single-byte as opposed to multi-byte  characters  in
                 arguments and input files).

       LC_MESSAGES
                 Determine  the  locale  that should be used to affect the format and contents of
                 diagnostic messages written to standard error.

       NLSPATH   Determine the location of message catalogs for the processing of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The comm utility shall produce output depending on the options selected. If  the  -1,  -2,
       and -3 options are all selected, comm shall write nothing to standard output.

       If the -1 option is not selected, lines contained only in file1 shall be written using the
       format:

           "%s\n", <line in file1>

       If the -2 option is not selected, lines contained only in  file2  are  written  using  the
       format:

           "%s%s\n", <lead>, <line in file2>

       where the string <lead> is as follows:

       <tab>     The -1 option is not selected.

       null string
                 The -1 option is selected.

       If the -3 option is not selected, lines contained in both files shall be written using the
       format:

           "%s%s\n", <lead>, <line in both>

       where the string <lead> is as follows:

       <tab><tab>
                 Neither the -1 nor the -2 option is selected.

       <tab>     Exactly one of the -1 and -2 options is selected.

       null string
                 Both the -1 and -2 options are selected.

       If the input files were ordered according to the collating sequence of the current locale,
       the  lines  written shall be in the collating sequence of the current locale. If the input
       files contained any lines that collated equally but were not  identical  and  within  each
       file  those  lines  were  ordered according to a further byte-by-byte comparison using the
       collating sequence for the POSIX locale, and comm treated them as  different  lines,  then
       lines  written that collate equally but are not identical should be ordered according to a
       further byte-by-byte comparison using the collating sequence for the POSIX locale.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    All input files were successfully output as specified.

       >0    An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       If the input files are not properly presorted, the output of comm might not be useful.

       When using comm to process pathnames, it is recommended that LC_ALL, or at least  LC_CTYPE
       and LC_COLLATE, are set to POSIX or C in the environment, since pathnames can contain byte
       sequences that do not form valid characters in some locales, in which case  the  utility's
       behavior  would  be  undefined.  In  the  POSIX  locale  each  byte is a valid single-byte
       character, and therefore this problem is avoided.

       If the collating sequence of the current locale does not have  a  total  ordering  of  all
       characters, this can affect the behavior of comm in the following ways:

        *  If  comm  treats lines as being the same only if they are identical, some lines can be
           misleadingly identified as being both unique to file1 and unique to file2.

        *  If comm treats lines as being the same if they collate equally and a line  from  file1
           collates  equally  with a line from file2 but is not identical to it, one of the lines
           is misleadingly identified as being in both files and the other is not written to  the
           output at all.

       Such  problems  can  be  avoided  by forcing the use of the POSIX locale; for example, the
       following identifies lines in both file1 and file2:

           LC_ALL=POSIX sort file1 > file1.posix
           LC_ALL=POSIX sort file2 > file2.posix
           LC_ALL=POSIX comm -12 file1.posix file2.posix | sort

       The final sort re-sorts the output of comm according to  the  collating  sequence  of  the
       original  locale.  Doing  this  might  be  difficult if more than one column is output and
       leading <blank>s cannot be ignored.

EXAMPLES

       If a file  named  xcu  contains  a  sorted  list  of  the  utilities  in  this  volume  of
       POSIX.1‐2017,  a  file named xpg3 contains a sorted list of the utilities specified in the
       X/Open Portability Guide, Issue 3, and a file named svid89 contains a sorted list  of  the
       utilities in the System V Interface Definition Third Edition:

           comm -23 xcu xpg3 | comm -23 - svid89

       would  print a list of utilities in this volume of POSIX.1‐2017 not specified by either of
       the other documents:

           comm -12 xcu xpg3 | comm -12 - svid89

       would print a list of utilities specified by all three documents, and:

           comm -12 xpg3 svid89 | comm -23 - xcu

       would print a list of utilities specified by both XPG3 and the SVID, but not specified  in
       this volume of POSIX.1‐2017.

RATIONALE

       None.

FUTURE DIRECTIONS

       A  future  version  of  this  standard  may require that if any lines from the input files
       collate equally but are not identical, then  comm  treats  them  as  different  lines  and
       expects  them  to  be  ordered  according  to  a further byte-by-byte comparison using the
       collating sequence for the POSIX locale.

       A future version of this standard may require that if the input files contained any  lines
       that collated equally but were not identical and within each file those lines were ordered
       according to a further byte-by-byte comparison using the collating sequence for the  POSIX
       locale,  then  lines  written  that  collate  equally  but  are  not identical are ordered
       according to a further byte-by-byte comparison using the collating sequence for the  POSIX
       locale.

SEE ALSO

       cmp, diff, sort, uniq

       The  Base  Definitions  volume  of  POSIX.1‐2017,  Section  7.3.2,  LC_COLLATE, Chapter 8,
       Environment Variables, Section 12.2, Utility Syntax Guidelines

       Portions of this text are reprinted and  reproduced  in  electronic  form  from  IEEE  Std
       1003.1-2017,  Standard  for  Information Technology -- Portable Operating System Interface
       (POSIX), The Open Group Base Specifications Issue 7, 2018 Edition, Copyright (C)  2018  by
       the  Institute  of  Electrical  and Electronics Engineers, Inc and The Open Group.  In the
       event of any discrepancy between this version and the original IEEE  and  The  Open  Group
       Standard,  the  original  IEEE  and  The  Open Group Standard is the referee document. The
       original Standard can be obtained online at http://www.opengroup.org/unix/online.html .

       Any typographical or formatting errors that appear in this page are most  likely  to  have
       been  introduced  during  the conversion of the source files to man page format. To report
       such errors, see https://www.kernel.org/doc/man-pages/reporting_bugs.html .