Provided by: manpages-posix_2013a-2_all bug

PROLOG

       This  manual  page  is part of the POSIX Programmer's Manual.  The Linux implementation of
       this interface may differ (consult the corresponding Linux  manual  page  for  details  of
       Linux behavior), or the interface may not be implemented on Linux.

NAME

       uniq — report or filter out repeated lines in a file

SYNOPSIS

       uniq [−c|−d|−u] [−f fields] [−s char] [input_file [output_file]]

DESCRIPTION

       The  uniq utility shall read an input file comparing adjacent lines, and write one copy of
       each input line on the output. The second and succeeding copies of repeated adjacent input
       lines  shall  not  be  written.  The trailing <newline> of each line in the input shall be
       ignored when doing comparisons.

       Repeated lines in the input shall not be detected if they are not adjacent.

OPTIONS

       The uniq utility shall conform to the Base Definitions  volume  of  POSIX.1‐2008,  Section
       12.2,  Utility Syntax Guidelines, except that '+' may be recognized as an option delimiter
       as well as '−'.

       The following options shall be supported:

       −c        Precede each output line with a count of the number of times the  line  occurred
                 in the input.

       −d        Suppress the writing of lines that are not repeated in the input.

       −f fields Ignore  the first fields fields on each input line when doing comparisons, where
                 fields is a positive decimal integer. A field is the maximal string  matched  by
                 the basic regular expression:

                     [[:blank:]]*[^[:blank:]]*

                 If  the  fields  option-argument  specifies  more fields than appear on an input
                 line, a null string shall be used for comparison.

       −s chars  Ignore the first chars characters when doing comparisons, where chars shall be a
                 positive  decimal  integer.  If specified in conjunction with the −f option, the
                 first chars characters after the first fields fields shall be  ignored.  If  the
                 chars  option-argument specifies more characters than remain on an input line, a
                 null string shall be used for comparison.

       −u        Suppress the writing of lines that are repeated in the input.

OPERANDS

       The following operands shall be supported:

       input_file
                 A pathname of the input file. If the input_file operand is not specified, or  if
                 the input_file is '−', the standard input shall be used.

       output_file
                 A  pathname of the output file. If the output_file operand is not specified, the
                 standard output shall be used. The results are unspecified if the file named  by
                 output_file is the file named by input_file.

STDIN

       The  standard  input  shall  be  used  only  if  no  input_file operand is specified or if
       input_file is '−'.  See the INPUT FILES section.

INPUT FILES

       The input file shall be a text file.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of uniq:

       LANG      Provide a default value for the internationalization variables that are unset or
                 null.   (See   the   Base  Definitions  volume  of  POSIX.1‐2008,  Section  8.2,
                 Internationalization  Variables  for  the  precedence  of   internationalization
                 variables used to determine the values of locale categories.)

       LC_ALL    If  set  to  a  non-empty  string  value,  override  the values of all the other
                 internationalization variables.

       LC_COLLATE
                 Determine the locale for ordering rules.

       LC_CTYPE  Determine the locale for the interpretation of sequences of bytes of  text  data
                 as  characters  (for example, single-byte as opposed to multi-byte characters in
                 arguments and input files) and which characters  constitute  a  <blank>  in  the
                 current locale.

       LC_MESSAGES
                 Determine  the  locale  that should be used to affect the format and contents of
                 diagnostic messages written to standard error.

       NLSPATH   Determine the location of message catalogs for the processing of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The standard output shall be used if no output_file operand is  specified,  and  shall  be
       used  if  the  output_file operand is '−' and the implementation treats the '−' as meaning
       standard output. Otherwise, the standard output shall not be used.  See the  OUTPUT  FILES
       section.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       If the −c option is specified, the output file shall be empty or each line shall be of the
       form:

           "%d %s", <number of duplicates>, <line>

       otherwise, the output file shall be empty or each line shall be of the form:

           "%s", <line>

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    The utility executed successfully.

       >0    An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       The sort utility can be used to cause repeated lines to be adjacent in the input file.

EXAMPLES

       The following input file data (but flushed left) was used for a test series on uniq:

           #01 foo0 bar0 foo1 bar1
           #02 bar0 foo1 bar1 foo1
           #03 foo0 bar0 foo1 bar1
           #04
           #05 foo0 bar0 foo1 bar1
           #06 foo0 bar0 foo1 bar1
           #07 bar0 foo1 bar1 foo0

       What follows is a series of test invocations of the uniq utility that  use  a  mixture  of
       uniq options against the input file data. These tests verify the meaning of adjacent.  The
       uniq  utility  views  the  input  data  as  a  sequence  of  strings  delimited  by  '\n'.
       Accordingly,  for  the fieldsth member of the sequence, uniq interprets unique or repeated
       adjacent lines strictly relative to the fields+1th member.

        1. This first example tests the line counting option, comparing each line  of  the  input
           file data starting from the second field:

               uniq −c −f 1 uniq_0I.t
                   1 #01 foo0 bar0 foo1 bar1
                   1 #02 bar0 foo1 bar1 foo1
                   1 #03 foo0 bar0 foo1 bar1
                   1 #04
                   2 #05 foo0 bar0 foo1 bar1
                   1 #07 bar0 foo1 bar1 foo0

           The  number  '2',  prefixing the fifth line of output, signifies that the uniq utility
           detected a pair of repeated lines. Given the input data, this can only  be  true  when
           uniq is run using the −f 1 option (which shall cause uniq to ignore the first field on
           each input line).

        2. The second example tests the option to suppress unique lines, comparing each  line  of
           the input file data starting from the second field:

               uniq −d −f 1 uniq_0I.t
               #05 foo0 bar0 foo1 bar1

        3. This  test  suppresses  repeated  lines,  comparing  each  line of the input file data
           starting from the second field:

               uniq −u −f 1 uniq_0I.t
               #01 foo0 bar0 foo1 bar1
               #02 bar0 foo1 bar1 foo1
               #03 foo0 bar0 foo1 bar1
               #04
               #07 bar0 foo1 bar1 foo0

        4. This suppresses unique lines, comparing each line of the input file data starting from
           the third character:

               uniq −d −s 2 uniq_0I.t

           In the last example, the uniq utility found no input matching the above criteria.

RATIONALE

       Some  historical implementations have limited lines to be 1080 bytes in length, which does
       not meet the implied {LINE_MAX} limit.

       Earlier versions of this standard allowed the number and +number options.  These  options
       are no longer specified by POSIX.1‐2008 but may be present in some implementations.

FUTURE DIRECTIONS

       None.

SEE ALSO

       comm, sort

       The  Base  Definitions  volume  of POSIX.1‐2008, Chapter 8, Environment Variables, Section
       12.2, Utility Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and  reproduced  in  electronic  form  from  IEEE  Std
       1003.1,  2013  Edition,  Standard  for Information Technology -- Portable Operating System
       Interface (POSIX), The Open Group Base Specifications Issue 7, Copyright (C) 2013  by  the
       Institute  of  Electrical  and  Electronics  Engineers,  Inc and The Open Group.  (This is
       POSIX.1-2008 with the  2013  Technical  Corrigendum  1  applied.)  In  the  event  of  any
       discrepancy  between  this  version and the original IEEE and The Open Group Standard, the
       original IEEE and The Open Group Standard is the referee document. The  original  Standard
       can be obtained online at http://www.unix.org/online.html .

       Any  typographical  or  formatting errors that appear in this page are most likely to have
       been introduced during the conversion of the source files to man page  format.  To  report
       such errors, see https://www.kernel.org/doc/man-pages/reporting_bugs.html .