Provided by: manpages-posix_2013a-2_all

PROLOG

       This  manual  page  is part of the POSIX Programmer's Manual.  The Linux implementation of this interface
       may differ (consult the corresponding Linux manual page for details of Linux behavior), or the  interface
       may not be implemented on Linux.

NAME

       diff — compare two files

SYNOPSIS

       diff [−c|−e|−f|−u|−C n|−U n] [−br] file1 file2

DESCRIPTION

       The  diff  utility  shall  compare the contents of file1 and file2 and write to standard output a list of
       changes necessary to convert file1 into file2.  This list should be minimal. No output shall be  produced
       if the files are identical.

OPTIONS

       The  diff  utility  shall  conform  to the Base Definitions volume of POSIX.1‐2008, Section 12.2, Utility
       Syntax Guidelines.

       The following options shall be supported:

       −b        Cause any amount of white space at the end of a line to be treated as a single <newline>  (that
                 is, the white-space characters preceding the <newline> are ignored) and other strings of white-
                 space characters, not including <newline> characters, to compare equal.

       −c        Produce output in a form that provides three lines of copied context.

       −C n      Produce output in a form that provides n lines of copied context (where n shall be  interpreted
                 as a positive decimal integer).

       −e        Produce  output  in  a  form  suitable  as  input for the ed utility, which can then be used to
                 convert file1 into file2.

       −f        Produce output in an alternative form, similar in format to −e, but not intended to be suitable
                 as input for the ed utility, and in the opposite order.

       −r        Apply  diff recursively to files and directories of the same name when file1 and file2 are both
                 directories.

                 The diff utility shall detect infinite loops; that is, entering a previously visited  directory
                 that is an ancestor of the last file encountered.  When it detects an infinite loop, diff shall
                 write a diagnostic message to standard error and shall  either  recover  its  position  in  the
                 hierarchy or terminate.

       −u        Produce output in a form that provides three lines of unified context.

       −U n      Produce output in a form that provides n lines of unified context (where n shall be interpreted
                 as a non-negative decimal integer).
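
       As an illustration of the −b rule, the following Python sketch reduces a line to a comparison key. It
       is not taken from any diff implementation: the names normalize_b and equal_under_b are hypothetical,
       and the set of white-space characters is assumed to be that of the POSIX locale.

```python
import re

# Assumption: white-space characters other than <newline> in the POSIX locale.
_BLANKS = r"[ \t\v\f\r]+"


def normalize_b(line: str) -> str:
    """Return a comparison key for one line under the -b rule (hypothetical helper)."""
    body = line[:-1] if line.endswith("\n") else line
    body = re.sub(_BLANKS + r"$", "", body)  # white space before <newline> is ignored
    return re.sub(_BLANKS, " ", body)        # every other run of white space compares equal


def equal_under_b(line1: str, line2: str) -> bool:
    """True if the two lines compare equal when -b is in effect."""
    return normalize_b(line1) == normalize_b(line2)
```

       Under this sketch "a  b \n" and "a\tb\n" compare equal, while "a b\n" and "ab\n" do not, because −b
       equates strings of white space with each other but not with the absence of white space.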

OPERANDS

       The following operands shall be supported:

       file1, file2
                 A pathname of a file to be compared. If either the file1 or file2 operand is '−', the  standard
                 input shall be used in its place.

       If  both  file1  and file2 are directories, diff shall not compare block special files, character special
       files, or FIFO special files to any files and shall not compare regular files  to  directories.   Further
       details  are  as specified in Diff Directory Comparison Format.  The behavior of diff on other file types
       is implementation-defined when found in directories.

       If only one of file1 and file2 is a directory, diff shall be applied to the non-directory  file  and  the
       file  contained  in the directory file with a filename that is the same as the last component of the non-
       directory file.
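
       The rule for a single directory operand can be sketched in Python as follows. The function name
       resolve_operands and its is_dir parameter are hypothetical; the predicate is passed in only so that the
       sketch can be tried without touching the file system, and the '−' operand is not handled.

```python
import os.path
import posixpath


def resolve_operands(file1, file2, is_dir=os.path.isdir):
    """Return the pair of pathnames that would actually be compared.

    Hypothetical helper: if exactly one operand is a directory, it is replaced
    by the file inside it whose name is the last component of the other operand.
    """
    dir1, dir2 = is_dir(file1), is_dir(file2)
    if dir1 and not dir2:
        return posixpath.join(file1, posixpath.basename(file2)), file2
    if dir2 and not dir1:
        return file1, posixpath.join(file2, posixpath.basename(file1))
    return file1, file2  # both directories, or neither: operands are used as given
```

       For example, with operands src/a.txt and backup, where only backup is a directory, the pair compared
       is src/a.txt and backup/a.txt.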

STDIN

       The standard input shall be used only if one of the file1 or file2 operands  references  standard  input.
       See the INPUT FILES section.

INPUT FILES

       The input files may be of any type.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of diff:

       LANG      Provide a default value for the internationalization variables that are unset or null. (See the
                 Base Definitions volume of POSIX.1‐2008, Section 8.2, Internationalization  Variables  for  the
                 precedence   of   internationalization  variables  used  to  determine  the  values  of  locale
                 categories.)

       LC_ALL    If set to a non-empty string value, override the values of all the  other  internationalization
                 variables.

       LC_CTYPE  Determine  the  locale  for the interpretation of sequences of bytes of text data as characters
                 (for example, single-byte as opposed to multi-byte characters in arguments and input files).

       LC_MESSAGES
                 Determine the locale that should be used to  affect  the  format  and  contents  of  diagnostic
                 messages written to standard error and informative messages written to standard output.

       LC_TIME   Determine  the  locale  for  affecting the format of file timestamps written with the −C and −c
                 options.

       NLSPATH   Determine the location of message catalogs for the processing of LC_MESSAGES.

       TZ        Determine the timezone used for calculating file timestamps written with a context  format.  If
                 TZ is unset or null, an unspecified default timezone shall be used.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

   Diff Directory Comparison Format
       If both file1 and file2 are directories, the following output formats shall be used.

       In  the  POSIX  locale,  each  file  that  is  present  in only one directory shall be reported using the
       following format:

           "Only in %s: %s\n", <directory pathname>, <filename>

       In the POSIX locale, subdirectories that are common to the two  directories  may  be  reported  with  the
       following format:

           "Common subdirectories: %s and %s\n", <directory1 pathname>,
               <directory2 pathname>

       For  each  file  common to the two directories, if the two files are not to be compared: if the two files
       have the same device ID and file serial number, or are both block special files that refer  to  the  same
       device, or are both character special files that refer to the same device, in the POSIX locale the output
       format is unspecified.  Otherwise, in the POSIX locale an unspecified format shall be used that  contains
       the pathnames of the two files.

       For each file common to the two directories, if the files are compared and are identical, no output shall
       be written. If the two files differ, the following format is written:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       where <diff_options> are the options as specified on the command line.

       All directory pathnames listed in this section shall be relative to the original command line  arguments.
       All other names of files listed in this section shall be filenames (pathname components).
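
       A minimal Python sketch of the "Only in" report for one pair of directories is given here. The function
       name only_in_lines is hypothetical, the directory listings are supplied by the caller, and the sorted
       order is a choice of the sketch, since no directory search order is specified.

```python
def only_in_lines(dir1, names1, dir2, names2):
    """Build the POSIX-locale "Only in" lines for two directory listings.

    dir1 and dir2 are directory pathnames; names1 and names2 are the filenames
    found in each. Sorting is an assumption of this sketch, not a requirement.
    """
    set1, set2 = set(names1), set(names2)
    report = []
    for name in sorted(set1 ^ set2):  # names present in exactly one directory
        directory = dir1 if name in set1 else dir2
        report.append("Only in %s: %s\n" % (directory, name))
    return report
```

       Called with dir1/x containing date.out and dir2/x containing date.out and y, the sketch returns the
       single line "Only in dir2/x: y\n".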

   Diff Binary Output Format
       In the POSIX locale, if one or both of the files being compared are not text files, it is implementation-
       defined whether diff uses the binary file output format or the other  formats  as  specified  below.  The
       binary  file  output  format  shall  contain  the  pathnames  of  two files being compared and the string
       "differ".

       If both files being compared are text files, depending on the options specified,  one  of  the  following
       formats shall be used to write the differences.

   Diff Default Output Format
       The  default (without −e, −f, −c, −C, −u, or −U options) diff utility output shall contain lines of these
       forms:

           "%da%d\n", <num1>, <num2>

           "%da%d,%d\n", <num1>, <num2>, <num3>

           "%dd%d\n", <num1>, <num2>

           "%d,%dd%d\n", <num1>, <num2>, <num3>

           "%dc%d\n", <num1>, <num2>

           "%d,%dc%d\n", <num1>, <num2>, <num3>

           "%dc%d,%d\n", <num1>, <num2>, <num3>

           "%d,%dc%d,%d\n", <num1>, <num2>, <num3>, <num4>

       These lines resemble ed subcommands to convert file1 into file2.  The  line  numbers  before  the  action
       letters  shall  pertain  to  file1;  those after shall pertain to file2.  Thus, by exchanging a for d and
       reading the line in reverse order, one can also determine how to convert file2 into  file1.   As  in  ed,
       identical pairs (where num1 = num2) are abbreviated as a single number.
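
       The change-command lines and the reversal described above can be sketched in Python. The function
       names are hypothetical, and the regular expression is slightly more lenient than the eight forms listed
       (for instance, it accepts a range before an a).

```python
import re

# <file1 lines><action><file2 lines>, each side a single number or a "first,last" pair.
_CHANGE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_change_command(line):
    """Return (first1, last1, action, first2, last2) for a default-format command line."""
    match = _CHANGE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError("not a change command: %r" % line)
    first1, last1, action, first2, last2 = match.groups()
    # An abbreviated single number stands for an identical pair.
    return (int(first1), int(last1 or first1), action,
            int(first2), int(last2 or first2))


def _span(first, last):
    return str(first) if first == last else "%d,%d" % (first, last)


def reverse_change_command(line):
    """Exchange a for d and swap the sides, giving the file2-to-file1 command."""
    first1, last1, action, first2, last2 = parse_change_command(line)
    swapped = {"a": "d", "d": "a", "c": "c"}[action]
    return _span(first2, last2) + swapped + _span(first1, last1)
```

       For instance, 8a12,15 (append lines 12 through 15 of file2 after line 8 of file1) reverses to 12,15d8.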

       Following  each  of these lines, diff shall write to standard output all lines affected in the first file
       using the format:

           "< %s", <line>

       and all lines affected in the second file using the format:

           "> %s", <line>

       If there are lines affected in both file1 and file2 (as with the c subcommand), the changes are separated
       with a line consisting of three <hyphen> characters:

           "−−−\n"

   Diff −e Output Format
       With  the  −e  option, a script shall be produced that shall, when provided as input to ed, along with an
       appended w (write) command, convert file1 into file2.  Only the a (append), c  (change),  d  (delete),  i
       (insert),  and  s  (substitute)  commands  of  ed  shall be used in this script. Text lines, except those
       consisting of the single character <period> ('.'), shall be output as they appear in the file.
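
       To show why such a script converts file1 into file2, here is a small Python sketch that applies a
       restricted script to a list of lines. It is not an ed implementation: the name apply_ed_script is
       hypothetical, only the a, c, and d commands with numeric addresses are handled, and the i and s
       commands (and therefore text lines consisting of a lone <period>) are left out.

```python
import re

_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def apply_ed_script(lines, script):
    """Apply a restricted -e style script (a, c, d only) to a list of lines."""
    out = list(lines)
    rows = script.splitlines()
    i = 0
    while i < len(rows):
        match = _COMMAND.match(rows[i])
        if match is None:
            raise ValueError("unsupported command: %r" % rows[i])
        start = int(match.group(1))
        end = int(match.group(2) or match.group(1))
        command = match.group(3)
        i += 1
        text = []
        if command in "ac":
            # Input text runs up to the line holding a single period.
            while i < len(rows) and rows[i] != ".":
                text.append(rows[i])
                i += 1
            i += 1  # skip the terminating "."
        if command == "a":
            out[end:end] = text        # append after the addressed line
        elif command == "c":
            out[start - 1:end] = text  # replace the addressed lines
        else:
            del out[start - 1:end]     # delete the addressed lines
    return out
```

       Because the commands are ordered from the end of the file toward the beginning, each address still
       refers to the original numbering of file1 when it is applied.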

   Diff −f Output Format
       With the −f option, an alternative format of script shall be produced. It is similar to that produced  by
       −e, with the following differences:

        1. It  is expressed in reverse sequence; the output of −e orders changes from the end of the file to the
           beginning; the −f from beginning to end.

        2. The command form <lines> <command-letter> used by −e is reversed. For example, 10c with −e  would  be
           c10 with −f.

        3. The form used for ranges of line numbers is <space>-separated, rather than <comma>-separated.
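
       Differences 2 and 3 can be illustrated with a short Python sketch that rewrites a single −e command
       line in the −f form. The function name e_command_to_f is hypothetical, and the reordering of the
       script described in difference 1 is not attempted.

```python
import re

_E_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acdi])$")


def e_command_to_f(command):
    """Rewrite one -e style command line (for example "10c") in the -f style."""
    match = _E_COMMAND.match(command)
    if match is None:
        raise ValueError("not a line-addressed command: %r" % command)
    first, last, letter = match.groups()
    # Ranges are <space>-separated, and the command letter comes first.
    lines = first if last is None else first + " " + last
    return letter + lines
```

       With this sketch 10c becomes c10, and 3,5d becomes d3 5.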

   Diff −c or −C Output Format
       With  the −c or −C option, the output format shall consist of affected lines along with surrounding lines
       of context. The affected lines shall show which ones need to be deleted or changed in  file1,  and  those
       added  from file2.  With the −c option, three lines of context, if available, shall be written before and
       after the affected lines. With the −C option, the user can specify how many lines of context are written.
       The exact format follows.

       The name and last modification time of each file shall be output in the following format:

           "*** %s %s\n", file1, <file1 timestamp>
           "−−− %s %s\n", file2, <file2 timestamp>

       Each  <file>  field  shall be the pathname of the corresponding file being compared. The pathname written
       for standard input is unspecified.

       In the POSIX locale, each <timestamp> field shall be equivalent to the output from the following command:

           date "+%a %b %e %T %Y"

       without the trailing <newline>, executed at the time of last modification of the corresponding  file  (or
       the current time, if the file is standard input).

       Then, the following output formats shall be applied for every set of changes.

       First, a line shall be written in the following format:

           "***************\n"

       Next,  the  range of lines in file1 shall be written in the following format if the range contains two or
       more lines:

           "*** %d,%d ****\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "*** %d ****\n", <ending line number>

       The ending line number of an empty range shall be the number of the preceding line, or 0 if the range  is
       at the start of the file.
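
       The three cases for the file1 range line can be sketched in Python. The function name and its
       parameters are hypothetical; for an empty range, first is taken to be the number of the line that
       follows the range.

```python
def file1_range_header(first, count):
    """Format the file1 range line of a context-format set of changes.

    first: 1-based number of the first line in the range, or, for an empty
           range, of the line that follows it (an assumption of this sketch).
    count: number of lines in the range.
    """
    if count >= 2:
        return "*** %d,%d ****\n" % (first, first + count - 1)
    # One line: its own number. Empty range: the preceding line, or 0 at the start.
    ending = first if count == 1 else first - 1
    return "*** %d ****\n" % ending
```

       A four-line range starting at line 3 gives "*** 3,6 ****", and an empty range at the start of the file
       gives "*** 0 ****".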

       Next,  the  affected  lines  along  with lines of context (unaffected lines) shall be written. Unaffected
       lines shall be written in the following format:

           "  %s", <unaffected_line>

       Deleted lines shall be written as:

           "− %s", <deleted_line>

       Changed lines shall be written as:

           "! %s", <changed_line>

       Next, the range of lines in file2 shall be written in the following format if the range contains  two  or
       more lines:

           "−−− %d,%d −−−−\n", <beginning line number>, <ending line number>

       and the following format otherwise:

           "−−− %d −−−−\n", <ending line number>

       Then,  lines  of  context  and changed lines shall be written as described in the previous formats. Lines
       added from file2 shall be written in the following format:

           "+ %s", <added_line>

   Diff −u or −U Output Format
       The −u or −U options behave like the −c or −C options, except that the context lines  are  not  repeated;
       instead,  the  context,  deleted,  and  added  lines  are  shown together, interleaved.  The exact format
       follows.

       The name and last modification time of each file shall be output in the following format:

           "--- %s\t%s%s %s\n", file1, <file1 timestamp>, <file1 frac>, <file1 zone>
           "+++ %s\t%s%s %s\n", file2, <file2 timestamp>, <file2 frac>, <file2 zone>

       Each <file> field shall be the pathname of the corresponding file being compared, or the single character
       '−'  if standard input is being compared. However, if the pathname contains a <tab> or a <newline>, or if
       it does not consist entirely of characters taken  from  the  portable  character  set,  the  behavior  is
       implementation-defined.

       Each <timestamp> field shall be equivalent to the output from the following command:

           date '+%Y-%m-%d %H:%M:%S'

       without  the  trailing <newline>, executed at the time of last modification of the corresponding file (or
       the current time, if the file is standard input).

       Each <frac> field shall be either empty, or a decimal point followed  by  at  least  one  decimal  digit,
       indicating  the  fractional-seconds  part (if any) of the file timestamp. The number of fractional digits
       shall be at least the number needed to represent the file's timestamp without loss of information.

       Each <zone> field shall be of the form "shhmm", where "shh" is a signed two-digit decimal number  in  the
       range  −24  through +25, and "mm" is an unsigned two-digit decimal number in the range 00 through 59.  It
       represents the timezone of the timestamp as the number of hours (hh) and minutes (mm) east  (+)  or  west
       (−)  of  UTC for the timestamp.  If the hours and minutes are both zero, the sign shall be '+'.  However,
       if the timezone is not an integral number of minutes away from UTC, the <zone> field  is  implementation-
       defined.
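
       The <frac> and <zone> fields can be sketched in Python as follows. Both function names are
       hypothetical, the timestamp is assumed to have nanosecond resolution, and the offset is assumed to be
       a whole number of minutes (the case the text above leaves implementation-defined is not handled).

```python
def format_frac(nanoseconds):
    """Return the <frac> field for the sub-second part of a timestamp.

    Assumes nanosecond resolution; trailing zeros are dropped, so only the
    digits needed to represent the value without loss are written.
    """
    if nanoseconds == 0:
        return ""
    return "." + ("%09d" % nanoseconds).rstrip("0")


def format_zone(offset_minutes):
    """Return the <zone> field for an offset east (+) or west (-) of UTC, in minutes."""
    sign = "-" if offset_minutes < 0 else "+"  # a zero offset takes '+'
    hours, minutes = divmod(abs(offset_minutes), 60)
    return "%s%02d%02d" % (sign, hours, minutes)
```

       For example, half a second past the whole second yields ".5", and an offset of eight hours west of UTC
       yields "-0800".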

       Then, the following output formats shall be applied for every set of changes.

       First, the range of lines in each file shall be written in the following format:

           "@@ -%s +%s @@", <file1 range>, <file2 range>

       Each <range> field shall be of the form:

           "%1d", <beginning line number>

       if the range contains exactly one line, and:

           "%1d,%1d", <beginning line number>, <number of lines>

       otherwise. If a range is empty, its beginning line number shall be the number of the line just before the
       range, or 0 if the empty range starts the file.
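
       A Python sketch of the <range> field and the line that carries it follows. The function names are
       hypothetical; for an empty range, first is taken to be the number of the line that follows the range.

```python
def unified_range(first, count):
    """Format one <range> field of a unified-format range line.

    first: 1-based number of the first line in the range, or, for an empty
           range, of the line that follows it (an assumption of this sketch).
    count: number of lines in the range.
    """
    if count == 1:
        return "%d" % first
    if count == 0:
        first -= 1  # the line just before the empty range, or 0 at the start
    return "%d,%d" % (first, count)


def range_line(first1, count1, first2, count2):
    """Format the "@@" line for one set of changes."""
    return "@@ -%s +%s @@" % (unified_range(first1, count1),
                              unified_range(first2, count2))
```

       Note that the second number is a line count, not an ending line number as in the −c format: three
       lines starting at line 10 are written as "10,3".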

       Next, the affected lines along with lines of context shall be written.  Each  non-empty  unaffected  line
       shall be written in the following format:

           " %s", <unaffected_line>

       where  the  contents  of  the  unaffected  line  shall be taken from file1.  It is implementation-defined
       whether an empty unaffected line is written as an empty line  or  a  line  containing  a  single  <space>
       character.  This  line  also  represents  the  same  line  of file2, even though file2's line may contain
       different contents due to the −b option.  Deleted lines shall be written as:

           "-%s", <deleted_line>

       Added lines shall be written as:

           "+%s", <added_line>

       The order of lines written shall be the same as that of the corresponding  file.  A  deleted  line  shall
       never be written immediately after an added line.

       If  −U  n  is specified, the output shall contain no more than n consecutive unaffected lines; and if the
       output contains an affected line and this line is adjacent to up to n consecutive unaffected lines in the
       corresponding file, the output shall contain these unaffected lines.  −u shall act like −U3.
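
       The line prefixes and the effect of −U n can be approximated with Python's difflib module. This is a
       sketch only: the function name unified_hunk_bodies is hypothetical, difflib's matching is not the
       algorithm of any particular diff and does not promise a minimal list of changes, and range lines and
       file headers are omitted. An empty unaffected line is written with a single <space>, one of the two
       permitted forms.

```python
import difflib


def unified_hunk_bodies(file1_lines, file2_lines, n=3):
    """Return, for each set of changes, its body lines with ' ', '-' and '+' prefixes."""
    matcher = difflib.SequenceMatcher(None, file1_lines, file2_lines)
    hunks = []
    # get_grouped_opcodes(n) keeps up to n unaffected lines around each change.
    for group in matcher.get_grouped_opcodes(n):
        body = []
        for tag, i1, i2, j1, j2 in group:
            if tag == "equal":
                # Unaffected lines are taken from file1.
                body += [" " + line for line in file1_lines[i1:i2]]
                continue
            # Deleted lines come before added lines within one change.
            if tag in ("replace", "delete"):
                body += ["-" + line for line in file1_lines[i1:i2]]
            if tag in ("replace", "insert"):
                body += ["+" + line for line in file2_lines[j1:j2]]
        hunks.append(body)
    return hunks
```

       With n set to 1 and a ten-line file whose fifth line is changed, the single body produced consists of
       line 4 as context, the deleted line, the added line, and line 6 as context.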

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    No differences were found.

        1    Differences were found.

       >1    An error occurred.

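       A caller can tell the three outcomes apart as in the following Python sketch. The function names are
       hypothetical, and files_differ assumes that a diff utility is available on the search path.

```python
import subprocess


def interpret_exit_status(status):
    """Map a diff exit status to True (differences), False (none), or an exception."""
    if status == 0:
        return False
    if status == 1:
        return True
    raise RuntimeError("diff reported an error (exit status %d)" % status)


def files_differ(file1, file2):
    """Run diff on two pathnames and report whether differences were found."""
    completed = subprocess.run(["diff", file1, file2], stdout=subprocess.DEVNULL)
    return interpret_exit_status(completed.returncode)
```

       Treating every non-zero status as "files differ" would hide errors such as an unreadable operand, which
       is why the sketch raises an exception for any status other than 0 or 1.
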
CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       If  lines  at  the  end  of  a file are changed and other lines are added, diff output may show this as a
       delete and add, as a change, or as a change and add; diff is not expected  to  know  which  happened  and
       users  should not care about the difference in output as long as it clearly shows the differences between
       the files.

EXAMPLES

       If dir1 is a directory containing a directory named x, dir2 is a directory containing a  directory  named
       x, dir1/x and dir2/x both contain files named date.out, and dir2/x contains a file named y, the command:

           diff −r dir1 dir2

       could produce output similar to:

           Common subdirectories: dir1/x and dir2/x
           Only in dir2/x: y
           diff −r dir1/x/date.out dir2/x/date.out
           1c1
           < Mon Jul  2 13:12:16 PDT 1990
           −−−
           > Tue Jun 19 21:41:39 PDT 1990

RATIONALE

       The  −h  option  was  omitted  because  it  was insufficiently specified and does not add to applications
       portability.

       Historical implementations employ algorithms that do not always produce a minimum  list  of  differences;
       the  current  language about making every effort is the best this volume of POSIX.1‐2008 can do, as there
       is no metric that could be employed to judge the quality of implementations  against  any  and  all  file
       contents.  The  statement  ``This  list  should be minimal'' clearly implies that implementations are not
       expected to provide the following output when comparing two  100-line  files  that  differ  in  only  one
       character on a single line:

           1,100c1,100
           all 100 lines from file1 preceded with "< "
           −−−
           all 100 lines from file2 preceded with "> "

       The  ``Only  in''  messages  required  when  the  −r  option is specified are not used by most historical
       implementations if the −e option is also specified. They are required here because they provide useful
       information that must be provided to update a target directory hierarchy to match a source hierarchy. The
       ``Common subdirectories'' messages are written by System V and 4.3 BSD when the −r option  is  specified.
       They  are allowed here but are not required because they are reporting on something that is the same, not
       reporting a difference, and are not needed to update a target hierarchy.

       The −c option, which writes output in a format using lines of context, has been included. The  format  is
       useful for a variety of reasons, among them being much improved readability and the ability to understand
       difference changes when the target file has line numbers that differ from another similar,  but  slightly
       different, copy. The patch utility is most valuable when working with difference listings using a context
       format. The BSD version of −c takes an optional argument specifying the amount of  context.  Rather  than
       overloading  −c  and  breaking the Utility Syntax Guidelines for diff, the standard developers decided to
       add a separate option for specifying a context diff with a specified amount of context (−C).   Also,  the
       format  for  context  diffs  was  extended  slightly in 4.3 BSD to allow multiple changes that are within
       context lines from each other to be merged together.  The  output  format  contains  an  additional  four
       <asterisk> characters after the range of affected lines in the first filename. This was to provide a flag
       for old programs (like old versions of patch) that only understand the old context format. The version of
       context described here does not require that multiple changes within context lines be merged, but it does
       not prohibit it either. The extension is upwards-compatible, so any vendors that wish to retain  the  old
       version  of  diff  can  do  so  by  adding  the extra four <asterisk> characters (that is, utilities that
       currently use diff and understand the new merged format will also understand the old unmerged format, but
       not vice versa).

       The  −u  and  −U  options of GNU diff have been included. Their output format, designed by Wayne Davison,
       takes up less space than −c and −C format, and in many cases is easier to read. The  format's  timestamps
       do not vary by locale, so LC_TIME does not affect it. The format's line numbers are rendered with the %1d
       format, not %d, because the file format notation rules would allow extra  <blank>  characters  to  appear
       around the numbers.

       The  substitute  command  was  added as an additional format for the −e option. This was added to provide
       implementations with a way to fix the classic ``dot alone on a line'' bug present  in  many  versions  of
       diff.  Since many implementations have fixed this bug, the standard developers decided not to standardize
       broken behavior, but rather to provide the necessary tool for fixing the bug. One way to fix this bug  is
       to  output two periods whenever a lone period is needed, then terminate the append command with a period,
       and then use the substitute command to convert the two periods into one period.
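
       The fix outlined above can be sketched in Python as a generator of ed commands for one block of
       appended text. The function name ed_append_command is hypothetical, and the particular substitute
       command used (s/.//, which deletes the first character of the current line) is one possible choice
       made for this sketch, not a form required by this volume of POSIX.1‐2008.

```python
def ed_append_command(after_line, text_lines):
    """Return ed command lines that append text_lines after the given line number.

    A text line consisting of a lone period is written as two periods, the
    append is terminated, and a substitute command then removes one period.
    """
    script = ["%da" % after_line]
    in_input_mode = True
    for line in text_lines:
        if not in_input_mode:
            script.append("a")  # resume appending after the line just fixed
            in_input_mode = True
        if line == ".":
            # ".." is safe as input text; "." ends the append; s/.// leaves ".".
            script += ["..", ".", "s/.//"]
            in_input_mode = False
        else:
            script.append(line)
    if in_input_mode:
        script.append(".")  # terminate the final append
    return script
```

       The sketch relies on ed leaving the current line at the last line appended, so that the unaddressed
       substitute and the following a both act on the line that held the two periods.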

       The BSD-derived −r option was added to provide a mechanism for using diff  to  compare  two  file  system
       trees.  This  behavior  is  useful,  is  standard  practice on all BSD-derived systems, and is not easily
       reproducible with the find utility.

       The requirement that diff not compare files in some circumstances, even though they have the  same  name,
       is  based  on  the  actual  output  of  historical implementations.  The specified behavior precludes the
       problems arising from running into FIFOs and other files that would cause diff to hang waiting for  input
       with  no  indication  to  the  user that diff was hung. An earlier version of this standard specified the
       output format more precisely, but in practice this requirement was widely  ignored  and  the  benefit  of
       standardization  seemed  small,  so  it is now unspecified. In most common usage, diff −r should indicate
       differences in the file hierarchies, not the  difference  of  contents  of  devices  pointed  to  by  the
       hierarchies.

       Many  early  implementations  of  diff  require  seekable  files.  Since  the System Interfaces volume of
       POSIX.1‐2008 supports  named  pipes,  the  standard  developers  decided  that  such  a  restriction  was
       unreasonable.  Note also that the allowed filename '−' almost always refers to a pipe.

       No  directory  search  order is specified for diff.  The historical ordering is, in fact, not optimal, in
       that it prints out all of the differences at the current level, including the statements about all common
       subdirectories before recursing into those subdirectories.

       The message:

           "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

       does not vary by locale because it is the representation of a command, not an English sentence.

FUTURE DIRECTIONS

       None.

SEE ALSO

       cmp, comm, ed, find

       The  Base  Definitions  volume  of  POSIX.1‐2008, Chapter 8, Environment Variables, Section 12.2, Utility
       Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form from IEEE Std 1003.1, 2013 Edition,
       Standard  for  Information Technology -- Portable Operating System Interface (POSIX), The Open Group Base
       Specifications Issue 7, Copyright (C) 2013 by the Institute of Electrical and Electronics Engineers, Inc.
       and  The  Open Group.  (This is POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the event
       of any discrepancy between this version and the original IEEE and The Open Group Standard,  the  original
       IEEE and The Open Group Standard is the referee document. The original Standard can be obtained online at
       http://www.unix.org/online.html .

       Any typographical or formatting errors that appear in this page are most likely to have  been  introduced
       during   the   conversion  of  the  source  files  to  man  page  format.  To  report  such  errors,  see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .