Provided by: manpages-posix_2.16-1_all

NAME

       file - determine file type

SYNOPSIS

       file [-dh][-M file][-m file] file ...

       file -i [-h] file ...

DESCRIPTION

       The  file  utility  shall  perform  a series of tests in sequence on each specified file in an attempt to
       classify it:

        1. If file does not exist, cannot be read, or its file status could not be determined, the output  shall
           indicate that the file was processed, but that its type could not be determined.

        2. If  the  file  is  not  a regular file, its file type shall be identified.  The file types directory,
           FIFO, socket, block special, and character special shall be identified as such. Other implementation-
           defined file types may also be identified. If file is a symbolic link, by default the link  shall  be
           resolved  and  file  shall test the type of file referenced by the symbolic link.  (See the -h and -i
           options below.)

        3. If the length of file is zero, it shall be identified as an empty file.

        4. The file utility shall examine an initial segment of file and shall make a guess at  identifying  its
           contents  based on position-sensitive tests. (The answer is not guaranteed to be correct; see the -d,
           -M, and -m options below.)

        5. The file utility shall examine file and make a guess at identifying its contents  based  on  context-
           sensitive default system tests. (The answer is not guaranteed to be correct.)

        6. The file shall be identified as a data file.

       If  file  does  not  exist,  cannot be read, or its file status could not be determined, the output shall
       indicate that the file was processed, but that its type could not be determined.

       If file is a symbolic link, by default the link shall be resolved and file shall test the  type  of  file
       referenced by the symbolic link.
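
       The first three steps of the sequence above can be approximated with the primaries of the test
       utility. The following is only a rough sketch, not how any implementation of file is written: the
       function name classify_by_status is hypothetical, symbolic links are simply followed (the -h
       behavior is not modeled), and the content-based tests of steps 4 and 5 are skipped, so every
       non-empty readable regular file falls through to "data".

```shell
# classify_by_status PATHNAME   (hypothetical helper, for illustration only)
# Mimics steps 1-3 and 6 of the DESCRIPTION using test(1) primaries.
classify_by_status() {
    if [ ! -e "$1" ]; then
        # Step 1: nonexistent (a dangling symbolic link also lands here).
        type="cannot open"
    elif [ -d "$1" ]; then type="directory"          # Step 2: file type tests
    elif [ -p "$1" ]; then type="fifo"
    elif [ -S "$1" ]; then type="socket"
    elif [ -b "$1" ]; then type="block special"
    elif [ -c "$1" ]; then type="character special"
    elif [ ! -r "$1" ]; then type="cannot open"      # Step 1: unreadable
    elif [ ! -s "$1" ]; then type="empty"            # Step 3: zero length
    else type="data"                                 # Step 6: fallback
    fi
    # Same "%s: %s\n" layout that the STDOUT section specifies.
    printf '%s: %s\n' "$1" "$type"
}
```

       For example, classify_by_status /tmp would be expected to report /tmp as a directory on a typical
       system.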

OPTIONS

       The  file  utility  shall  conform  to the Base Definitions volume of IEEE Std 1003.1-2001, Section 12.2,
       Utility Syntax Guidelines, except that the order of the -m, -d, and -M options shall be significant.

       The following options shall be supported by the implementation:

       -d     Apply any position-sensitive default system tests and context-sensitive default  system  tests  to
              the file. This is the default if no -M or -m option is specified.

       -h     When  a symbolic link is encountered, identify the file as a symbolic link. If -h is not specified
              and file is a symbolic link that refers to a nonexistent file, file shall identify the file  as  a
              symbolic link, as if -h had been specified.

       -i     If a file is a regular file, do not attempt to classify the type of the file further, but identify
              the file as specified in the STDOUT section.

       -M  file
              Specify  the name of a file containing position-sensitive tests that shall be applied to a file in
              order to classify it (see the EXTENDED DESCRIPTION). No position-sensitive  default  system  tests
              nor  context-sensitive  default  system  tests  shall  be  applied  unless  the  -d option is also
              specified.

       -m  file
              Specify the name of a file containing position-sensitive tests that shall be applied to a file  in
              order to classify it (see the EXTENDED DESCRIPTION).

       If  the  -m  option  is  specified  without specifying the -d option or the -M option, position-sensitive
       default system tests shall be applied after the position-sensitive tests specified by the -m  option.  If
       the -M option is specified with the -d option, the -m option, or both, or the -m option is specified with
       the  -d  option,  the  concatenation  of the position-sensitive tests specified by these options shall be
       applied in the order specified by the appearance of these options. If a -M or -m file option-argument  is
       -, the results are unspecified.
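
       The ordering rules above can be modeled as a small function that prints the sequence of test sets
       for a given option list. This is one reading of the rules, not normative text: the name test_order
       and the labels user:FILE, default-position, and default-context are invented for illustration, and
       the placement of the context-sensitive tests last follows the RATIONALE section. The sketch assumes
       well-formed options and option-arguments without blanks.

```shell
# test_order [-d] [-M file] [-m file] ...   (hypothetical helper)
# Print the order in which the sets of tests would be applied.
test_order() {
    seen_d=0 seen_M=0 order=""
    while [ $# -gt 0 ]; do
        case $1 in
            -d) seen_d=1; order="$order default-position" ;;
            -M) seen_M=1; order="$order user:$2"; shift ;;
            -m) order="$order user:$2"; shift ;;
            *)  break ;;
        esac
        shift
    done
    # Neither -d nor -M: position-sensitive defaults follow any -m tests.
    if [ "$seen_d" -eq 0 ] && [ "$seen_M" -eq 0 ]; then
        order="$order default-position"
    fi
    # Context-sensitive defaults come last, unless -M without -d disabled them.
    if [ "$seen_d" -eq 1 ] || [ "$seen_M" -eq 0 ]; then
        order="$order default-context"
    fi
    printf '%s\n' "${order# }"
}
```

       Under this model, test_order -m mymagic prints "user:mymagic default-position default-context",
       while test_order -M mymagic prints only "user:mymagic".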

OPERANDS

       The following operand shall be supported:

       file   A pathname of a file to be tested.

STDIN

       Not used.

INPUT FILES

       The file can be any file type.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of file:

       LANG   Provide  a  default  value for the internationalization variables that are unset or null. (See the
              Base Definitions volume of IEEE Std 1003.1-2001, Section 8.2, Internationalization  Variables  for
              the  precedence  of  internationalization  variables  used  to  determine  the  values  of  locale
              categories.)

       LC_ALL If set to a non-empty string value, override the values  of  all  the  other  internationalization
              variables.

       LC_CTYPE
              Determine  the locale for the interpretation of sequences of bytes of text data as characters (for
              example, single-byte as opposed to multi-byte characters in arguments and input files).

       LC_MESSAGES
              Determine the locale that should be used to affect the format and contents of diagnostic  messages
              written to standard error and informative messages written to standard output.

       NLSPATH
              Determine the location of message catalogs for the processing of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       In the POSIX locale, the following format shall be used to identify each operand, file specified:

              "%s: %s\n", <file>, <type>

       The  values  for <type> are unspecified, except that in the POSIX locale, if file is identified as one of
       the types listed in the following table, <type> shall contain (but is not limited to)  the  corresponding
       string,  unless  the file is identified by a position-sensitive test specified by a -M or -m option. Each
       space shown in the strings shall be exactly one <space>.

                                          Table: File Utility Output Strings
                       If file is:                              <type> shall contain the  Notes
                                                                string:
                       Nonexistent                              cannot open
                       Block special                            block special             1
                       Character special                        character special         1
                       Directory                                directory                 1
                       FIFO                                     fifo                      1
                       Socket                                   socket                    1
                       Symbolic link                            symbolic link to          1
                       Regular file                             regular file              1,2
                       Empty regular file                       empty                     3
                       Regular file that cannot be read         cannot open               3
                       Executable binary                        executable                4,6
                       ar archive library (see ar)              archive                   4,6
                       Extended cpio format (see pax)           cpio archive              4,6
                       Extended tar format (see ustar in pax)   tar archive               4,6
                       Shell script                             commands text             5,6
                       C-language source                        c program text            5,6
                       FORTRAN source                           fortran program text      5,6
                       Regular file whose type cannot be        data
                       determined

       Notes:

               1. This is a file type test.

               2. This test is applied only if the -i option is specified.

               3. This test is applied only if the -i option is not specified.

               4. This is a position-sensitive default system test.

               5. This is a context-sensitive default system test.

               6. Position-sensitive default system tests and context-sensitive default  system  tests  are  not
                  applied if the -M option is specified unless the -d option is also specified.
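
       Because <type> is only required to contain the listed string, a script should look for the string
       anywhere after the "<file>: " prefix rather than compare for equality. The following is a minimal
       sketch; the function name output_has_type is hypothetical, and it assumes that the pathname itself
       does not contain ": ".

```shell
# output_has_type LINE STRING   (hypothetical helper)
# Succeed if STRING appears in the <type> part of one line of file output.
output_has_type() {
    case $1 in
        *": "*"$2"*) return 0 ;;   # STRING found somewhere after "<file>: "
        *)           return 1 ;;
    esac
}
```

       A caller might capture one line with line=$(file "$path") and then run
       output_has_type "$line" "directory".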

       In the POSIX locale, if file is identified as a symbolic link (see the -h option), the following
       alternative output format shall be used:

              "%s: %s %s\n", <file>, <type>, <contents of link>

       If the file named by the file operand does not exist, cannot be read, or the type of the  file  named  by
       the  file  operand  cannot  be  determined,  this  shall not be considered an error that affects the exit
       status.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       A file specified as an option-argument to the -m or -M options shall contain one position-sensitive  test
       per  line,  which shall be applied to the file. If the test succeeds, the message field of the line shall
       be printed and no further tests shall be applied, with the exception that tests on immediately  following
       lines beginning with a single '>' character shall be applied.

       Each line shall be composed of the following four <blank>-separated fields:

       offset An  unsigned  number  (optionally  preceded  by  a single '>' character) specifying the offset, in
              bytes, of the value in the file that is to be compared against the value field of the line. If the
              file is shorter than the specified offset, the test shall fail.

       If the offset begins with the character '>', the test contained in the line shall not be applied to the
       file  unless  the  test on the last line for which the offset did not begin with a '>' was successful. By
       default, the offset shall be interpreted as an unsigned decimal number. With a  leading  0x  or  0X,  the
       offset  shall  be  interpreted  as a hexadecimal number; otherwise, with a leading 0, the offset shall be
       interpreted as an octal number.
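
       The decimal, hexadecimal, and octal conventions for the offset match the integer constants
       accepted by shell arithmetic expansion, so the field can be decoded with a short sketch such as the
       following. The function name parse_offset and its "yes"/"no" continuation flag are illustrative
       assumptions, not part of the magic file format.

```shell
# parse_offset FIELD   (hypothetical helper)
# Print "yes N" for a continuation line (leading '>') or "no N" otherwise,
# where N is the offset converted to decimal.
parse_offset() {
    field=$1
    case $field in
        '>'*) continuation=yes; field=${field#'>'} ;;
        *)    continuation=no ;;
    esac
    # $(( )) reads 0x/0X as hexadecimal and a leading 0 as octal.
    printf '%s %s\n' "$continuation" "$(($field))"
}
```

       For instance, parse_offset '>0x10' prints "yes 16" and parse_offset 010 prints "no 8".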

       type   The type of the value in the file to be tested. The type shall consist of the  type  specification
              characters c, d, f, s, and u, specifying character, signed decimal, floating point, string,
              and unsigned decimal, respectively.

       The type string shall be interpreted as the bytes from the file starting  at  the  specified  offset  and
       including the same number of bytes specified by the value field. If insufficient bytes remain in the file
       past the offset to match the value field, the test shall fail.

       The type specification characters d, f, and u can be followed by an optional unsigned decimal integer
       that specifies the number of bytes represented by the type. The type specification character f can be
       followed by an optional F, D, or L, indicating that the value is of type float, double, or long
       double, respectively. The type specification characters d and u can be followed by an optional C, S,
       I, or L, indicating that the value is of type char, short, int, or long, respectively.

       The default number of bytes represented by the type specifiers d, f, and u shall correspond to their
       respective C-language types as follows. If the system claims conformance to  the  C-Language  Development
       Utilities  option,  those  specifiers  shall  correspond  to  the  default sizes used in the c99 utility.
       Otherwise, the default sizes shall be implementation-defined.

       For the type specifier characters d and u, the default number of bytes shall correspond to the size of a
       basic integer type of the implementation.  For  these  specifier  characters,  the  implementation  shall
       support  values  of  the optional number of bytes to be converted corresponding to the number of bytes in
       the C-language types char, short, int, or long. These numbers can also be specified by an application  as
       the characters C, S, I, and L, respectively. The byte order used when interpreting numeric values is
       implementation-defined, but shall correspond to the order in which a constant of the  corresponding  type
       is stored in memory on the system.

       For the type specifier f, the default number of bytes shall correspond to the number of bytes in the
       basic double precision floating-point data type of the  underlying  implementation.   The  implementation
       shall support values of the optional number of bytes to be converted corresponding to the number of bytes
       in  the  C-language  types  float,  double,  and  long  double. These numbers can also be specified by an
       application as the characters F, D, and L, respectively.

       All type specifiers, except for s, can be followed by a mask specifier of the form &number. The mask
       value  shall be AND'ed with the value of the input file before the comparison with the value field of the
       line is made. By default, the mask shall be interpreted as an unsigned decimal number. With a leading  0x
       or  0X, the mask shall be interpreted as an unsigned hexadecimal number; otherwise, with a leading 0, the
       mask shall be interpreted as an unsigned octal number.

       The strings byte, short, long, and string shall also be supported as type fields, being interpreted as
       dC, dS, dL, and s, respectively.
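
       As an illustration of the type field syntax, the sketch below separates an optional &number mask
       from the type specifier and maps the four alias strings to their equivalents. The function name
       split_type and the word "none" for an absent mask are assumptions made for this example; the
       sketch does not validate the specifier itself.

```shell
# split_type FIELD   (hypothetical helper)
# Print the canonical type specifier and the mask in decimal ("none" if absent).
split_type() {
    spec=${1%%'&'*}                 # text before the first '&'
    case $1 in
        *'&'*) rest=${1#*'&'}       # text after the first '&'
               mask=$(($rest)) ;;   # 0x.. is hexadecimal, leading 0 is octal
        *)     mask=none ;;
    esac
    case $spec in
        byte)   spec=dC ;;
        short)  spec=dS ;;
        long)   spec=dL ;;
        string) spec=s ;;
    esac
    printf '%s %s\n' "$spec" "$mask"
}
```

       With this sketch, split_type 'byte&0x80' prints "dC 128" and split_type string prints "s none".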

       value  The value to be compared with the value from the file.

       If the specifier from the type field is s or string, then interpret the value  as  a  string.  Otherwise,
       interpret  it as a number. If the value is a string, then the test shall succeed only when a string value
       exactly matches the bytes from the file.

       If the value is a string, it can contain the following sequences:

       \character
              The   backslash-escape   sequences   as   specified   in   the   Base   Definitions   volume    of
              IEEE Std 1003.1-2001, Table 5-1, Escape Sequences and Associated Actions ('\\', '\a', '\b',
              '\f', '\n', '\r', '\t', '\v'). The results of using any other character, other than an octal
              digit, following the backslash are unspecified.

       \octal
              Octal  sequences  that  can  be  used to represent characters with specific coded values. An octal
              sequence shall consist of a backslash followed by the longest  sequence  of  one,  two,  or  three
              octal-digit characters (01234567). If the size of a byte on the system is greater than 9 bits, the
              valid escape sequence used to represent a byte is implementation-defined.
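
       The printf utility accepts the same style of \ooo octal escapes in its format operand, which gives
       a convenient way to see which bytes a string value stands for, or to create a small test file that
       a magic line should match. The helper name magic_bytes below is hypothetical, and the sketch
       assumes that the value contains no '%' characters, since printf would treat them as conversion
       specifications.

```shell
# magic_bytes VALUE   (hypothetical helper)
# Print, in octal, the bytes denoted by a string value such as '\037\235'.
magic_bytes() {
    # printf expands the escapes; od dumps each byte as a 3-digit octal number.
    printf "$1" | od -An -to1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}
```

       For example, magic_bytes '\037\235' prints "037 235", and printf '\037\235' > sample.Z would
       create a two-byte file (sample.Z is an arbitrary name) beginning with those bytes.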

       By  default,  any  value  that  is not a string shall be interpreted as a signed decimal number. Any such
       value, with a leading 0x or 0X, shall be interpreted as an unsigned hexadecimal number; otherwise, with a
       leading zero, the value shall be interpreted as an unsigned octal number.

       If the value is not a string, it can  be  preceded  by  a  character  indicating  the  comparison  to  be
       performed. Permissible characters and the comparisons they specify are as follows:

       =
              The test shall succeed if the value from the file equals the value field.

       <
              The test shall succeed if the value from the file is less than the value field.

       >
              The test shall succeed if the value from the file is greater than the value field.

       &
              The  test  shall  succeed  if all of the set bits in the value field are set in the value from the
              file.

       ^
              The test shall succeed if at least one of the set bits in the value field is not set in the  value
              from the file.

       x
              The  test  shall  succeed  if  the  file  is large enough to contain a value of the type specified
              starting at the offset specified.
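
       The numeric comparisons listed above map directly onto shell arithmetic. The following sketch uses
       a hypothetical function magic_compare that takes the comparison character, the value read from the
       file (after any mask has been applied), and the value field. It treats x as always true, on the
       assumption that the caller has already managed to read a value of the required size.

```shell
# magic_compare OP FILE_VALUE TEST_VALUE   (hypothetical helper)
# Exit status 0 if the test succeeds, 1 if it fails, 2 for an unknown OP.
magic_compare() {
    case $1 in
        '=') result=$(( $2 == $3 )) ;;
        '<') result=$(( $2 <  $3 )) ;;
        '>') result=$(( $2 >  $3 )) ;;
        '&') result=$(( ($2 & $3) == $3 )) ;;   # all set bits of TEST_VALUE are set
        '^') result=$(( ($2 & $3) != $3 )) ;;   # at least one of them is not set
        x)   result=1 ;;                        # a value could be read at all
        *)   return 2 ;;
    esac
    [ "$result" -eq 1 ]
}
```

       For example, a byte of 0x90 masked with 0x80 and compared with ">0" succeeds:
       magic_compare '>' $((0x90 & 0x80)) 0 returns exit status 0.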

       message
              The message to be printed if the test  succeeds.  The  message  shall  be  interpreted  using  the
              notation for the printf formatting specification; see printf(). If the value field was a string,
              then the value from the file shall be  the  argument  for  the  printf  formatting  specification;
              otherwise, the value from the file shall be the argument.

EXIT STATUS

       The following exit values shall be returned:

        0     Successful completion.

       >0     An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       The  file utility can only be required to guess at many of the file types because only exhaustive testing
       can determine some types with certainty. For example, binary data on some implementations might match the
       initial segment of an executable or a tar archive.

       Note that the table indicates that the output contains the stated string. Systems may add text before  or
       after  the  string.  For executables, as an example, the machine architecture and various facts about how
       the file was link-edited may be included. Note also that on systems that  recognize  shell  script  files
       starting with "#!" as executable files, these may be identified as executable binary files rather than as
       shell scripts.

EXAMPLES

       Determine whether an argument is a binary executable file:

              file "$1" | grep -Fq executable &&
                  printf "%s is executable.\n" "$1"

RATIONALE

       The -f option was omitted because the same effect can (and should) be obtained using the xargs utility.
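
       As a sketch of that replacement, the pathnames can be fed to file through xargs. The wrapper name
       file_from_list and the FILE_CMD variable are hypothetical; FILE_CMD exists only so that the
       command being run can be swapped out for experimentation. Pathnames containing blanks or quote
       characters need quoting in the list, because xargs splits its input into arguments.

```shell
# file_from_list LISTFILE   (hypothetical wrapper)
# Run file on every pathname named in LISTFILE, as the historical -f option did.
file_from_list() {
    # FILE_CMD is an illustrative override; it defaults to the file utility.
    xargs "${FILE_CMD:-file}" < "$1"
}
```

       A call such as file_from_list names.txt would then classify each pathname listed in names.txt.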

       Historical  versions of the file utility attempt to identify the following types of files: symbolic link,
       directory, character special, block special, socket, tar archive, cpio  archive,  SCCS  archive,  archive
       library,  empty,  compress  output, pack output, binary data, C source, FORTRAN source, assembler source,
       nroff/troff/eqn/tbl source, troff output, shell script, C shell script, English text, ASCII text,
       various executables, APL workspace, compiled terminfo entries, and CURSES screen images. Only those types
       that  are reasonably well specified in POSIX or are directly related to POSIX utilities are listed in the
       table.

       Historical systems have used a "magic file" named /etc/magic to help identify file types. Because  it  is
       generally  useful  for  users  and  scripts  to be able to identify special file types, the -m flag and a
       portable format for user-created magic files have been specified. No requirement is made that an
       implementation  of  file  use this method of identifying files, only that users be permitted to add their
       own classifying tests.

       In addition, three options have been added to historical practice.  The -d flag has been added to  permit
       users to cause their tests to follow any default system tests. The -i flag has been added to permit users
       to test portably for regular files in shell scripts. The -M flag has been added to permit users to ignore
       any default system tests.

       The  IEEE Std 1003.1-2001 description of default system tests and the interaction between the -d, -M, and
       -m options did not clearly indicate that there were two types of "default system tests". The
       "position-sensitive tests" determine file types by looking for certain string or binary values at
       specific offsets
       in  the  file being examined. These position-sensitive tests were implemented in historical systems using
       the magic file described above. Some of these tests are now built into the file utility  itself  on  some
       implementations so the output can provide more detail than can be provided by magic files. For example, a
       magic file can easily identify a core file on most implementations, but cannot name the program file that
       dropped the core. A magic file could produce output such as:

              /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

       but by building the test into the file utility, you could get output such as:

              /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

       These  extended built-in tests are still to be treated as position-sensitive default system tests even if
       they are not listed in /etc/magic or any other magic file.

       The context-sensitive default system tests were always built into the file utility.  These  tests  looked
       for  language  constructs  in text files trying to identify shell scripts, C, FORTRAN, and other computer
       language source files, and even plain text files. With  the  addition  of  the  -m  and  -M  options  the
       distinction  between  position-sensitive  and  context-sensitive  default  system  tests became important
       because the order of testing is important. The context-sensitive system default  tests  should  never  be
       applied  before  any position-sensitive tests even if the -d option is specified before a -m option or -M
       option due to the high probability that the  context-sensitive  system  default  tests  will  incorrectly
       identify  arbitrary  text  files  as text files before position-sensitive tests specified by the -m or -M
       option would be applied to give a more accurate identification.

       Leaving the meaning of -M - and -m - unspecified  allows  an  existing  prototype  of  these  options  to
       continue  to  work in a backwards-compatible manner. (In that implementation, -M - was roughly equivalent
       to -d in IEEE Std 1003.1-2001.)

       The historical -c option was omitted as not particularly useful to users or portable  shell  scripts.  In
       addition,  a  reasonable  implementation  of the file utility would report any errors found each time the
       magic file is read.

       The historical format of the magic file  was  the  same  as  that  specified  by  the  Rationale  in  the
       ISO POSIX-2:1993  standard  for the offset, value, and message fields; however, it used less precise type
       fields than the format specified by the current normative text. The new type field values are a  superset
       of the historical ones.

       The following is an example magic file:

              0  short     070707              cpio archive
              0  short     0143561             Byte-swapped cpio archive
              0  string    070707              ASCII cpio archive
              0  long      0177555             Very old archive
              0  short     0177545             Old archive
              0  short     017437              Old packed data
              0  string    \037\036            Packed data
              0  string    \377\037            Compacted data
              0  string    \037\235            Compressed data
              >2 byte&0x80 >0                  Block compressed
              >2 byte&0x1f x                   %d bits
              0  string    \032\001            Compiled Terminfo Entry
              0  short     0433                Curses screen image
              0  short     0434                Curses screen image
              0  string    <ar>                System V Release 1 archive
              0  string    !<arch>\n__.SYMDEF  Archive random library
              0  string    !<arch>             Archive
              0  string    ARF_BEGARF          PHIGS clear text archive
              0  long      0x137A2950          Scalable OpenFont binary
              0  long      0x137A2951          Encrypted scalable OpenFont binary
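
       To show how a test line and its '>' continuation lines work together, the sketch below walks
       through the three "Compressed data" lines of the example by hand, reading the bytes with od
       instead of invoking file. The function name describe_compress is hypothetical, joining the
       messages with single spaces is an assumption, and byte 2 is read as an unsigned value for
       simplicity (the masked results are the same as for a signed char).

```shell
# describe_compress PATHNAME   (hypothetical helper)
# Hand-evaluates:  0 string \037\235 / >2 byte&0x80 >0 / >2 byte&0x1f x
describe_compress() {
    # Read up to three bytes as unsigned decimal numbers into $1, $2, $3.
    set -- $(od -An -tu1 -N3 "$1")
    # Offset 0: the bytes must be octal 037 and 235 (decimal 31 and 157).
    [ $# -ge 2 ] && [ "$1" -eq 31 ] && [ "$2" -eq 157 ] || return 1
    msg="Compressed data"
    # Continuation lines fail if the file is too short to hold byte 2.
    if [ $# -ge 3 ]; then
        # ">0": the masked value must be greater than zero.
        if [ $(( $3 & 0x80 )) -gt 0 ]; then
            msg="$msg Block compressed"
        fi
        # "x": succeeds whenever the byte exists; the message gets the value.
        msg="$msg $(printf '%d bits' $(( $3 & 0x1f )))"
    fi
    printf '%s\n' "$msg"
}
```

       For a file whose first three bytes are octal 037, 235, and 220, this prints
       "Compressed data Block compressed 16 bits".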

       The  use  of  a  basic  integer  data  type is intended to allow the implementation to choose a word size
       commonly used by applications on that architecture.

FUTURE DIRECTIONS

       None.

SEE ALSO

       ar, ls, pax

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form from IEEE Std 1003.1, 2003 Edition,
       Standard for Information Technology -- Portable Operating System Interface (POSIX), The Open  Group  Base
       Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of Electrical and Electronics Engineers,
       Inc  and  The  Open Group. In the event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group  Standard  is  the  referee  document.  The
       original Standard can be obtained online at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                                   2003                                               FILE(P)