
NAME

       fanotify - monitoring filesystem events

DESCRIPTION

       The  fanotify  API  provides notification and interception of filesystem events.  Use cases include virus
       scanning and hierarchical storage management.  Currently, only a limited set of events is supported.   In
       particular,  there  is no support for create, delete, and move events.  (See inotify(7) for details of an
       API that does notify those events.)

       Additional capabilities compared to the inotify(7) API include the ability to monitor all of the  objects
       in  a mounted filesystem, the ability to make access permission decisions, and the possibility to read or
       modify files before access by other applications.

       The following system calls are used with this API: fanotify_init(2), fanotify_mark(2), read(2), write(2),
       and close(2).

   fanotify_init(), fanotify_mark(), and notification groups
       The  fanotify_init(2)  system  call  creates and initializes an fanotify notification group and returns a
       file descriptor referring to it.

       An fanotify notification group is a kernel-internal object that holds a list of files,  directories,  and
       mount points for which events shall be created.

       For each entry in an fanotify notification group, two bit masks exist: the mark mask and the ignore mask.
       The mark mask defines file activities for which an event shall  be  created.   The  ignore  mask  defines
       activities  for which no event shall be generated.  Having these two types of masks permits a mount point
       or directory to be marked for receiving events, while at the  same  time  ignoring  events  for  specific
       objects under that mount point or directory.

       The  fanotify_mark(2)  system call adds a file, directory, or mount to a notification group and specifies
       which events shall be reported (or ignored), or removes or modifies such an entry.

       A possible usage of the ignore mask is for a file cache.   Events  of  interest  for  a  file  cache  are
       modification  of  a  file  and  closing of the same.  Hence, the cached directory or mount point is to be
       marked to receive these events.  After receiving the first event informing that a file has been modified,
       the  corresponding  cache entry will be invalidated.  No further modification events for this file are of
       interest until the file is closed.  Hence, the modify event can  be  added  to  the  ignore  mask.   Upon
       receiving  the close event, the modify event can be removed from the ignore mask and the file cache entry
       can be updated.
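
       The file cache scenario above might be sketched as follows.  The two helper names are hypothetical.  The
       sketch assumes that event_fd is the file descriptor delivered in the event, and relies on the behavior
       documented in fanotify_mark(2) that a NULL pathname makes the mark apply to the object referred to by
       the directory file descriptor.  FAN_MARK_IGNORED_SURV_MODIFY is added because, according to
       fanotify_mark(2), the ignore mask is otherwise cleared when a modify event occurs.

```c
#include <stddef.h>
#include <sys/fanotify.h>

/* Hypothetical helper: after the first FAN_MODIFY event for a file,
   stop reporting further modifications of that file. */

int
cache_ignore_modify(int fan_fd, int event_fd)
{
    return fanotify_mark(fan_fd,
                         FAN_MARK_ADD | FAN_MARK_IGNORED_MASK |
                         FAN_MARK_IGNORED_SURV_MODIFY,
                         FAN_MODIFY, event_fd, NULL);
}

/* Hypothetical helper: after the close event, remove FAN_MODIFY from
   the ignore mask so that modifications are reported again. */

int
cache_watch_modify(int fan_fd, int event_fd)
{
    return fanotify_mark(fan_fd,
                         FAN_MARK_REMOVE | FAN_MARK_IGNORED_MASK,
                         FAN_MODIFY, event_fd, NULL);
}
```

       Both helpers return the result of fanotify_mark(2), that is, 0 on success or -1 with errno set.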

       The entries in the fanotify notification groups refer to files and directories via their inode number and
       to  mounts  via  their mount ID.  If files or directories are renamed or moved within the same mount, the
       respective entries survive.  If files or directories are deleted or moved to another mount or  if  mounts
       are unmounted, the corresponding entries are deleted.

   The event queue
       As  events  occur  on  the  filesystem  objects  monitored  by  a notification group, the fanotify system
       generates events that are collected in a queue.  These events can then be read (using read(2) or similar)
       from the fanotify file descriptor returned by fanotify_init(2).

       Two  types  of  events are generated: notification events and permission events.  Notification events are
       merely informative and require no action to be taken by the receiving application except for closing  the
       file  descriptor  passed  in  the  event  (see  below).   Permission events are requests to the receiving
       application to decide whether permission for a file access shall  be  granted.   For  these  events,  the
       recipient must write a response which decides whether access is granted or not.

       An  event is removed from the event queue of the fanotify group when it has been read.  Permission events
       that have been read are kept in an internal list of the fanotify group until either a permission decision
       has been taken by writing to the fanotify file descriptor or the fanotify file descriptor is closed.

   Reading fanotify events
       Calling  read(2) for the file descriptor returned by fanotify_init(2) blocks (if the flag FAN_NONBLOCK is
       not specified in the call to  fanotify_init(2))  until  either  a  file  event  occurs  or  the  call  is
       interrupted by a signal (see signal(7)).

       After a successful read(2), the read buffer contains one or more of the following structures:

           struct fanotify_event_metadata {
               __u32 event_len;
               __u8 vers;
               __u8 reserved;
               __u16 metadata_len;
               __aligned_u64 mask;
               __s32 fd;
               __s32 pid;
           };

       For  performance reasons, it is recommended to use a large buffer size (for example, 4096 bytes), so that
       multiple events can be retrieved by a single read(2).

       The return value of read(2) is the number of bytes placed in the buffer, or -1 in case of an  error  (but
       see BUGS).

       The fields of the fanotify_event_metadata structure are as follows:

       event_len
              This  is  the  length  of  the  data for the current event and the offset to the next event in the
              buffer.  In the current implementation, the value of event_len is  always  FAN_EVENT_METADATA_LEN.
              However, the API is designed to allow variable-length structures to be returned in the future.

       vers   This   field   holds   a   version   number   for   the   structure.    It  must  be  compared  to
              FANOTIFY_METADATA_VERSION to verify that the structures returned at runtime match  the  structures
              defined  at compile time.  In case of a mismatch, the application should abandon trying to use the
              fanotify file descriptor.

       reserved
              This field is not used.

       metadata_len
              This is the length of the structure.  The field was introduced to facilitate the implementation of
              optional headers per event type.  No such optional headers exist in the current implementation.

       mask   This is a bit mask describing the event (see below).

       fd     This  is  an  open  file descriptor for the object being accessed, or FAN_NOFD if a queue overflow
              occurred.  The file descriptor can be used to  access  the  contents  of  the  monitored  file  or
              directory.  The reading application is responsible for closing this file descriptor.

              When  calling  fanotify_init(2),  the  caller may specify (via the event_f_flags argument) various
              file status flags that are to be set on the open file description that corresponds  to  this  file
              descriptor.  In addition, the (kernel-internal) FMODE_NONOTIFY file status flag is set on the open
              file description.  This flag suppresses fanotify event generation.  Hence, when  the  receiver  of
              the  fanotify  event  accesses  the  notified  file  or  directory  using this file descriptor, no
              additional events will be created.

       pid    This is the ID of the process that caused the event.  A program listening to fanotify  events  can
              compare this PID to the PID returned by getpid(2), to determine whether the event is caused by the
              listener itself, or is due to a file access by another process.
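
       The checks suggested for the vers and pid fields can be combined in a small helper.  This is only a
       sketch; the function name check_event() and its return convention are assumptions made for this
       illustration.

```c
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper. Returns:
     -1  if the metadata version does not match the compile-time version
      0  if the event was caused by the listener itself
      1  if the event was caused by another process */

int
check_event(const struct fanotify_event_metadata *metadata)
{
    if (metadata->vers != FANOTIFY_METADATA_VERSION)
        return -1;

    if (metadata->pid == getpid())
        return 0;

    return 1;
}
```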

       The bit mask in mask indicates which events have occurred for a single filesystem object.  Multiple  bits
       may  be  set  in  this  mask,  if  more  than one event occurred for the monitored filesystem object.  In
       particular, consecutive events for the same filesystem object and originating from the same  process  may
       be  merged  into  a single event, with the exception that two permission events are never merged into one
       queue entry.

       The bits that may appear in mask are as follows:

       FAN_ACCESS
              A file or a directory (but see BUGS) was accessed (read).

       FAN_OPEN
              A file or a directory was opened.

       FAN_MODIFY
              A file was modified.

       FAN_CLOSE_WRITE
              A file that was opened for writing (O_WRONLY or O_RDWR) was closed.

       FAN_CLOSE_NOWRITE
              A file or directory that was opened read-only (O_RDONLY) was closed.

       FAN_Q_OVERFLOW
              The event queue exceeded the limit of 16384 entries.  This limit can be overridden  by  specifying
              the FAN_UNLIMITED_QUEUE flag when calling fanotify_init(2).

       FAN_ACCESS_PERM
              An  application  wants  to read a file or directory, for example using read(2) or readdir(2).  The
              reader must write a response (as described below) that determines whether the permission to access
              the filesystem object shall be granted.

       FAN_OPEN_PERM
              An  application  wants  to  open  a  file  or  directory.   The  reader must write a response that
              determines whether the permission to open the filesystem object shall be granted.

       To check for any close event, the following bit mask may be used:

       FAN_CLOSE
              A file was closed.  This is a synonym for:

                  FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE
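
       Because several bits may be set at once, each bit should be tested separately.  The following sketch
       turns a mask into a readable string; the helper name describe_mask() and the '|' separator are choices
       made for this illustration only.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/fanotify.h>

static const struct {
    unsigned long long bit;
    const char *name;
} mask_names[] = {
    { FAN_ACCESS,        "FAN_ACCESS" },
    { FAN_OPEN,          "FAN_OPEN" },
    { FAN_MODIFY,        "FAN_MODIFY" },
    { FAN_CLOSE_WRITE,   "FAN_CLOSE_WRITE" },
    { FAN_CLOSE_NOWRITE, "FAN_CLOSE_NOWRITE" },
    { FAN_Q_OVERFLOW,    "FAN_Q_OVERFLOW" },
    { FAN_ACCESS_PERM,   "FAN_ACCESS_PERM" },
    { FAN_OPEN_PERM,     "FAN_OPEN_PERM" },
};

/* Hypothetical helper: write the names of all bits set in 'mask' to
   'out', separated by '|'. Output is truncated if 'out' is too small. */

void
describe_mask(unsigned long long mask, char *out, size_t size)
{
    size_t used = 0;
    int n;

    if (size == 0)
        return;
    out[0] = '\0';

    for (size_t i = 0; i < sizeof(mask_names) / sizeof(mask_names[0]); i++) {
        if (!(mask & mask_names[i].bit))
            continue;

        n = snprintf(out + used, size - used, "%s%s",
                     used > 0 ? "|" : "", mask_names[i].name);
        if (n < 0 || (size_t) n >= size - used)
            break;              /* Output was truncated */
        used += (size_t) n;
    }
}
```

       A test such as (mask & FAN_CLOSE) is nonzero for either kind of close event.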

       The following macros are provided to iterate over a buffer containing fanotify event metadata returned by
       a read(2) from an fanotify file descriptor:

       FAN_EVENT_OK(meta, len)
              This  macro  checks the remaining length len of the buffer meta against the length of the metadata
              structure and the event_len field of the first metadata structure in the buffer.

       FAN_EVENT_NEXT(meta, len)
              This macro uses the length indicated in the event_len field of the metadata structure  pointed  to
              by  meta  to  calculate  the address of the next metadata structure that follows meta.  len is the
              number of bytes of metadata that currently remain in the buffer.  The macro returns a  pointer  to
              the  next  metadata  structure  that  follows  meta, and reduces len by the number of bytes in the
              metadata structure that has been skipped over (i.e., it subtracts meta->event_len from len).

       In addition, there is:

       FAN_EVENT_METADATA_LEN
              This macro returns the size (in bytes) of the  structure  fanotify_event_metadata.   This  is  the
              minimum size (and currently the only size) of any event metadata.
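
       A minimal sketch of how these macros are typically combined is shown below.  The helper name
       consume_events() is hypothetical; buf and len are assumed to be the buffer and the return value of a
       successful read(2) from the fanotify file descriptor.

```c
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: walk all complete events in 'buf', close the
   file descriptor carried by each event, and return the number of
   events seen. */

int
consume_events(const void *buf, ssize_t len)
{
    const struct fanotify_event_metadata *metadata = buf;
    int count = 0;

    /* FAN_EVENT_OK() is false once fewer than event_len bytes remain */

    while (FAN_EVENT_OK(metadata, len)) {

        /* FAN_NOFD (a negative value) marks a queue overflow */

        if (metadata->fd >= 0)
            close(metadata->fd);

        count++;

        /* FAN_EVENT_NEXT() also subtracts event_len from 'len' */

        metadata = FAN_EVENT_NEXT(metadata, len);
    }

    return count;
}
```

       A real listener would inspect mask, and answer any permission event, before closing the event's file
       descriptor.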

   Monitoring an fanotify file descriptor for events
       When an fanotify event is available to be read, the fanotify file descriptor is reported as readable by
       epoll(7), poll(2), and select(2).
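
       Waiting for readability with poll(2) might look like the following sketch.  The helper name
       wait_readable() is hypothetical, and nothing in it is specific to fanotify: it works for any file
       descriptor that supports poll(2).

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: wait until 'fd' has data to read.
   Returns 1 if readable, 0 on timeout, -1 on error. If a signal
   interrupts the wait, poll() is restarted with the full timeout. */

int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready == -1 && errno == EINTR);

    if (ready <= 0)
        return ready;           /* 0: timeout, -1: error */

    return (pfd.revents & POLLIN) ? 1 : -1;
}
```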

   Dealing with permission events
       For permission events, the application must write(2) a structure of the following form  to  the  fanotify
       file descriptor:

           struct fanotify_response {
               __s32 fd;
               __u32 response;
           };

       The fields of this structure are as follows:

       fd     This is the file descriptor from the structure fanotify_event_metadata.

       response
              This  field  indicates  whether  or not the permission is to be granted.  Its value must be either
              FAN_ALLOW to allow the file operation or FAN_DENY to deny the file operation.

       If access is denied, the call made by the requesting application fails with the error EPERM.
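
       Writing a response could be wrapped as in the following sketch.  The helper name
       answer_permission_event() and its return convention are assumptions made for this illustration;
       event_fd is the fd field taken from the permission event being answered.

```c
#include <string.h>
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: allow (allow != 0) or deny (allow == 0) the
   access described by a permission event. Returns 0 if the complete
   response was written, -1 otherwise. */

int
answer_permission_event(int fan_fd, int event_fd, int allow)
{
    struct fanotify_response response;

    memset(&response, 0, sizeof(response));
    response.fd = event_fd;
    response.response = allow ? FAN_ALLOW : FAN_DENY;

    if (write(fan_fd, &response, sizeof(response)) !=
            (ssize_t) sizeof(response))
        return -1;

    return 0;
}
```

       The event's file descriptor still has to be closed by the listener after the response has been written.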

   Closing the fanotify file descriptor
       When all file descriptors referring to the fanotify notification group are closed, the fanotify group  is
       released  and  its  resources  are  freed for reuse by the kernel.  Upon close(2), outstanding permission
       events will be set to allowed.

   /proc/[pid]/fdinfo
       The file /proc/[pid]/fdinfo/[fd] contains information about fanotify marks  for  file  descriptor  fd  of
       process pid.  See proc(5) for details.
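
       For debugging, a listener can print this file for its own fanotify file descriptor.  The following
       sketch copies the file verbatim and makes no assumption about the format of its lines, which is
       documented in proc(5).  The helper name dump_fdinfo() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: copy /proc/self/fdinfo/<fd> to 'out'.
   Returns 0 on success, or -1 if the file cannot be opened. */

int
dump_fdinfo(int fd, FILE *out)
{
    char path[64];
    char line[256];
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL)
        fputs(line, out);

    fclose(fp);
    return 0;
}
```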

ERRORS

       In  addition  to  the  usual  errors  for  read(2),  the following errors can occur when reading from the
       fanotify file descriptor:

       EINVAL The buffer is too small to hold the event.

       EMFILE The per-process limit on the number of open files  has  been  reached.   See  the  description  of
              RLIMIT_NOFILE in getrlimit(2).

       ENFILE The  system-wide limit on the total number of open files has been reached.  See /proc/sys/fs/file-
              max in proc(5).

       ETXTBSY
              This error is returned by read(2) if  O_RDWR  or  O_WRONLY  was  specified  in  the  event_f_flags
              argument  when  calling  fanotify_init(2)  and  an  event  occurred  for  a monitored file that is
              currently being executed.

       In addition to the usual errors for write(2), the following errors can occur when writing to the fanotify
       file descriptor:

       EINVAL Fanotify  access  permissions are not enabled in the kernel configuration or the value of response
              in the response structure is not valid.

       ENOENT The file descriptor fd in the response structure is not valid.  This may occur when a response for
              the permission event has already been written.

VERSIONS

       The  fanotify  API  was  introduced  in version 2.6.36 of the Linux kernel and enabled in version 2.6.37.
       Fdinfo support was added in version 3.8.

CONFORMING TO

       The fanotify API is Linux-specific.

NOTES

       The fanotify API is available only if the kernel was built with the CONFIG_FANOTIFY configuration  option
       enabled.     In    addition,    fanotify    permission    handling    is    available    only    if   the
       CONFIG_FANOTIFY_ACCESS_PERMISSIONS configuration option is enabled.

   Limitations and caveats
       Fanotify reports only events that a user-space program triggers through the filesystem API.  As a result,
       it does not catch remote events that occur on network filesystems.

       The  fanotify  API  does  not  report  file accesses and modifications that may occur because of mmap(2),
       msync(2), and munmap(2).

       Events for directories are created only if the directory itself is opened, read, or closed.  Adding,
       removing, or changing children of a marked directory does not create events for the monitored directory
       itself.

       Fanotify monitoring of directories is  not  recursive:  to  monitor  subdirectories  under  a  directory,
       additional  marks  must  be created.  (But note that the fanotify API provides no way of detecting when a
       subdirectory has been created under a marked directory,  which  makes  recursive  monitoring  difficult.)
       Monitoring mounts offers the capability to monitor a whole directory tree.

       The event queue can overflow.  In this case, events are lost.

BUGS

       Before  Linux  3.19,  fallocate(2)  did  not  generate  fanotify  events.   Since  Linux  3.19,  calls to
       fallocate(2) generate FAN_MODIFY events.

       As of Linux 3.17, the following bugs exist:

       *  On Linux, a filesystem object may be accessible through multiple paths,  for  example,  a  part  of  a
          filesystem  may be remounted using the --bind option of mount(8).  A listener that marked a mount will
          be notified only of events that were triggered for a filesystem object  using  the  same  mount.   Any
          other event will pass unnoticed.

       *  When  an  event is generated, no check is made to see whether the user ID of the receiving process has
          authorization to read or write the file before passing a file descriptor for that file.  This poses  a
          security risk, when the CAP_SYS_ADMIN capability is set for programs executed by unprivileged users.

       *  If a call to read(2) processes multiple events from the fanotify queue and an error occurs, the return
          value will be the total length of the events successfully copied to the user-space buffer  before  the
          error  occurred.   The  return  value  will  not  be -1, and errno will not be set.  Thus, the reading
          application has no way to detect the error.

EXAMPLE

       The following program demonstrates the usage of the fanotify API.  It marks the mount point passed as a
       command-line argument and waits for events of type FAN_OPEN_PERM and FAN_CLOSE_WRITE.  When a permission
       event occurs, a FAN_ALLOW response is given.

       The following output was recorded while editing the file  /home/user/temp/notes.   Before  the  file  was
       opened,  a  FAN_OPEN_PERM  event  occurred.  After the file was closed, a FAN_CLOSE_WRITE event occurred.
       Execution of the program ends when the user presses the ENTER key.

   Example output
           # ./fanotify_example /home
           Press enter key to terminate.
           Listening for events.
           FAN_OPEN_PERM: File /home/user/temp/notes
           FAN_CLOSE_WRITE: File /home/user/temp/notes

           Listening for events stopped.

   Program source

       #define _GNU_SOURCE     /* Needed to get O_LARGEFILE definition */
       #include <errno.h>
       #include <fcntl.h>
       #include <limits.h>
       #include <poll.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <sys/fanotify.h>
       #include <unistd.h>

       /* Read all available fanotify events from the file descriptor 'fd' */

       static void
       handle_events(int fd)
       {
           const struct fanotify_event_metadata *metadata;
           struct fanotify_event_metadata buf[200];
           ssize_t len;
           char path[PATH_MAX];
           ssize_t path_len;
           char procfd_path[PATH_MAX];
           struct fanotify_response response;

           /* Loop while events can be read from fanotify file descriptor */

           for(;;) {

               /* Read some events */

               len = read(fd, (void *) &buf, sizeof(buf));
               if (len == -1 && errno != EAGAIN) {
                   perror("read");
                   exit(EXIT_FAILURE);
               }

               /* Check if end of available data reached */

               if (len <= 0)
                   break;

               /* Point to the first event in the buffer */

               metadata = buf;

               /* Loop over all events in the buffer */

               while (FAN_EVENT_OK(metadata, len)) {

                   /* Check that run-time and compile-time structures match */

                   if (metadata->vers != FANOTIFY_METADATA_VERSION) {
                       fprintf(stderr,
                               "Mismatch of fanotify metadata version.\n");
                       exit(EXIT_FAILURE);
                   }

                   /* metadata->fd contains either FAN_NOFD, indicating a
                      queue overflow, or a file descriptor (a nonnegative
                      integer). Here, we simply ignore queue overflow. */

                   if (metadata->fd >= 0) {

                       /* Handle open permission event */

                       if (metadata->mask & FAN_OPEN_PERM) {
                           printf("FAN_OPEN_PERM: ");

                           /* Allow file to be opened */

                           response.fd = metadata->fd;
                           response.response = FAN_ALLOW;
                           if (write(fd, &response,
                                     sizeof(struct fanotify_response)) == -1)
                               perror("write");
                       }

                       /* Handle closing of writable file event */

                       if (metadata->mask & FAN_CLOSE_WRITE)
                           printf("FAN_CLOSE_WRITE: ");

                       /* Retrieve and print pathname of the accessed file */

                       snprintf(procfd_path, sizeof(procfd_path),
                                "/proc/self/fd/%d", metadata->fd);
                       path_len = readlink(procfd_path, path,
                                           sizeof(path) - 1);
                       if (path_len == -1) {
                           perror("readlink");
                           exit(EXIT_FAILURE);
                       }

                       path[path_len] = '\0';
                       printf("File %s\n", path);

                       /* Close the file descriptor of the event */

                       close(metadata->fd);
                   }

                   /* Advance to next event */

                   metadata = FAN_EVENT_NEXT(metadata, len);
               }
           }
       }

       int
       main(int argc, char *argv[])
       {
           char buf;
           int fd, poll_num;
           nfds_t nfds;
           struct pollfd fds[2];

           /* Check mount point is supplied */

           if (argc != 2) {
               fprintf(stderr, "Usage: %s MOUNT\n", argv[0]);
               exit(EXIT_FAILURE);
           }

           printf("Press enter key to terminate.\n");

           /* Create the file descriptor for accessing the fanotify API */

           fd = fanotify_init(FAN_CLOEXEC | FAN_CLASS_CONTENT | FAN_NONBLOCK,
                              O_RDONLY | O_LARGEFILE);
           if (fd == -1) {
               perror("fanotify_init");
               exit(EXIT_FAILURE);
           }

           /* Mark the mount for:
              - permission events before opening files
              - notification events after closing a write-enabled
                file descriptor */

           if (fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_MOUNT,
                             FAN_OPEN_PERM | FAN_CLOSE_WRITE, AT_FDCWD,
                             argv[1]) == -1) {
               perror("fanotify_mark");
               exit(EXIT_FAILURE);
           }

           /* Prepare for polling */

           nfds = 2;

           /* Console input */

           fds[0].fd = STDIN_FILENO;
           fds[0].events = POLLIN;

           /* Fanotify input */

           fds[1].fd = fd;
           fds[1].events = POLLIN;

           /* This is the loop to wait for incoming events */

           printf("Listening for events.\n");

           while (1) {
               poll_num = poll(fds, nfds, -1);
               if (poll_num == -1) {
                   if (errno == EINTR)     /* Interrupted by a signal */
                       continue;           /* Restart poll() */

                   perror("poll");         /* Unexpected error */
                   exit(EXIT_FAILURE);
               }

               if (poll_num > 0) {
                   if (fds[0].revents & POLLIN) {

                       /* Console input is available: empty stdin and quit */

                       while (read(STDIN_FILENO, &buf, 1) > 0 && buf != '\n')
                           continue;
                       break;
                   }

                   if (fds[1].revents & POLLIN) {

                       /* Fanotify events are available */

                       handle_events(fd);
                   }
               }
           }

           printf("Listening for events stopped.\n");
           exit(EXIT_SUCCESS);
       }

SEE ALSO

       fanotify_init(2), fanotify_mark(2), inotify(7)

COLOPHON

       This page is part of release 4.15 of the Linux man-pages project.  A description of the project,
       information about reporting bugs, and the latest version of this page, can be found at
       https://www.kernel.org/doc/man-pages/.