Provided by: freebsd-manpages_12.2-1_all bug


     pNFS — NFS Version 4.1 Parallel NFS Protocol


     The NFSv4.1 client and server provides support for the pNFS specification; see Network File
     System (NFS) Version 4 Minor Version 1 Protocol RFC 5661.  A pNFS service separates
     Read/Write operations from all other NFSv4.1 operations, which are referred to as Metadata
     operations.  The Read/Write operations are performed directly on the Data Server (DS) where
     the file's data resides, bypassing the NFS server.  All other file operations are performed
     on the NFS server, which is referred to as a Metadata Server (MDS).  NFS clients that do not
     support pNFS perform Read/Write operations on the MDS, which acts as a proxy for the
     appropriate DS(s).

     The NFSv4.1 protocol provides two pieces of information to pNFS aware clients that allow
     them to perform Read/Write operations directly on the DS.

     The first is DeviceInfo, which is static information defining the DS server.  The critical
     piece of information in DeviceInfo for the layout types supported by FreeBSD is the IP
     address that is used to perform RPCs on the DS.  It also indicates which version of NFS the
     DS supports, I/O size and other layout specific information.  In the DeviceInfo, there is a
     DeviceID which, for the FreeBSD server is unique to the DS configuration and changes
     whenever the nfsd daemon is restarted or the server is rebooted.

     The second is the layout, which is per file and references the DeviceInfo to use via the
     DeviceID.  It is for a byte range of a file and is either Read or Read/Write.  For the
     FreeBSD server, a layout covers all bytes of a file.  A layout may be recalled by the MDS
     using a LayoutRecall callback.  When a client returns a layout via the LayoutReturn
     operation it can indicate that error(s) were encountered while doing I/O on the DS, at least
     for certain layout types such as the Flexible File Layout.

     The FreeBSD client and server supports two layout types.

     The File Layout is described in RFC5661 and uses the NFSv4.1 protocol to perform I/O on the
     DS.  It does not support client aware DS mirroring and, as such, the FreeBSD server only
     provides File Layout support for non-mirrored configurations.

     The Flexible File Layout allows the use of the NFSv3, NFSv4.0 or NFSv4.1 protocol to perform
     I/O on the DS and does support client aware mirroring.  As such, the FreeBSD server uses
     Flexible File Layout layouts for the mirrored DS configurations.  The FreeBSD server
     supports the “tightly coupled” variant and all DSs use the NFSv4.1 protocol for I/O
     operations.  Clients that support the Flexible File Layout will do writes and commits to all
     DS mirrors in the mirror set.

     A FreeBSD pNFS service consists of a single MDS server plus one or more DS servers, all of
     which are FreeBSD systems.  For a non-mirrored configuration, the FreeBSD server will issue
     File Layout layouts by default.  However that default can be set to the Flexible File Layout
     by setting the sysctl(1) sysctl “vfs.nfsd.default_flexfile” to one.  Mirrored server
     configurations will only issue Flexible File Layouts.  pNFS clients mount the MDS as they
     would a single NFS server.

     A FreeBSD pNFS client must be running the nfscbd(8) daemon and use the mount options

     When files are created, the MDS creates a file tree identical to what a single NFS server
     creates, except that all the regular (VREG) files will be empty.  As such, if you look at
     the exported tree on the MDS directly on the MDS server (not via an NFS mount), the files
     will all be of size zero.  Each of these files will also have two extended attributes in the
     system attribute name space:

           pnfsd.dsfile - This extended attrbute stores the information that the
               MDS needs to find the data file on a DS(s) for this file.
           pnfsd.dsattr - This extended attribute stores the Size, AccessTime,
               ModifyTime and Change attributes for the file.

     For each regular (VREG) file, the MDS creates a data file on one (or on N of them for the
     mirrored case, where N is the mirror_level) of the DS(s) where the file's data will be
     stored.  The name of this file is the file handle of the file on the MDS in hexadecimal at
     time of file creation.  The data file will have the same file ownership, mode and NFSv4 ACL
     (if ACLs are enabled for the file system) as the file on the MDS, so that permission
     checking can be done on the DS.  This is referred to as “tightly coupled” for the Flexible
     File Layout.

     For pNFS aware clients, the service generates File Layout or Flexible File Layout layouts
     and associated DeviceInfo.  For non-pNFS aware NFS clients, the pNFS service appears just
     like a normal NFS service.  For the non-pNFS aware client, the MDS will perform I/O
     operations on the appropriate DS(s), acting as a proxy for the non-pNFS aware client.  This
     is also true for NFSv3 and NFSv4.0 mounts, since these are always non-pNFS aware.

     It is possible to assign a DS to an MDS exported file system so that it will store data for
     files on the MDS exported file system.  If a DS is not assigned to an MDS exported file
     system, it will store data for files on all exported file systems on the MDS.

     If mirroring is enabled, the pNFS service will continue to function when DS(s) have failed,
     so long is there is at least one DS still operational that stores data for files on all of
     the MDS exported file systems.  After a disabled mirrored DS is repaired, it is possible to
     recover the DS as a mirror while the pNFS service continues to function.

     See pnfsserver(4) for information on how to set up a FreeBSD pNFS service.


     nfsv4(4), pnfsserver(4), exports(5), fstab(5), rc.conf(5), nfscbd(8), nfsd(8), nfsuserd(8),
     pnfsdscopymr(8), pnfsdsfile(8), pnfsdskill(8)


     Linux kernel versions prior to 4.12 only supports NFSv3 DSs in its client and will do all
     I/O through the MDS.  For Linux 4.12 kernels, support for NFSv4.1 DSs was added, but I have
     seen Linux client crashes when testing this client.  For Linux 4.17-rc2 kernels, I have not
     seen client crashes during testing, but it only supports the “loosely coupled” variant.  To
     make it work correctly when mounting the FreeBSD server, you must either patch the Flexible
     File Layout client driver with a patch like:


     or set the sysctl “vfs.nfsd.flexlinuxhack” to one so that it works around the Linux client
     driver's limitations.

     Since the MDS cannot be mirrored, it is a single point of failure just as a non pNFS server