Provided by: makepp_2.0.98.5-2.1_all

NAME

       makepp_build_cache -- How to set up and use build caches

DESCRIPTION

       C: clean, create,  M: makepp_build_cache_control, mppbcc,  S: show, stats

       A build cache is a directory containing copies of previous targets that makepp already
       built.  When makepp is asked to build a new target, it sees if it has already built it
       somewhere else under the same conditions, and if so, simply links or copies it instead of
       rebuilding it.

       A build cache can be useful in the following circumstances:

       •   You are working on a program and you compile it optimized.  Then you discover a bug,
           and recompile the whole thing in debug mode.  You find the bug and you now want to
           recompile it in optimized mode.  Most of the files will be identical.  If you used a
           build cache in all of your compilations, makepp will simply pull the unchanged files
           out of the build cache rather than recompiling them.

           A similar situation is if you normally work on one architecture but briefly switch to
           a different architecture, and then you switch back.  If the old files are still in the
           build cache, makepp will not have to recompile anything.

       •   You have checked out several copies of a particular program from your version control
           system, and have made different changes to each directory hierarchy.  (E.g., you are
           solving different bugs in different directory hierarchies.)  Most of the files will be
           identical in the two directory hierarchies.  If you build both with a build cache, the
           build in the second directory hierarchy will be able to simply copy the files from the
           build cache rather than recompiling files that are the same.

       •   You have several developers working on the same set of sources.  Each developer is
           making changes, but most of the files are identical between developers.  If all the
           developers share a build cache, then if one developer's build compiles a file, any
           other developer's build which has to compile the identical file (with the same
           includes, etc.) can just copy the cached file instead of rerunning the compilation.

       A build cache can help if all of the following are true:

       •   You have plenty of disk space.  Usually makepp will wind up caching many copies of
           each file that is changing, because it has no idea which ones will actually be used.
           You can turn off the build cache for certain files, but if the build cache is going to
           be useful at all, it will probably have to have a lot of files in it.

       •   Your files take noticeably longer to build than to copy.  If the build cache is on the
           same file system, makepp will try to use hard links rather than copying the file.
           Makepp has to link or copy the file into the cache when the file is built, and then it
           has to link or copy the file from the cache when it is required again.  Furthermore,
           there is a small overhead involved in checking whether the needed file is actually in
           the build cache, and copying the build information about the file as well as the file
           itself.

           You may find, for example, that using a build cache isn't worth it for compiling very
           small modules.  It's almost certainly not worth it for commands to make a static
           library (an archive file, libxyz.a), except if you use links to save disk space.

       •   There is a high probability that some files will be needed again in another
           compilation.  If you are only compiling a piece of software once, build caches can
           only slow things down.

       Using a build cache requires a little bit of setup and maintenance work.  Please do not
       try using a build cache until you understand how they work, how to create them, and how to
       keep them from continually growing and eating up all of the available disk space on your
       system.

   How a build cache works
       If you enable a build cache, every time a file is built, makepp stores a copy away in a
       build cache.  The name of the file is a key that is a hash of the checksums of all the
       inputs and the build command and the architecture.  The next time makepp wants to rebuild
       the file, it sees if there is a file with the same checksums already in the build cache.
       If so, the file is copied out of the build cache.
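
       As a rough illustration of such a lookup key, the sketch below combines the build
       command, the architecture and a checksum of each input into one value.  It is not
       makepp's actual algorithm: the function name cache_key is hypothetical, and cksum
       merely stands in for the real hash.

```shell
# Sketch only: derive a lookup key from the build command, the architecture
# and the contents of all inputs.  cksum stands in for makepp's real hash.
cache_key() {
    cmd=$1
    arch=$2
    shift 2
    {
        printf '%s\n%s\n' "$cmd" "$arch"
        for input in "$@"; do
            cksum < "$input"    # depends on the contents only, not on the path
        done
    } | cksum | cut -d' ' -f1
}
```

       Two identical sources in different directories give the same key, while changing the
       command (say, from optimized to debug flags) gives a different one.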

       For efficiency, if the build cache is located on the same file system as the build, makepp
       will not actually copy the file; instead, it will make a hard link.  This is faster and
       doesn't use up any extra disk space.  Similarly, when makepp wants to pull a file out of
       the build cache, it will use a hard link if possible, or copy it if necessary.
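
       The link-or-copy behaviour can be pictured with the following sketch.  The helper
       name link_or_copy is hypothetical and the code only mirrors the idea described above,
       not makepp's implementation.

```shell
# Sketch only: place a file at a second location, preferring a hard link
# and falling back to a copy when linking fails (e.g. across file systems).
link_or_copy() {
    ln "$1" "$2" 2>/dev/null || cp -p "$1" "$2"
}
```

       On the same file system the link costs no extra disk space; otherwise the copy does.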

       WARNING: Makepp never deletes files from a build cache unless it is explicitly asked.
       This means that your build caches will continue to grow without bounds unless you clean
       them up periodically (see below for details).

       Build caches and repositories

       Build caches and repositories (see makepp_repositories) can solve similar problems.  For
       some situations, a repository is more appropriate, while for others, a build cache is more
       appropriate.

       You can also combine the two.  If you have a huge directory structure with lots of
       sources, which you don't want every developer to have a copy of, then you can provide them
       as a repository.  The produced files, with varying debug options and so forth, can then be
       managed more flexibly through a build cache.

       The key differences between a build cache and a repository are:

       •   A build cache can only store files created by the build procedure.  A repository can
           also have original source files.

       •   Files in a repository should not change during the course of a build.  A build cache
           does not have any such restriction.

       •   Files in a repository must be present in the same relative position as the files in
           the build directory.  E.g., if makepp needs the file subdir1/subdir2/xyz.abc, then it
           only looks at repository_root/subdir1/subdir2/xyz.abc.  Files in a build cache have
           lost all directory hierarchy information, and are looked up only based on the inputs
           and the command that were required to produce them.

       •   Files in a repository are soft-linked into their new locations in the build
           directories.  Files in a build cache are either copied or hard-linked into their new
           locations.  If a copy is necessary, a repository will certainly be faster.

       •   Build caches cost a bit of time to put files into them.  A repository does not have
           any extra cost (for the current run, that is, there was of course the cost of creating
           it beforehand), but often requires a bit more advance planning.

       In general, a repository is more useful if you have a single central build that you want
       all developers to take files from.  A build cache is what you want if you have a
       decentralized system where one developer should borrow compiled files from any other
       developer.

       Both build caches and repositories can help with variant builds.  For example, if you want
       to compile all your sources optimized, then again with debugging, then again optimized,
       you can avoid recompiling all the optimized files again by using either a repository or a
       build cache.  To do this with a repository, you have to think ahead and explicitly tell
       makepp to use a repository for the debugging compilation, or else it will wipe out your
       initial optimized compilation.  With a build cache, makepp goes ahead and wipes out the
       initial optimized compilation but can get it back quickly.

   Build cache grouping
       A group is a loose coupling of build caches.  It is loose in the sense that makepp doesn't
       deal with it, so as to not slow down its build cache management.  To benefit from this you
       have to use the offline utility.  Notably the "clean" command also performs the
       replication.  If you give an unrealistic cleaning criterion, like "--mtime=+1000", no
       cleaning occurs, only replication.

       Grouping allows sharing files with more people, especially if you have your build caches
       on the developers' disks, to benefit from hard linking, which saves submission time and
       disk space.  Hard linking alone, however, is restricted to per disk benefits.

       With grouping the file will get replicated at some time after makepp submitted it to the
       build cache.  This means that the file will get created only once for all disks together.

       On file systems which allow hard linking to symbolic links -- which seems restricted to
       Linux and Solaris -- the file will additionally be physically present on one disk only.
       Additionally it remains on each disk it got created on before you replicated, but only as
       long as it is in use on those disks.  In this scenario with symlinks you may choose one or
       more file systems on which you prefer your files to physically reside.  Be aware that
       successfully built files may become unavailable, if the disk they are on physically goes
       offline.  Rebuilding will remedy this, and the impact can be lessened by spreading the
       files over several preferred disks.

       Replication has several interesting uses:

       NFS (possible with copying too)
           You have a central NFS server which provides the preferred build cache.  Each machine
           and developer disk has a local build cache for fast submission.  You either mount back
           all the developer disks to the NFS server, and perform the replication and cleaning
           centrally, or you replicate locally on each NFS client machine, treating only the part
           of the group visible there.

       Unsafe disk (possible with copying too)
           If you compile on a RAM disk (hopefully editing your sources in a repository on a safe
           disk), you can make the safe disks be the preferred ones.  Then replication will
           migrate the files to the safe disks, where they survive a reboot.  After every reboot
           you will have to recreate the RAM disk build cache and add it to the group (which will
           give a warning, harmless in this case, because the other group members still remember
           it).

       Full disk (hard linking to symbolic links only)
           If one of your disks is notoriously full, you can make the build caches on all the
           other disks be preferred.  That way replication will migrate the files away from the
           full disk, randomly to any of the others.

   How to use a build cache
       How to tell makepp to use the build cache

       Once the build cache has been created, it is now available to makepp.  There are several
       options you can specify during creation; see "How to manage a build cache" for details.

       A build cache is specified with the --build-cache command line option, with the
       build_cache statement within a makefile, or with the :build_cache rule modifier.

       The most useful ways that I have found so far to work with build caches are:

       •   Set the build cache path in the environment variable MAKEPPFLAGS, like this (first
           variant for Korn Shell or bash, second for csh):

               export MAKEPPFLAGS=--build-cache=/path/to/build/cache
               setenv MAKEPPFLAGS --build-cache=/path/to/build/cache

           Now every build that you run will always use this build cache, and you don't need to
           modify anything else.

       •   Specify the build cache in your makefiles with a line like this:

               BUILD_CACHE := /path/to/build_cache
               build_cache $(BUILD_CACHE)

           You have to put this in all makefiles that use a build cache (or in a common include
           file that all the makefiles use).  Or put this into your RootMakeppfile:

               BUILD_CACHE := /path/to/build_cache
               global build_cache $(BUILD_CACHE)

           On a multiuser machine you might set up one build cache per home disk to take
           advantage of links.  You might find it more convenient to use a statement like this:

               build_cache $(find_upwards our_build_cache)

           which searches upwards from the current directory in the current file system until it
           finds a directory called our_build_cache.  This can be the same statement for all
           users and still individually point to the cache on their disk.

           Solaris 10 can do some fancy remounting of home directories.  Your home will
           apparently be a mount point of its own, called /home/$LOGNAME, when in fact it is on
           one of the /export/home* disks alongside those of other users.  Because it's not
           really a separate filesystem, links still work.  But you can't search upwards.
           Instead you can do:

               BUILD_CACHE := ${makeperl </export/home*/$(LOGNAME)/../makepp_bc>}

       Build caches and signatures

       Makepp looks up files in the build cache according to their signatures.  If you are using
       the default signature method (file date + size), makepp will only pull files out of the
       build cache if the file date of the input files is identical.  Depending on how your build
       works, the file dates may never be identical.  For example, if you check files out into
       two different directory hierarchies, the file dates are likely to be the time you checked
       the files out, not the time the files were checked in (depending, of course, on your
       version control software).

       What you probably want is to pull files out of the build cache if the file contents are
       identical, regardless of the date.  If this is the case, you should be using some sort of
       a content-based signature.  Makepp does this by default for C and C++ compilations, but it
       uses file dates for any other kinds of files (e.g., object files, or any other files in
       the build process not specifically recognized as a C source or include file).  If you want
       other kinds of files to work with the build cache (i.e., if you want it to work with
       anything other than C/C++ compilation commands), then you could put a statement like this
       somewhere near the top of your makefile:

           signature md5

       to force makepp to use signatures based on the content of files rather than their date.

       How not to cache certain files

       There may be certain files that you know you will never want to cache.  For example, if
       you embed a datestamp into a file, you know that you will never under any circumstances
       want to fetch a previous copy of the file out of the build cache, because the date stamp
       is different.  In this case, it is just a waste of time and disk space to copy it into the
       build cache.

       Or, you may think it is highly unlikely that you will want to cache the final executable.
       You might want to cache individual objects or shared objects that go into making the
       executable, but it's often pretty unlikely that you will build an exactly identical
       executable from identical inputs.  Again, in this case, using a build cache is a waste of
       disk space and time, so it makes sense to disable it.

       Sometimes a file may be extremely quick to generate, and it is just a waste to put it into
       the build cache since it can be generated as quickly as copied.  You may want to
       selectively disable caching of these files.

       You can turn off the build cache for specific rules by specifying ": build_cache none" in
       a rule, like this:

           our_executable: dateStamp.o main.o */*.so
               : build_cache none
               $(CC) $(LDFLAGS) $(inputs) -o $(output)

       This flag means that any outputs from this particular rule will never be put into the
       build cache, and makepp will never try to pull them out of the build cache either.

   How to manage a build cache
       makepp_build_cache_control command ...
       mppbcc command ...

       makepp_build_cache_control, or mppbcc for short, is a utility that administers build
       caches for makepp.  What it does is determined by the first word of its arguments.

       In fact this little script is a wrapper to the following command, which you might want to
       call directly in your cron jobs, where the path to "makeppbuiltin" might be needed:

           makeppbuiltin -MMpp::BuildCacheControl command ...

       You can also use these commands from a makefile after loading them, with a "&"-prefix as
       follows for the example of "create":

           perl { use Mpp::BuildCacheControl } # It's a Perl module, so use instead of include.

           my_cache:
               &create $(CACHE_OPTIONS) $(output) # Call a loaded builtin.

           build_cache $(prebuild my_cache)

       The valid commands, which also take a few of the standard options described in
       makepp_builtins, are:

       create [option ...] path/to/cache ...
           Creates the build caches with the given options.  Valid options are:

           Standard options: "-A, --args-file, --arguments-file=filename, -v, --verbose"

           -e group
           --extend=group
           --extend-group=group
               Add the new build cache to the "group".  This may have been a single stand alone
               build cache up to now.

           -f
           --force
               This allows creating the cache even if path/to/cache already exists.  If it is
               a file, it gets deleted.  If it is a directory, it gets reused, with whatever
               content it has.

           -p
           --preferred
               This option is only meaningful if the group contains build caches that allow
               hard linking to symlinks.  In that case cleaning will migrate the members to the
               preferred disk.  You may create several caches within a group with this option, in
               which case the files will be migrated randomly to them.

           -s n1,n2,...
           --subdir-chars=n1,n2,...
               Controls how many levels of subdirectories are created to hold the cached files,
               and how many files will be in each subdirectory.  The first n1 characters of the
               filename form the top level directory name, and the characters from n1 to n2 form
               the second level directory name, and so on.

               Files in the build cache are named using MD5 hashes of data that makepp uses, so
               each filename is 22 base64 digits plus the original filename.  If a build cache
               file name is 0123456789abcdef012345_module.o, it is actually stored in the build
               cache as 01/23/456789abcdef012345_module.o if you specify "--subdir-chars 2,4".
               In fact, "--subdir-chars 2,4" is the default, which is for a gigantic build cache
               of at most 4096 directories holding 16777216 subdirectories in total (64 possible
               base64 digits per character).  Even "--subdir-chars 1,2" or "--subdir-chars 1"
               will get you quite far.  On a file system optimized for huge directories you
               might even say "-s ''" or "--subdir-chars=" to store all files at the top level.

           -m perms
           --mode=perms
           --access-permissions=perms
               Specifies the directory access permissions when files are added to the build
               cache.  If you want other people to put files in your build cache, you must make
               it group or world writable.  Permissions must be specified using octal notation.

               As these are directory permissions, if you grant any access, you must also grant
               execute access, or you will get a bunch of weird failures.  I.e. 0700 means that
               only this user may have access to this build cache. 0770 means that this user and
               anyone in the group may have write access to the build cache.  0777 means that
               anyone may have access to the build cache.  The sensible octal digits are 7
               (write), 5 (read) or 0 (none).  3 (write) or 1 (read) is also possible, allowing
               the cache to be used, but not to be browsed, i.e. it would be harder for a
               malicious user to find file names to manipulate.

               In a group of build caches each one has its own value for this, so you can enforce
               different write permissions on different disks.

               If you don't specify the permissions, your umask permissions at creation time
               apply throughout the lifetime of the build cache.
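
           To make the "--subdir-chars" mapping described above concrete, here is a small
           sketch that splits a flat cache file name into its stored path.  The helper name
           bc_subdir_path is hypothetical; it only reproduces the rule from the text and is
           not part of makepp.

```shell
# Sketch only: map a flat cache file name to its stored path for a given
# --subdir-chars list such as "2,4" (an empty list keeps the name flat).
bc_subdir_path() {
    name=$1
    cuts=$2
    path=
    prev=0
    old_ifs=$IFS
    IFS=,
    for n in $cuts; do
        # Characters prev+1 through n form the next directory level.
        path="$path$(printf '%s' "$name" | cut -c"$((prev + 1))-$n")/"
        prev=$n
    done
    IFS=$old_ifs
    # Whatever follows the last cut position is the file name on disk.
    printf '%s%s\n' "$path" "$(printf '%s' "$name" | cut -c"$((prev + 1))-")"
}

bc_subdir_path 0123456789abcdef012345_module.o 2,4
# prints 01/23/456789abcdef012345_module.o
```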

       clean [option ...] /path/to/cache ...
           Cleans up the cache.  Makepp never deletes files from the build cache; it is up to you
           to delete the files with this command.  For multiuser caches the sysop can do this.

           Only files with a link count of 1 are deleted (because otherwise, the file doesn't get
           physically deleted anyway -- you'd just uncache a file which someone is apparently
           still interested in, so somebody else might be too).  The criteria you give pertain to
           the actual cached files.  Each build info file will be deleted when its main file is.
           No empty directories will be left.  Irrespective of the link count and the options you
           give, any file that does not match its build info file will be deleted, if it is older
           than a safety margin of 10 minutes.
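
           The link count criterion can be explored with plain "find".  The sketch below is
           only an approximation for getting a feel for the rule: the function name
           unlinked_entries is hypothetical, and it lists every regular file with a link
           count of 1, including any bookkeeping files that live under the directory.  The
           "show" command remains the authoritative view.

```shell
# Sketch only: list regular files below a directory that have a link count
# of 1, i.e. files nobody has hard linked from outside.
unlinked_entries() {
    find "$1" -type f -links 1
}
```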

           The following options take a time specification as an argument.  Time specs start with
           a "+" meaning longer ago, a "-" meaning more recently or nothing meaning between the
           number you give, and one more.  Numbers, which may be fractional, are by default days.
           But they may be followed by one of the letters "w" (weeks), "d" (days, the default),
           "h" (hours), "m" (minutes) or "s" (seconds).  Note that days are simply 24 real hours
           ignoring any change between summer and winter time.  Examples:

               1           between 24 and 48 hours ago
               24h         between 24 and 25 hours ago
               0.5d        between 12 and 36 hours ago
               1w          between 7 and 14 times 24 hours ago
               -2          less than 48 hours ago
               +30m        more than 30 minutes ago
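
           The rules above can be restated as a small sketch that turns a time spec into an
           age range in seconds.  This is an interpretation of the description and the
           examples, not makepp's own parser, and the function name timespec_range is
           hypothetical.

```shell
# Sketch only: print the age range "MIN MAX" in seconds that a time spec
# selects, following the rules above ("inf" means unbounded).
timespec_range() {
    printf '%s\n' "$1" | awk '
    {
        spec = $0
        sign = ""
        if (spec ~ /^[+-]/) { sign = substr(spec, 1, 1); spec = substr(spec, 2) }
        unit = "d"                                  # days are the default unit
        if (spec ~ /[wdhms]$/) {
            unit = substr(spec, length(spec))
            spec = substr(spec, 1, length(spec) - 1)
        }
        secs["w"] = 604800; secs["d"] = 86400; secs["h"] = 3600
        secs["m"] = 60;     secs["s"] = 1
        n = spec * secs[unit]
        if (sign == "+")      print n, "inf"        # longer ago than n
        else if (sign == "-") print 0, n            # more recently than n
        else                  print n, n + secs[unit]   # between n and one more unit
    }'
}

timespec_range 24h     # prints 86400 90000
timespec_range 0.5d    # prints 43200 129600
```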

           All the following options are combined with "and".  If you want several sets of
           combinations with "or", you must call this command repeatedly with different sets of
           options.  Do the ones where you expect the most deletions first, then the others can
           be faster.

           Standard options: "-A, --args-file, --arguments-file=filename, -v, --verbose"

           -a spec
           --atime spec
           --access-time spec
               The last time the file was read.  For a linked file this can happen anytime.
               Otherwise this is the last time the file was copied.  On badly behaved systems
               this could also be the last tape backup or search index creation time.  You could
               try to exclude the cache from such operations.

               Some file systems do not support the atime field, and even if the file system
               does, sometimes people turn off access time on their file systems because it adds
               a lot of extra disk I/O which can be harmful on battery powered notebooks, or in
               disk speed optimization.  (But this is potentially fixable -- see the
               UTIME_ON_IMPORT comment in Mpp/BuildCache.pm.)

           -b
           --blend
           --blend-groups
               Usually each /path/to/cache you specify will separately treat the group of build
               caches it belongs to.  Each group gets treated only once, even if you specify
               several paths from the same group.  With this option you temporarily blend all
               the groups you specify into one group.

               Doing this for clean may have unwanted effects if you can hard link to symlinks,
               because it may migrate members from one group to another.  Subsequent non-blended
               cleans may then clean them from the original group prematurely.

           -c spec
           --ctime spec
           --change-time spec
               The last change time of the file's inode.  In a linking situation this could be
               the time when the last user recreated the file differently, severing his link to
               the cache.  This could also be the time the "--set-user" option below had to
               change the user.  On well behaved systems this could also be the time when the
               last tape backup or search index creation covered its marks by resetting the
               atime.

           -m spec
           --mtime spec
           --modification-time spec
               The last modification time of the file.  As explained elsewhere it is discouraged
               to have makepp update a file.  So the last modification will usually be the time
               of creation.  (But in the future makepp may optionally update the mtime when
               deleting files.  This is so that links on atime-less filesystems or copies can be
               tracked.)

           -g group
           --newgrp=group
           --new-group=group
               Set the effective and real group id to group (name or numeric).  Only root may be
               able to do this.  This is needed when you use grouped build caches, and you
               provide write access to the caches based on group id.  Usually that will not be
               root's group and thus replication would create unwritable directories without this
               option.

               This option is named after the equivalent utility "newgrp" which alas can't easily
               be used in "cron" jobs or similar setups.

           -i
           --build-info
           --build-info-check
               Check that the build info matches the member.  This test is fairly expensive so
               you might consider not giving this option in the daytime.

           -l
           --symlink-check
           --symbolic-link-check
               This option makes "clean" read every symbolic link which has no external hard
               links to verify that it points to the desired member.  As this is somewhat
               expensive, it is suggested doing this only at night.

           -M spec
           --in-mtime spec
           --incoming-modification-time spec
               The last modification time for files in the incoming directory.  This directory is
               used for temporary files with process-specific names that can be written free of
               concurrent access and then renamed into the active part of the cache atomically.
               Files normally live here only for as long as it takes to write them, but they can
               get orphaned if the process that is writing them terminates abnormally before it
               can remove them.  This part of the cache is cleaned first, because the link counts
               in the active part of the cache can be improperly affected by orphaned files.

               The timespec for "--incoming-modification-time" must begin with "+", and defaults
               to "+2h" (files at least 2 hours old are assumed to have been orphaned).

           -w
           --workdays
               This influences how the time options count.  Weekends are ignored, as though they
               weren't there.  An exception is if you give this option on a weekend.  Then that
               weekend counts normally.  So you can use it in cronjobs that run from Tuesday
               through Saturday.  Summertime is ignored.  So summer weekends can go from Saturday
               1:00 to Monday 1:00, or southern hemisphere winter weekends from Friday 23:00 to
               Sunday 23:00 or however much your timezone changes the time.  Holidays are also
               not taken into account.

           -p perlcode
           --perl=perlcode
           --predicate=perlcode
               TODO: adapt this description to group changes!

               This is the Swiss officer's knife.  The perlcode is called in scalar context once
               for every cache entry (i.e. excluding directories and metainfo files).  It is
               called in a "File::Find" "wanted" function, so see there for the variables you can
               use.  An "lstat" has been performed, so you can use the "_" filehandle.

               If perlcode returns "undef" it is as if it weren't there, that is the other
               options decide.  If it returns true the file is deleted.  If it returns false, the
               file is retained.

           -s spec
           --size spec
               The file size specification works just like time specifications, with "+" for
               bigger than or "-" for smaller than, except that the units must be "c" (bytes, the
               default), "k" (kilobytes), "M" (megabytes) or "G" (gigabytes).

           -u user
           --user=user
           --set-user=user
               This option is very different.  It does not say when to delete a file.  Instead it
               applies to the files that do not get deleted.  Note that on many systems only root
               is allowed to set the user of a file.  See under "Caveats working with build
               caches" why you might need to change ownership to some neutral user if you use
               disk quotas.

               This strategy only works if you can trust your users not to subvert the build
               cache for storing arbitrary (i.e. non-development) files beyond their disk quota.
               The ownership of the associated metadata file is retained, so you can always see
               who cached a file.  If you need this option, it might need to be given several
               times during the daytime.

           There are different possible strategies, depending on how much space you have and on
           whether the build cache contains linked files or whether users only have copies.
           Several strategies can be combined, by calling them one after another or at different
           times.  The "show" command is meant to help you find an appropriate strategy.

           A nightly (from Tuesday through Saturday) run might specify "--atime +2" (or "--mtime"
           if you don't have atime), deleting all files no one has read for two days.

           If you use links, you can also prevent fast useless growth which occurs when
           successive header changes, which never get version controlled, lead to lots of objects
           being rapidly created.  Something like an hourly run with "--mtime=-2h --ctime=+1h"
           during the daytime will catch the files that their creator discarded within less
           than an hour of building them and that nobody else has wanted since.
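
           The two strategies just described could be wrapped for scheduled runs roughly as
           follows.  This is a sketch under assumptions: the variable names MPPBCC and CACHE
           and the function names are made up for illustration, and the cache path is a
           placeholder.  Only options documented above are used.

```shell
# Sketch only: cleaning strategies from the text, wrapped for scheduled runs.
# MPPBCC and CACHE are assumed names; adjust them to your installation.
MPPBCC=${MPPBCC:-mppbcc}
CACHE=${CACHE:-/path/to/cache}

# Nightly, Tuesday through Saturday: drop files nobody has read for two days.
nightly_clean() {
    "$MPPBCC" clean --atime +2 "$CACHE"
}

# Hourly during the daytime: drop short-lived objects nobody else picked up.
hourly_clean() {
    "$MPPBCC" clean --mtime=-2h --ctime=+1h "$CACHE"
}
```

           In a cron job you may prefer to call "makeppbuiltin -MMpp::BuildCacheControl" by
           its full path instead of the wrapper script, as noted earlier.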

       show [option ...] /path/to/cache ...
           This is a sort of recursive "ls -l" or "stat" command, which shows the original owner
           too, for when the owner of the cached file has been changed and the metadata file
           retains the original owner (as per "clean --set-user").  It shows the given files, or
           all under the directories given.

           The fields are, in the short standard and the long verbose form:

           MODE, mode
               The octal mode of the cached file, which is usually as it got put in, minus the
               write bits.

           EL, ext-links
               The number of external hard links there are to all members of the group combined.
               Only when this is 0 is the file eligible for cleaning.

           C, copies (only for grouped build caches)
               The number of copies of the identical file, across all build caches.  Ideally this
               is one on systems which permit hard linking to symbolic links, but that may
               temporarily not be possible, while there are external links to more than one copy
               (in which case we'd lose the link count if we deleted it).

           S, symlinks (only for grouped build caches)
               The number of symbolic links between build caches.  Ideally this is the number of
               build caches minus one on systems which permit hard linking to symbolic links.
               But as explained for the previous field, there may be more copies than necessary,
               and thus fewer links.

           UID The owner of the cached file.  This may be changed with the "clean --user" option.

           BI-UID
               The owner of the build info file.  This is not changed by clean, allowing you to
               see who first built the file.

           SIZE
               The size (of one copy) in bytes.

           atime, mtime, ctime
               In the long verbose form you get the file access (read) time, the modification
               time and the inode change time (e.g. when some user deleted his external link to
               the cached file).  In the short standard form you get only one of the three times
               in three separate columns:

           AD, MD, CD
               The day of the week of the access, modification or inode change.

           ADATE, MDATE, CDATE
               The date of the access, modification or inode change.

           ATIME, MTIME, CTIME
               The time of day of the access, modification or inode change.

           MEMBER
               The full path of the cached file, including the key, from the cache root.

           With "-v, --verbose" the information shown for each command allows you to get an
           impression of which options to give to the "clean" command.  The times are shown in
           readable form, as well as the number of days, hours or minutes the age of this file
           has just exceeded.  If you double the option, you additionally get the info for each
           group member.

           Standard options: "-A, --args-file, --arguments-file=filename, -f, --force, -o,
           --output=filename, -O, --outfail, -v, --verbose"

           -a
           --atime
           --access-time
               Show the file access time, instead of file modification time in non-verbose mode.

           -b
           --blend
           --blend-groups
               Usually each /path/to/cache you specify will separately treat the group of build
               caches it belongs to.  Each group gets treated only once, even if you specify
               several paths from the same group.  With this option you temporarily blend all
               the groups you specify into one group.

           -c
           --ctime
           --change-time
               Show the inode info change time, instead of file modification time in non-verbose
               mode.

           -d
           --deletable
               Show only deletable files, i.e. those with an external link count of 0.

           -p pattern
           --pattern=pattern
               Pattern is a bash style file name pattern (i.e. ?, *, [], {,,}) matched against
               member names after the underscore separating them from the key.

           -s list
           --sort=list
               In non-verbose mode change the sorting order.  The list is a case-insensitive,
               comma- or space-separated list of column titles.  There are two special cases:
               "member" only considers the names after the key, i.e. the file names as they are
               outside of the cache.  And there is a special name "age", which groups whichever
               date and time is being shown.  This option defaults to "member,age".

               If you have a huge cache for which sorting takes intolerably long, or needs more
               memory than your processes are allowed, you can skip sorting by giving an empty
               list.

       stats [option ...] /path/to/cache ...
           This outputs several tables of statistics about the build cache contents.  Each table
           is split into three column groups.  The first column varies for each table and is the
           row heading.  The other two groups pertain to the sum of SIZE of files and the number of FILES
           for that heading.  Directories and build info files are not counted, so this is a
           little less for size than actual disk usage and about half for number of files.

           Each of the latter two groups consists of three column pairs, one column with a value,
           and one for the percentage of the total that value represents.  The first pair shows
           either the size of files or the number of files.  The other two pairs show the
           CUMULation, once from smallest to biggest and once the other way round.
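
           The three column pairs can be sketched as follows.  This is only one reading of the
           description above: the cumulation is taken once down the rows and once up the rows.
           The function names, the tuple layout and the rounding to one decimal are assumptions
           for illustration, not the actual output format of the stats command.

```python
def pct(value, total):
    """Percentage of the total, rounded to one decimal (rounding is assumed)."""
    return round(100 * value / total, 1) if total else 0.0

def column_pairs(rows):
    """rows: list of (heading, value) in table order.

    Returns one tuple per row:
    (heading, value, %, cumulation downwards, %, cumulation upwards, %)
    """
    total = sum(value for _, value in rows)
    result = []
    running = 0
    for heading, value in rows:
        running += value                      # this row plus all rows above it
        from_end = total - running + value    # this row plus all rows below it
        result.append((heading, value, pct(value, total),
                       running, pct(running, total),
                       from_end, pct(from_end, total)))
    return result

# Hypothetical sums of file sizes for rows 0, 1 and 2 of one table.
for line in column_pairs([(0, 500), (1, 300), (2, 200)]):
    print(line)
# (0, 500, 50.0, 500, 50.0, 1000, 100.0)
# (1, 300, 30.0, 800, 80.0, 500, 50.0)
# (2, 200, 20.0, 1000, 100.0, 200, 20.0)
```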

           The first three tables, with a first column of AD, CD or MD show access times, inode
           change times or modification times grouped by days.  Days are actually 24 hour blocks
           counting backwards from the start time of the stats command.  The row "0" of the first
           table will thus show the sum of sizes and the number of files accessed less than a day
           ago.  If no files were accessed then, there will be no row "0".  Row "1" in the third
           table will show the files modified (i.e. written to the build cache) between 24 and 48
           hours ago.
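
           The grouping into 24 hour blocks can be sketched like this.  The function name and
           the sample data are hypothetical; the sketch only mirrors the row numbering described
           above and does not read a real cache.

```python
from collections import defaultdict

DAY = 24 * 3600  # rows are 24 hour blocks, not calendar days

def age_table(start_time, files):
    """files: iterable of (timestamp, size) pairs, times in epoch seconds.

    Returns {row: (sum_of_sizes, number_of_files)}; rows without files
    are simply absent.
    """
    rows = defaultdict(lambda: [0, 0])
    for timestamp, size in files:
        row = int((start_time - timestamp) // DAY)
        rows[row][0] += size
        rows[row][1] += 1
    return {row: tuple(cells) for row, cells in sorted(rows.items())}

start = 1_700_000_000            # hypothetical start time of the stats run
files = [
    (start - 1 * 3600, 100),     # 1 hour ago   -> row 0
    (start - 30 * 3600, 40),     # 30 hours ago -> row 1
    (start - 47 * 3600, 60),     # 47 hours ago -> row 1
    (start - 80 * 3600, 10),     # 80 hours ago -> row 3 (there is no row 2)
]
print(age_table(start, files))   # {0: (100, 1), 1: (100, 2), 3: (10, 1)}
```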

           The next table, EL, shows external links, i.e. how many build trees share a file from
           the build cache.  This is a measure of usefulness of the build cache.  Alas it only
           works when developers have a build cache on their own disk; otherwise they have to
           copy, which leaves no global trail.  The more content has high external link counts,
           the bigger the benefit of the build cache.

           The next table, again EL, shows the same information as the previous one, but weighted
           by the number of external links.  Each byte or file with an external link count of one
           counts as one.  But if the count is ten, the values are counted ten times.  That's why
           the headings change to *SIZE and *FILES.  This is a hypothetical value, showing how
           much disk usage or how many files there would be if the same build trees had all used
           no build cache.
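
           The difference between the two EL tables can be sketched as follows.  The function
           name and the sample data are hypothetical.  Note that with plain multiplication files
           with an external link count of 0 contribute nothing to the weighted table; how the
           stats command itself treats them is not stated here.

```python
from collections import defaultdict

def el_tables(files):
    """files: iterable of (ext_links, size) pairs.

    Returns (plain, weighted); each maps EL -> (size_sum, file_count).
    In the weighted table every value is multiplied by the link count.
    """
    plain = defaultdict(lambda: [0, 0])
    weighted = defaultdict(lambda: [0, 0])
    for el, size in files:
        plain[el][0] += size
        plain[el][1] += 1
        weighted[el][0] += size * el   # *SIZE
        weighted[el][1] += el          # *FILES
    as_tuples = lambda table: {el: tuple(cells) for el, cells in sorted(table.items())}
    return as_tuples(plain), as_tuples(weighted)

plain, weighted = el_tables([(1, 1000), (1, 500), (10, 200)])
print(plain)     # {1: (1500, 2), 10: (200, 1)}
print(weighted)  # {1: (1500, 2), 10: (2000, 10)}
```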

           One more table, C:S copies to symlinks, pertains to grouped caches only.  Ideally all
           members exist in one copy, with one fewer symlink than there are caches in the group.
           Symlinks remain "0" until cleaning has replicated.  There may be more than one copy,
           if either several people created the identical file before it was replicated, or if
           replication migrated the file to a preferred disk, but the original file was still in
           use.  Superfluous copies become symlinks when cleaning finds they have no more
           external links.

           Standard options: "-A, --args-file, --arguments-file=filename, -v, --verbose"

           -h
           --hours
               Display the first three tables in much finer granularity.  The column headings
               change to AH, CH or MH accordingly.

           -p pattern
           --pattern=pattern
               Pattern is a bash style file name pattern (i.e. ?, *, [], {,,}) matched against
               member names after the underscore separating them from the key.  All statistics
               are limited to matching files.

   Caveats working with build caches
       Build caches will not work well under the following circumstances:

       •   If the command that makepp runs to build a file actually only updates the file and
           does not build it fresh, then you should NOT use a build cache.  (An example is a
           command to update a module in a static library (an archive file, or a file with an
           extension of .a).  As explained in makepp_cookbook, on modern machines it is almost
           always a bad idea to update an archive file--it's better to rebuild it from scratch
           each time for a variety of reasons.  This is yet another reason not to update an
           archive file.)  The reason is that if the build cache happens to be located on the
           same file system, makepp makes a hard link rather than copying the file.  If you then
           subsequently modify the file, the file that makepp has in the build cache will
           actually be modified, and you could potentially screw up someone else's compilation.
           In practice, makepp can usually detect that a file has been modified since it was
           placed in the build cache and it won't use it, but sometimes it may not actually
           detect the modification.

       •   For .o files this can be slightly wrong, because they may (depending on the compiler
           and debug level) contain the path to the source they were built from.  This can make
           debugging hard.  The debugger may make you edit the original creator's copy of the
           source, or may not even find the file, if the creator no longer has a copy.  Makepp
           may someday offer an option to patch the path, which will of course mean a copy,
           instead of an efficient link.

       •   Any other file which has a path encoded into it should not be put into a build cache
           (if you share your build cache among several directory hierarchies or several
           developers).  In this case, the result of a build in a different directory is not the
           same as if it were in the same directory, so the whole concept of the build cache is
           not applicable.  It's ok if you specify the directory path on the command line, like
           this:

               &echo prog_path=$(PWD) -o $(output)

           because then the command line will be different and makepp won't incorrectly pull the
           file out of the build cache.  But if the command line is not different, then there
           could be a problem.  For example,

               echo prog_path=`pwd` > $(output)

           will not work properly.

       •   When using links and with many active developers of the same project on the same disk,
           build caches can save a lot of disk space.  But at the same time for individual users
           the opposite can also be true:

           Imagine Chang is the first to do a full build.  Along comes Ching and gets a link to
           all those files.  Chang does some fundamental changes leading to most things being
           rebuilt.  He checks them in, Chong checks them out and gets links to the build cache.
           Chang again does changes, leading to a third set of files.

           In this scenario, no matter what cleaning strategy you use, no files will get deleted,
           because they are all still in use.  The problem is that they all belong to Chang,
           which can make him reach his disk quota, and there is nothing he can do about it on
           most systems.  See the "clean --set-user" command under "How to manage a build cache"
           for how the system administrator could change the files to a quota-less cache owner.

       •   If you are using timestamp/size signatures to cross check the target and its build
           info (the default), then it is possible to get a signature alias, wherein non-
           corresponding files will not be detected.  For example, the MD5_SUM build info value
           may not match the MD5 checksum of the target.  This is not usually a problem, because
           by virtue of the fact that the build cache keys match, the target in the build cache
           is substitutable for the target that would have corresponded to the build info file.
           However, if you have rule actions that depend on build info, then this could get you
           into trouble (so don't do that).  If this worries you, then use the --md5-check-bc
           option.
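
           The first caveat above (commands that update a file instead of building it fresh) can
           be reproduced with ordinary hard links, independently of makepp.  The following
           sketch uses hypothetical file names in a temporary directory.  It only demonstrates
           the file system behaviour and does not involve a real build cache.

```python
import os
import tempfile

def cached_content_after(update_in_place):
    """Link a stand-in "cache entry" to a "target", change the target,
    and return what the cache entry contains afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        cached = os.path.join(tmp, "cache_entry")
        target = os.path.join(tmp, "libfoo.a")
        with open(cached, "w") as f:
            f.write("v1")
        os.link(cached, target)            # import by hard link
        if update_in_place:
            with open(target, "a") as f:   # e.g. an archive update
                f.write("+patch")
        else:
            os.unlink(target)              # build fresh: drop the link first
            with open(target, "w") as f:
                f.write("v2")
        with open(cached) as f:
            return f.read()

print(cached_content_after(update_in_place=True))   # v1+patch
print(cached_content_after(update_in_place=False))  # v1
```

           In the first case the change shows up in the cache entry as well, because both names
           refer to the same inode.  In the second case the cache entry is left untouched.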

   Concurrent access
       Build caches need to support concurrent access, which implies that the implementation must
       be tolerant of races.  In particular, a file might get aged (deleted) between the time
       makepp decides to import a target and the time the import completes.

       Furthermore, some people use build caches over NFS, which is not necessarily coherent.  In
       other words, the order of file creation and deletion by the writer on one host will not
       necessarily match the order seen by a reader on another host, and therefore races cannot
       be resolved by paying particular attention to the order of file operations.  (But there is
       usually an NFS cache timeout of about 1 minute which guarantees that writes will take no
       longer than that amount of time to propagate to all readers.  Furthermore, typically in
       practice at least 99% of writes are visible everywhere within 1 second.)  Because of this,
       we must tolerate the case in which the cached target and its build info file appear not to
       correspond.  Furthermore, there is a peculiar race that can occur when a file is
       simultaneously aged and replaced, in which the files don't correspond even after the NFS
       cache flushes.  This appears to be unavoidable.
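
       The first race mentioned above, a file being aged between the decision to import it and
       the import itself, can be tolerated by treating a vanished cache entry as a plain cache
       miss.  The following is a minimal sketch of that idea, not makepp's implementation; the
       function names and file names are hypothetical, and the check that target and build
       info correspond is left out.

```python
import os
import shutil
import tempfile

def import_or_build(cached, target, build):
    """Try to take target from the cache; rebuild if the entry has vanished."""
    try:
        shutil.copyfile(cached, target)
        return "imported"
    except FileNotFoundError:
        # The entry was aged between lookup and import: treat it as a miss.
        build(target)
        return "built"

def rebuild(path):
    """Stand-in for running the real build command."""
    with open(path, "w") as f:
        f.write("rebuilt")

with tempfile.TemporaryDirectory() as tmp:
    cached = os.path.join(tmp, "entry")
    target = os.path.join(tmp, "out.o")
    with open(cached, "w") as f:
        f.write("cached")
    print(import_or_build(cached, target, rebuild))  # imported
    os.remove(cached)                                # simulate aging by a cleaner
    print(import_or_build(cached, target, rebuild))  # built
```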