Provided by: agedu_20211129.8cd63c5-1_amd64

NAME

       agedu - correlate disk usage with last-access times to identify large and disused data

SYNOPSIS

       agedu [ options ] action [action...]

DESCRIPTION

       agedu scans a directory tree and produces reports about how much disk space is used in
       each directory and subdirectory, and how much of that space is occupied by files whose
       last-access times lie a long time in the past.

       In  other words, agedu is a tool you might use to help you free up disk space. It lets you
       see which directories are taking up the most space, as du does; but  unlike  du,  it  also
       distinguishes between large collections of data which are still in use and ones which have
       not been accessed in months or years - for instance, large archives downloaded,  unpacked,
       used  once,  and  never  cleaned up. Where du helps you find what's using your disk space,
       agedu helps you find what's wasting your disk space.

       agedu has several operating modes. In one mode, it scans your disk  and  builds  an  index
       file  containing  a data structure which allows it to efficiently retrieve any information
       it might need. Typically, you would use it in this mode first, and then run it in one of a
       number  of  `query'  modes  to  display  a  report of the disk space usage of a particular
       directory and its subdirectories. Those reports can be produced as plain text  (much  like
       du)  or as HTML. agedu can even run as a miniature web server, presenting each directory's
       HTML report with hyperlinks to let you navigate around the file system to similar  reports
       for other directories.

       So  you  would  typically start using agedu by telling it to do a scan of a directory tree
       and build an index. This is done with a command such as

       $ agedu -s /home/fred

       which will build a large data file called agedu.dat in your current  directory.  (If  that
       current  directory  is  inside /home/fred, don't worry - agedu is smart enough to discount
       its own index file.)

       Having built the index, you would now query it for reports of disk  space  usage.  If  you
       have a graphical web browser, the simplest and nicest way to query the index is by running
       agedu in web server mode:

       $ agedu -w

       which will print (among other messages) a URL on its standard output along the lines of

       URL: http://127.0.0.1:48638/

       (That URL will always begin with `127.', meaning that it's in the localhost address space.
       So only processes running on the same computer can even try to connect to that web server,
       and also there is access control to prevent other users from seeing it  -  see  below  for
       more detail.)

       Now paste that URL into your web browser, and you will be shown a graphical representation
       of the disk usage in /home/fred and its immediate  subdirectories,  with  varying  colours
       used  to  show  the  difference  between  disused and recently-accessed data. Click on any
       subdirectory to descend into it and see a report for its subdirectories in turn; click  on
       parts  of  the pathname at the top of any page to return to higher-level directories. When
       you've finished browsing, you can just press Ctrl-D to send an end-of-file  indication  to
       agedu, and it will shut down.

       After  that, you probably want to delete the data file agedu.dat, since it's pretty large.
       In fact, the command agedu -R will do this for you; and you can chain  agedu  commands  on
       the same command line, so that instead of the above you could have done

       $ agedu -s /home/fred -w -R

       for a single self-contained run of agedu which builds its index, serves web pages from it,
       and cleans it up when finished.

       In some situations, you might want to scan the directory structure of  one  computer,  but
       run  agedu's user interface on another. In that case, you can do your scan using the agedu
       -S option in place of agedu -s, which will make agedu not bother building  an  index  file
       but instead just write out its scan results in plain text on standard output; then you can
       funnel that output to the other  machine  using  SSH  (or  whatever  other  technique  you
       prefer),  and  there,  run  agedu -L to load in the textual dump and turn it into an index
       file. For example, you might run a command like this (plus any ssh options  you  need)  on
       the machine you want to scan:

       $ agedu -S /home/fred | ssh indexing-machine agedu -L

       or, equivalently, run something like this on the other machine:

       $ ssh machine-to-scan agedu -S /home/fred | agedu -L

       Either  way,  the agedu -L command will create an agedu.dat index file, which you can then
       use with agedu -w just as above.

       (Another way to do this might be to build the index file on the first machine  as  normal,
       and  then  just  copy it to the other machine once it's complete. However, for efficiency,
       the index file is formatted differently depending on the CPU architecture  that  agedu  is
       compiled  for. So if that doesn't match between the two machines - e.g. if one is a 32-bit
       machine and one 64-bit - then agedu.dat files written on one machine will not work on  the
       other.  The  technique  described  above  using  -S  and  -L  should  work between any two
       machines.)

       If you don't have a graphical web browser, you can do text-based queries instead of  using
       agedu's  web  interface. Having scanned /home/fred in any of the ways suggested above, you
       might run

       $ agedu -t /home/fred

       which  again  gives  a  summary  of  the  disk  usage  in  /home/fred  and  its  immediate
       subdirectories;  but  this  time  agedu will print it on standard output, in much the same
       format as du. If you then want to find out how much old data is there, you can add the  -a
       option to show only files last accessed a certain length of time ago. For example, to show
       only files which haven't been looked at in six months or more:

       $ agedu -t /home/fred -a 6m

       That's the essence of what agedu does. It has other modes of operation  for  more  complex
       situations,  and the usual array of configurable options. The following sections contain a
       complete reference for all its functionality.

OPERATING MODES

       This section describes the operating modes supported by agedu. Each of  these  is  in  the
       form of a command-line option, sometimes with an argument. Multiple operating-mode options
       may appear on the command line, in which case agedu will perform the specified actions one
       after another. For instance, as shown in the previous section, you might want to perform a
       disk scan and immediately launch a web server giving reports from that scan.

       -s directory or --scan directory
              In this mode, agedu scans the file system starting at the specified directory,  and
              indexes  the results of the scan into a large data file which other operating modes
              can query.

              By default, the scan is restricted to a single file system (since agedu is typically
              used because one particular disk partition is running low on space). You can remove
              that restriction using the --cross-fs option;
              other  configuration  options  allow  you  to  include  or  exclude files or entire
              subdirectories from the scan.  See  the  next  section  for  full  details  of  the
              configurable options.

              The index file is created with restrictive permissions, in case the file system you
              are scanning contains confidential information in its structure.

              Index files are dependent on  the  characteristics  of  the  CPU  architecture  you
              created  them  on.  You  should not expect to be able to move an index file between
              different types of computer and have it continue to work. If you need  to  transfer
              the  results  of  a  disk  scan  to a different kind of computer, see the -D and -L
              options below.

       -w or --web
              In this mode, agedu expects to find an index file already written. It  allocates  a
              network  port,  and  starts  up  a  web  server  on  that port which serves reports
              generated from the index file. By default it invents its own URL and prints it out.

              The web server runs until agedu receives an end-of-file event on its standard
              input. (The expected usage is that you run it from the command line, browse the
              web pages until you're satisfied, and then press Ctrl-D.) To disable the EOF
              behaviour, use the --no-eof option.

              In  case  the  index  file  contains  any  confidential information about your file
              system, the web server protects the pages it serves from access by other people. On
              Linux,  this  is  done  transparently  by means of using /proc/net/tcp to check the
              owner of each incoming connection; failing that, the  web  server  will  require  a
              password  to  view  the  reports,  and agedu will print the password it invented on
              standard output along with the URL.

              Configurable options for this mode let you specify your own address and port number
              to  listen on, and also specify your own choice of authentication method (including
              turning authentication off completely) and a username and password of your choice.

       -t directory or --text directory
              In this mode, agedu generates a textual report on standard output, listing the disk
              usage  in the specified directory and all its subdirectories down to a given depth.
              By default that depth is 1, so that you see a report for directory itself  and  all
              of  its  immediate subdirectories. You can configure a different depth (or no depth
              limit) using -d, described in the next section.

              Used on its own, -t merely lists the total disk usage in each subdirectory; agedu's
              additional  ability to distinguish unused from recently-used data is not activated.
              To activate it, use the -a option to specify a minimum age.

              The directory structure stored in agedu's index file is treated as a set of literal
              strings. This means that you cannot refer to directories by synonyms. So if you ran
              agedu -s ., then all the path names you later pass to the -t option must be  either
              `.'  or begin with `./'. Similarly, symbolic links within the directory you scanned
              will not be followed; you must refer to each directory by its  canonical,  symlink-
              free pathname.

       -R or --remove
              In  this  mode,  agedu  deletes its index file. Running just agedu -R on its own is
              therefore equivalent to typing rm agedu.dat. However, you can also put  -R  on  the
              end  of a command line to indicate that agedu should delete its index file after it
              finishes performing other operations.

       -S directory or --scan-dump directory
              In this mode, agedu will scan a directory tree and  convert  the  results  straight
              into  a  textual  dump on standard output, without generating an index file at all.
              The dump data is intended for agedu -L to read.

       -L or --load
              In this mode, agedu expects to read a dump produced  by  the  -S  option  from  its
              standard  input.  It  constructs  an index file from that dump, exactly as it would
              have if it had read the same data from a disk scan in -s mode.

       -D or --dump
              In this mode, agedu reads an existing  index  file  and  produces  a  dump  of  its
              contents  on  standard  output,  in  the same format used by -S and -L. This option
              could be used to convert an existing index file  into  a  format  acceptable  to  a
              different  kind  of computer, by dumping it using -D and then loading the dump back
              in on the other machine using -L.

              (The output of agedu -D on an existing index file will not be exactly identical  to
              what  agedu  -S would have originally produced, due to a difference in treatment of
              last-access times on directories. However, it should be effectively equivalent  for
              most  purposes. See the documentation of the --dir-atime option in the next section
              for further detail.)

       -H directory or --html directory
              In this mode, agedu will generate an HTML report of the disk usage in the specified
              directory  and  its  immediate subdirectories, in the same form that it serves from
              its web server in -w mode.

              By default, a single HTML report will be generated and simply written  to  standard
              output, with no hyperlinks pointing to other similar pages. If you also specify the
              -d option (see below), agedu will instead write out a collection of HTML files with
              hyperlinks between them, and call the top-level file index.html.

       --cgi  In  this  mode,  agedu will run as the bulk of a CGI script which provides the same
              set of web pages as the built-in web server would.  It  will  read  the  usual  CGI
              environment variables, and write CGI-style data to its standard output.

              The actual CGI program itself should be a tiny wrapper around agedu which passes it
              the --cgi option, and also (probably) -f to locate the index file.  agedu  will  do
              everything else. For example, your script might read

              #!/bin/sh
              /some/path/to/agedu --cgi -f /some/other/path/to/agedu.dat

              (Note  that  agedu  will produce the entire CGI output, including status code, HTTP
              headers and the full HTML document. If you try to surround the call to agedu  --cgi
              with  code that adds your own HTML header and footer, you won't get the results you
              want, and  agedu's  HTTP-level  features  such  as  auto-redirecting  to  canonical
              versions of URIs will stop working.)

              No  access  control is performed in this mode: restricting access to CGI scripts is
              assumed to be the job of the web server.

       --presort and --postsort
              In these two modes, agedu will expect to read a textual data dump from its standard
              input  of  the  form  produced  by  -S  (and -D). It will transform the data into a
              different version of its text dump format, and write  the  transformed  version  on
              standard output.

              The  ordinary dump file format is reasonably readable, but loading it into an index
              file using agedu -L requires it  to  be  sorted  in  a  specific  order,  which  is
              complicated  to  describe  and  difficult  to implement using ordinary Unix sorting
              tools. So if you want to construct your own data dump from a  source  of  your  own
              that  agedu itself doesn't know how to scan, you will need to make sure it's sorted
              in the right order.

              To help with this, agedu provides a secondary dump format which is `sortable', in
              the sense that ordinary sort(1) without arguments will arrange it into the right
              order. However, the sortable format is much less readable and also twice the
              size, so you wouldn't want to write it directly!

              So  the recommended procedure is to generate dump data in the ordinary format; then
              pipe it through agedu --presort to turn it into the sortable format; then sort  it;
              then  pipe  it  into  agedu  -L (which can accept either the normal or the sortable
              format as input). For example:

              generate_custom_data.sh | agedu --presort | sort | agedu -L

              If you need to transform the sorted dump file back into the ordinary format,  agedu
              --postsort  can  do that. But since agedu -L can accept either format as input, you
              may not need to.

       -h or --help
              Causes agedu to print some help text and terminate immediately.

       -V or --version
              Causes agedu to print its version number and terminate immediately.

OPTIONS

       This section describes the various configuration options that affect agedu's operation  in
       one mode or another.

       The following option affects nearly all modes (except -S):

       -f filename or --file filename
              Specifies  the  location  of  the  index file which agedu creates, reads or removes
              depending on its operating  mode.  By  default,  this  is  simply  `agedu.dat',  in
              whatever is the current working directory when you run agedu.
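
              As a minimal sketch of how -f combines with the operating modes, the following
              keeps the index in a temporary directory rather than the current one, produces a
              text report, and removes the index afterwards. The TARGET variable and the
              temporary paths are assumptions made for this example; only options documented on
              this page are used, and the block does nothing if agedu is not installed.

```shell
# Sketch: scan, report, and clean up using an index kept outside the current directory.
# TARGET is a hypothetical variable for this example; it defaults to the current directory.
target=${TARGET:-.}
workdir=$(mktemp -d)
index=$workdir/agedu.dat

if command -v agedu >/dev/null 2>&1; then
    agedu -f "$index" -s "$target"          # build the index at the chosen location
    agedu -f "$index" -t "$target" -a 6m    # text report: data not accessed for six months or more
    agedu -f "$index" -R                    # delete the index again
else
    echo "agedu is not installed; nothing was scanned" >&2
fi
rmdir "$workdir"
```

              The same path is passed to -s and -t because the index treats path names as
              literal strings.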

       The following options affect the disk-scanning modes, -s and -S:

       --cross-fs and --no-cross-fs
              These  configure  whether  or  not  the  disk  scan  is  permitted to cross between
              different file systems. The default is  not  to:  agedu  will  normally  skip  over
              subdirectories on which a different file system is mounted. This is convenient
              when you want to free up space on a particular file system which is running low.
              However, in other circumstances you might wish to see general
              information about the use of space  no  matter  which  file  system  it's  on  (for
              instance,  if  your  real concern is your backup media running out of space, and if
              your backups do not treat different file systems specially); in that situation, use
              --cross-fs.

              (Note  that this default is the opposite way round from the corresponding option in
              du.)

       --prune wildcard and --prune-path wildcard
              These cause particular files or directories to be omitted entirely from  the  scan.
              If  agedu's  scan  encounters  a  file or directory whose name matches the wildcard
              provided to the --prune option, it will not include that file  in  its  index,  and
              also if it's a directory it will skip over it and not scan its contents.

              Note  that  in  most Unix shells, wildcards will probably need to be escaped on the
              command line, to prevent the shell from expanding the wildcard  before  agedu  sees
              it.

              --prune-path is similar to --prune, except that the wildcard is matched against the
              entire pathname instead of just the filename at the end of it. So  whereas  --prune
              *a*b*  will  match  any  file whose actual name contains an a somewhere before a b,
              --prune-path *a*b* will also match a file whose name contains b and which is inside
              a  directory  containing  an a, or any file inside a directory of that form, and so
              on.
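
              The need for quoting can be illustrated without running agedu at all. In this
              sketch, count_args is a hypothetical stand-in that merely reports how many
              arguments it received, and the file names are invented for the demonstration.

```shell
# Print the number of arguments received, standing in for agedu's view of its command line.
count_args() { echo "$#"; }

demo=$(mktemp -d)
touch "$demo/a.mp3" "$demo/b.mp3" "$demo/c.txt"

# Unquoted: the shell replaces *.mp3 with the matching file names before the command runs.
unquoted=$(cd "$demo" && count_args --prune *.mp3)
# Quoted: the wildcard reaches the command as a single literal argument.
quoted=$(cd "$demo" && count_args --prune '*.mp3')

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
rm -r "$demo"
```

              With two matching files present, the unquoted form passes three arguments and the
              quoted form passes two, so only the quoted form hands the wildcard itself to the
              program.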

       --exclude wildcard and --exclude-path wildcard
              These cause particular files or directories to be omitted from the index,  but  not
              from  the  scan.  If agedu's scan encounters a file or directory whose name matches
              the wildcard provided to the --exclude option, it will not include that file in its
              index  -  but  unlike --prune, if the file in question is a directory it will still
              scan its contents and index them if they are not ruled out themselves by  --exclude
              options.

              As  above,  --exclude-path  is  similar  to  --exclude, except that the wildcard is
              matched against the entire pathname.
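
              The difference between matching a name and matching a whole pathname can be
              sketched with the shell's own pattern matching. This is only an approximation:
              it assumes that `*' may match across `/' characters, as the *a*b* description
              above implies, and the helper names and sample paths are invented for the
              example. agedu's actual wildcard rules may differ in detail.

```shell
# matches_name PATH PATTERN: compare PATTERN with the last path component only.
matches_name() {
    case ${1##*/} in
        $2) return 0 ;;
    esac
    return 1
}

# matches_path PATH PATTERN: compare PATTERN with the whole pathname.
matches_path() {
    case $1 in
        $2) return 0 ;;
    esac
    return 1
}

# report PATH: show how the pattern *a*b* fares under each kind of matching.
report() {
    name=no; full=no
    matches_name "$1" '*a*b*' && name=yes
    matches_path "$1" '*a*b*' && full=yes
    echo "$1 name-match=$name path-match=$full"
}

report ./music/cab.txt
report ./data/lib.so
report ./misc/notes
```

              Here ./data/lib.so matches only as a path, because the `a' comes from the
              directory name and the `b' from the file name.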

       --include wildcard and --include-path wildcard
              These cause particular files or directories to be re-included in the index and  the
              scan,  if  they  had previously been ruled out by one of the above exclude or prune
              options. You can interleave include, exclude and prune options as you wish  on  the
              command  line,  and  if  more  than one of them applies to a file then the last one
              takes priority.

              For example, if you wanted to see only the disk space taken up by  MP3  files,  you
              might run

              $ agedu -s . --exclude '*' --include '*.mp3'

              which will cause everything to be omitted from the index, but then the MP3 files to
              be put back in. If you then wanted only a subset of those MP3s, you could
              exclude some of them again by adding, say, `--exclude-path './queen/*'' (or, more
              efficiently, `--prune ./queen') on the end of that command.

              As with the previous two options, --include-path is  similar  to  --include  except
              that the wildcard is matched against the entire pathname.
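
              The rule that the last applicable option takes priority can be modelled in a few
              lines of shell. This is a sketch of the rule as stated above, not agedu's own
              code: the decide function and its "action:pattern" rule syntax are invented for
              the example, it matches against names only, and it ignores the extra effect of
              --prune on directory traversal.

```shell
# decide NAME RULE...: each RULE is "include:PATTERN" or "exclude:PATTERN",
# given in command-line order. Prints the verdict for NAME.
decide() {
    name=$1; shift
    verdict=include                 # nothing is ruled out until some rule matches
    for rule in "$@"; do
        action=${rule%%:*}          # text before the first colon
        pattern=${rule#*:}          # text after the first colon
        case $name in
            $pattern) verdict=$action ;;   # a later match overrides an earlier one
        esac
    done
    echo "$verdict"
}

decide song.mp3  'exclude:*' 'include:*.mp3'
decide notes.txt 'exclude:*' 'include:*.mp3'
decide song.mp3  'exclude:*' 'include:*.mp3' 'exclude:s*'
```

              The three calls print include, exclude and exclude respectively: the MP3 file is
              put back by the later --include, and ruled out again when a further --exclude
              follows it.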

       --progress, --no-progress and --tty-progress
              When  agedu  is  scanning  a  directory  tree,  it  will typically print a one-line
              progress report every second showing where it has reached in the scan, so  you  can
              have  some  idea  of  how  much  longer  it will take. (Of course, it can't predict
              exactly how long it will take, since it doesn't know which of  the  directories  it
              hasn't scanned yet will turn out to be huge.)

              By default, those progress reports are displayed on agedu's standard error channel,
              if that channel points to a terminal device. If you  need  to  manually  enable  or
              disable   them,  you  can  use  the  above  three  options  to  do  so:  --progress
              unconditionally  enables  the  progress  reports,   --no-progress   unconditionally
              disables  them,  and  --tty-progress  reverts  to  the  default  behaviour which is
              conditional on standard error being a terminal.

       --dir-atime and --no-dir-atime
              In  normal  operation,  agedu  ignores  the  atimes  (last  access  times)  on  the
              directories  it  scans:  it  only  pays attention to the atimes of the files inside
              those directories. This is because directory atimes tend to be reset by  a  lot  of
              system  administrative  tasks, such as cron jobs which scan the file system for one
              reason or another - or even other invocations of agedu itself, though it  tries  to
              avoid  modifying  any  atimes if possible. So the literal atimes on directories are
              typically not representative of how long ago the data in question was last accessed
              with real intent to use that data in particular.

              Instead, agedu makes up a fake atime for every directory it scans, which is equal
              to the newest atime of any file in or below that directory (or the directory's last
              modification time, whichever is newer). This is based on the assumption that all
              important accesses to directories are actually accesses to the files  inside  those
              directories,  so  that  when  any  file is accessed all the directories on the path
              leading to it should be considered to have been accessed as well.

              In unusual cases it is possible that a directory itself might embody important data
              which is accessed by reading the directory. In that situation, agedu's atime-faking
              policy will misreport the directory as disused. In the  unlikely  event  that  such
              directories  form  a  significant  part of your disk space usage, you might want to
              turn off the faking. The --dir-atime option does this: it causes the disk  scan  to
              read the original atimes of the directories it scans.

              The  faking of atimes on directories also requires a processing pass over the index
              file after the main disk scan is complete. --dir-atime also turns  this  pass  off.
              Hence, this option affects the -L option as well as -s and -S.

              (The  previous section mentioned that there might be subtle differences between the
              output of agedu -s /path -D and agedu -S /path. This is why. Doing a scan  with  -s
              and  then  dumping  it with -D will dump the fully faked atimes on the directories,
              whereas doing a scan-to-dump with -S  will  dump  only  partially  faked  atimes  -
              specifically,  each  directory's  last  modification  time  -  since the subsequent
              processing pass will not have had a chance to take place. However,  loading  either
              of  the resulting dump files with -L will perform the atime-faking processing pass,
              leading to the same data in the index file in each case. In normal usage it  should
              be safe to ignore all of this complexity.)
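
              The faked directory atime described above can be approximated outside agedu as
              follows. This sketch assumes GNU stat (with -c %X for atime and -c %Y for mtime)
              and reads `any file' narrowly as regular files, so the modification times of
              subdirectories are not considered; agedu's own processing pass may differ. The
              function name and the demonstration tree are invented for the example.

```shell
# fake_dir_atime DIR: print the newer of DIR's own mtime and the newest atime of any
# regular file at or below DIR, in seconds since the epoch.
# Assumes GNU stat (-c %X is atime, -c %Y is mtime).
fake_dir_atime() {
    {
        stat -c %Y "$1"                            # the directory's own mtime
        find "$1" -type f -exec stat -c %X {} +    # atime of every file below it
    } | sort -n | tail -n 1
}

demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/sub/old" "$demo/sub/new"
touch -a -t 200101010000 "$demo/sub/old"   # last accessed in 2001
touch -a -t 200201010000 "$demo/sub/new"   # last accessed in 2002
touch -m -t 200001010000 "$demo"           # directory itself last modified in 2000

faked=$(fake_dir_atime "$demo")
newest=$(stat -c %X "$demo/sub/new")
echo "faked atime: $faked, newest file atime: $newest"
rm -r "$demo"
```

              In this tree the most recently accessed file is newer than the directory's own
              modification time, so the two printed values agree.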

       --mtime
              This  option causes agedu to index files by their last modification time instead of
              their last access time. You might want to use this if your last access  times  were
              completely useless for some reason: for example, if you had recently searched every
              file on your system, the system would have lost  all  the  information  about  what
              files  you  hadn't recently accessed before then. Using this option is liable to be
              less effective at finding genuinely wasted space than the normal mode (that is,  it
              will  be  more  likely to flag things as disused when they're not, so you will have
              more candidates to go through by hand looking for data you don't need), but may  be
              better than nothing if your last-access times are unhelpful.

              Another  use  for  this  mode might be to find recently created large data. If your
              disk has been gradually filling up for years, the default mode of  agedu  will  let
              you  find  unused  data  to  delete;  but if you know your disk had plenty of space
              recently and now it's suddenly full, and you suspect that some  rogue  program  has
              left a large core dump or output file, then agedu --mtime might be a convenient way
              to locate the culprit.

       --logicalsize
              This option causes agedu to consider the size of each  file  to  be  its  `logical'
              size,  rather  than  the amount of space it consumes on disk. (That is, it will use
              st_size instead of st_blocks in the data returned from stat(2).) This option  makes
              agedu  less  accurate  at  reporting how much of your disk is used, but it might be
              useful in  specialist  cases,  such  as  working  around  a  file  system  that  is
              misreporting physical sizes.

              For  most  files, the physical size of a file will be larger than the logical size,
              reflecting the fact that filesystem layouts generally allocate a  whole  number  of
              blocks  of  the  disk  to each file, so some space is wasted at the end of the last
              block. So counting only the logical file size will typically cause  under-reporting
              of the disk usage (perhaps large under-reporting in the case of a very large number
              of very small files).

              On the other hand, sometimes a file with a very large logical size can have `holes'
              where  no data is actually stored, in which case using the logical size of the file
              will over-report its disk usage. So the use of logical sizes can give wrong answers
              in both directions.
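
              The gap between logical and physical size can be observed with standard tools.
              The sketch below creates a file that is nothing but a hole and compares the two
              figures; the file name is invented for the example, and the physical figure
              depends on the file system, so no particular value is claimed for it.

```shell
demo=$(mktemp -d)
sparse=$demo/sparse.bin

# Create a 1 MiB file consisting entirely of a hole: seek past the end, write nothing.
dd if=/dev/zero of="$sparse" bs=1 count=0 seek=1048576 2>/dev/null

logical=$(( $(wc -c < "$sparse") ))                   # bytes, the file's logical size
physical_kib=$(du -k "$sparse" | awk '{print $1}')    # KiB actually allocated on disk

echo "logical: $logical bytes, physical: $physical_kib KiB"
rm -r "$demo"
```

              On file systems that support holes, the physical figure is typically far smaller
              than the logical one, which is the over-reporting case described above.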

       The  following option affects all the modes that generate reports: the web server mode -w,
       the stand-alone HTML generation mode -H and the text report mode -t.

       --files
              This option causes agedu's reports to list the individual files in each  directory,
              instead  of  just  giving  a  combined  report  for  everything  that's  not  in  a
              subdirectory.

       The following option affects the text report mode -t.

       -a age or --age age
              This option tells agedu to report only files of at least the specified age. An  age
              is specified as a number, followed by one of `y' (years), `m' (months), `w' (weeks)
              or `d' (days). (This syntax is also used by the -r option.) For example, -a 6m will
              produce a text report which includes only files at least six months old.

       The  following  options affect the stand-alone HTML generation mode -H and the text report
       mode -t.

       -d depth or --depth depth
              This option controls the maximum depth to which agedu recurses  when  generating  a
              text or HTML report.

              In  text mode, the default is 1, meaning that the report will include the directory
              given on the command line and all of its immediate subdirectories. A depth  of  two
              includes  another  level  below  that,  and  so  on; a depth of zero means only the
              directory on the command line.

              In HTML mode, specifying this option switches agedu from writing out a single  HTML
              file  to  writing  out  multiple files which link to each other. A depth of 1 means
              agedu will write out an HTML file for the given directory and also one for each  of
              its immediate subdirectories.

              If  you want agedu to recurse as deeply as possible, give the special word `max' as
              an argument to -d.

       -o filename or --output filename
              This option is used to specify an output file for agedu to write its report to.  In
              text  mode or single-file HTML mode, the argument is treated as the name of a file.
              In multiple-file HTML mode, the argument is treated as the name of a directory: the
              directory  will  be created if it does not already exist, and the output HTML files
              will be created inside it.

       The following option affects only the stand-alone HTML generation mode -H, and even  then,
       only in recursive mode (with -d):

       --numeric
              This option tells agedu to name most of its output HTML files numerically. The root
              of the whole output file collection will still be called index.html,  but  all  the
              rest  will  have  names  like  73.html  or 12525.html. (The numbers are essentially
              arbitrary; in fact, they're indices of nodes in the data structure used by  agedu's
              index file.)

              This system of file naming is less intuitive than the default of naming files after
              the sub-pathname they index. It's also less stable:  the  same  pathname  will  not
              necessarily be represented by the same filename if agedu -H is re-run after another
              scan of the same directory tree. However, it does have the virtue that it keeps the
              filenames  short, so that even if your directory tree is very deep, the output HTML
              files won't exceed any OS limit on filename length.
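
              A rough way to judge whether the default naming could run into such a limit is to
              compare the longest sub-pathname in the tree with the filesystem's filename limit.
              The sketch below is only a heuristic: it assumes the default HTML filenames are
              about as long as the sub-pathnames they index, which this page does not spell out.
              longest_subpath and suggest_numeric are hypothetical helpers, not part of agedu.

```shell
# Hypothetical helper (not part of agedu): print the length of the longest
# directory pathname below $1, measured relative to $1 itself.
# Pathnames containing newlines are not handled.
longest_subpath() {
    (cd "$1" && find . -type d) |
        awk '{ n = length($0) - 2; if (n > max) max = n } END { print max + 0 }'
}

# Print "--numeric" if the longest sub-pathname under $1 exceeds the filename
# limit that getconf reports for directory $2 (where the HTML would go).
suggest_numeric() {
    limit=$(getconf NAME_MAX "$2")
    if [ "$(longest_subpath "$1")" -gt "$limit" ]; then
        echo "--numeric"
    fi
}
```

              The result could then be spliced into a command line, for example
              agedu -H /home/fred -d max -o report $(suggest_numeric /home/fred .)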

       The following options affect the web server mode -w, and in some  cases  also  the  stand-
       alone HTML generation mode -H:

       -r age-range or --age-range age-range
              The  HTML reports produced by agedu use a range of colours to indicate how long ago
              data was last accessed, running from red (representing the most  disused  data)  to
              green (representing the newest). By default, the lengths of time represented by the
              two ends of that spectrum are chosen by examining the data file to see  what  range
              of  ages appears in it. However, you might want to set your own limits, and you can
              do this using -r.

              The argument to -r consists of a single age, or two ages separated by a minus sign.
              An  age  is  a number, followed by one of `y' (years), `m' (months), `w' (weeks) or
              `d' (days). (This syntax is also used by the -a option.) The first age in the range
              represents  the  oldest  data, and will be coloured red in the HTML; the second age
              represents the newest, coloured green. If the second age is not specified, it  will
              default to zero (so that green means data which has been accessed just now).

              For example, -r 2y will mark data in red if it has been unused for two years or
              more, and green if it has been accessed just now. -r 2y-3m will similarly mark data
              red if it has been unused for two years or more, but will mark it green if it was
              last accessed within the past three months.
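
              The syntax can be illustrated with a small shell sketch that splits a range into
              its two ages and converts each to a number of days. The helper names are
              hypothetical, whole-number ages are assumed, and the conversion factors (365 days
              per year, 30 per month) are illustrative guesses; agedu's own conversions may
              differ.

```shell
# Hypothetical helper: convert one age such as "2y" or "3m" to days.
# Assumes whole numbers and approximate unit lengths (illustration only).
age_to_days() {
    num=${1%[ymwd]}
    unit=${1#"$num"}
    case $num in
        ''|*[!0-9]*) echo "bad age: $1" >&2; return 1 ;;
    esac
    case $unit in
        y) echo $((num * 365)) ;;
        m) echo $((num * 30)) ;;
        w) echo $((num * 7)) ;;
        d) echo "$num" ;;
        *) echo "bad age: $1" >&2; return 1 ;;
    esac
}

# Split "OLDEST-NEWEST" (or just "OLDEST") and report both ends in days.
# With a single age, the newest end defaults to zero.
age_range_days() {
    case $1 in
        *-*) oldest=${1%%-*}; newest=${1#*-} ;;
        *)   oldest=$1; newest=0d ;;
    esac
    red=$(age_to_days "$oldest") || return 1
    green=$(age_to_days "$newest") || return 1
    echo "red at $red days or more, green at $green days or less"
}
```

              Calling age_range_days 2y-3m reports the two ends of the colour scale for the
              second example above; age_range_days 2y shows the green end defaulting to zero.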

       --address addr[:port]
              Specifies the network address and port number on which  agedu  should  listen  when
              running  its web server. If you want agedu to listen for connections coming in from
              any source, specify the address as the special value ANY. If  the  port  number  is
              omitted, an arbitrary unused port will be chosen for you and displayed.

              If you specify this option, agedu will not print its URL on standard output (since
              you are expected to know what address you told it to listen on).

       --auth auth-type
              Specifies how agedu should control access to the web pages it serves.  The  options
              are as follows:

              magic  This  option  only  works on Linux, and only when the incoming connection is
                     from the same machine that agedu is running on. On Linux, the  special  file
                     /proc/net/tcp  contains a list of network connections currently known to the
                     operating system kernel, including which user id created them. So agedu will
                     look  up each incoming connection in that file, and allow access if it comes
                     from the same user id under which agedu itself  is  running.  Therefore,  in
                     agedu's  normal  web  server  mode,  you  can  safely run it on a multi-user
                     machine and no other user will be able to read data out of your index file.

              basic  In this mode, agedu will use HTTP Basic authentication: the user  will  have
                     to  provide  a  username and password via their browser. agedu will normally
                     make up a username and password for the purpose, but you  can  specify  your
                     own; see below.

              none   In this mode, the web server is unauthenticated: anyone connecting to it has
                     full access to the reports generated by agedu. Do not do this  unless  there
                     is nothing confidential at all in your index file, or unless you are certain
                     that nobody but you can run processes on your computer.

              default
                     This is the default mode if you do not specify one of  the  above.  In  this
                     mode,  agedu  will  attempt  to  use  Linux  magic authentication, but if it
                     detects at startup time that /proc/net/tcp is absent or non-functional  then
                     it  will fall back to using HTTP Basic authentication and invent a user name
                     and password.
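
              The lookup that the magic mode relies on can be pictured with the following
              sketch, which finds the /proc/net/tcp entry for the client end of a connection and
              prints the user id that owns it. This is an illustration rather than agedu's
              actual code: conn_owner_uid is a hypothetical helper, it matches on port numbers
              only and ignores addresses, and it reads only the IPv4 table.

```shell
# Hypothetical helper (not agedu's code): print the uid owning the socket
# whose local port is $1 (the client's port) and whose remote port is $2
# (the port the server listens on). $3 names the table to read and
# defaults to /proc/net/tcp. Ports are given in decimal; the table stores
# them as four upper-case hex digits after the colon in each address.
conn_owner_uid() {
    client=$(printf '%04X' "$1")
    server=$(printf '%04X' "$2")
    awk -v c="$client" -v s="$server" '
        NR > 1 {                      # line 1 is the column header
            split($2, loc, ":")       # local_address = ADDR:PORT
            split($3, rem, ":")       # rem_address   = ADDR:PORT
            if (loc[2] == c && rem[2] == s) { print $8; exit }
        }' "${3:-/proc/net/tcp}"
}
```

              A server applying the same rule would compare the printed value with its own user
              id, for instance with [ "$(conn_owner_uid 54321 8080)" = "$(id -u)" ], where both
              port numbers are made-up examples.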

       --auth-file filename or --auth-fd fd
              When agedu is using HTTP Basic authentication, these options allow you  to  specify
              your  own  user  name  and password. If you specify --auth-file, these will be read
              from the specified file; if you specify --auth-fd they will instead be read from  a
              given  file  descriptor  which you should have arranged to pass to agedu. In either
              case, the authentication details should consist of  the  username,  followed  by  a
              colon,  followed  by the password, followed immediately by end of file (no trailing
              newline, or else it will be considered part of the password).
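
              Because a trailing newline would become part of the password, it is safest to
              create the file with printf rather than echo. A minimal sketch, in which the file
              name agedu-auth.txt and the credentials are placeholders:

```shell
# Placeholder credentials; substitute your own.
user=fred
password=example-secret

# Restrict permissions before writing, then emit "user:password"
# with no trailing newline (printf adds none unless asked to).
umask 077
printf '%s:%s' "$user" "$password" > agedu-auth.txt

# Show the byte count; it should equal the length of "user:password" exactly.
wc -c < agedu-auth.txt
```

              The file could then be passed to the web server with
              agedu -w --auth basic --auth-file agedu-auth.txt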

       --title title
              Specify the string that appears at the start of the <title> section of  the  output
              HTML  pages. The default is `agedu'. This title is followed by a colon and then the
              path you're viewing within the index file. You might use this option  if  you  were
              serving  agedu  reports for several different servers and wanted to make it clearer
              which one a user was looking at.

       --launch shell-command
              Specify a command to be run with the base URL of the web user interface,  once  the
              web server has started up. The command will be interpreted by /bin/sh, and the base
              URL will be appended to it as an extra argument word.

              A typical use for this would be `--launch=browse',  which  uses  the  XDG  `browse'
              command  to  automatically  open  the  agedu web interface in your default browser.
              However, other uses are possible: for example, you could provide  a  command  which
              communicates the URL to some other software that will use it for something.
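
              As an example of the second kind of use, the following sketch creates a small
              wrapper script that records the URL in a file instead of opening a browser. The
              script name save-url.sh and the output file agedu-url.txt are hypothetical; the
              only behaviour relied on is that the base URL arrives as an extra argument word.

```shell
# Create a hypothetical wrapper script for use with --launch.
cat > save-url.sh <<'EOF'
#!/bin/sh
# agedu appends the base URL as an extra argument word, so when the
# --launch command has no other arguments it is available here as $1.
printf '%s\n' "$1" > agedu-url.txt
EOF
chmod +x save-url.sh
```

              It would then be used as agedu -w --launch ./save-url.sh, after which other
              software can read the URL from agedu-url.txt.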

       --no-eof
              Stop  agedu  in  web server mode from looking for end-of-file on standard input and
              treating it as a signal to terminate.

LIMITATIONS

       The data file is pretty large. The core of agedu is the tree-based data structure it  uses
       in  its  index  in  order to efficiently perform the queries it needs; this data structure
       requires O(N log N) storage. This is larger than you might expect; a scan of my  own  home
       directory, containing half a million files and directories and about 20 GB of data,
       produced an index file over 60 MB in size. Furthermore, since the data file must be
       memory-mapped during most processing, it can never grow larger than the available
       address space, so a really big filesystem may need to be indexed on a 64-bit computer.
       (This is one reason for the existence of the -D and -L options: you can do the scanning
       on the machine with access to the filesystem, and the indexing on a machine big enough
       to handle it.)

       The data structure also does not usefully permit access control within the data  file,  so
       it  would  be  difficult  -  even given the willingness to do additional coding - to run a
       system-wide agedu scan on a cron job and serve the right subset of reports to each user.

       In certain circumstances, agedu can report false positives  (reporting  files  as  disused
       which  are  in fact in use) as well as the more benign false negatives (reporting files as
       in use which are not). This arises when a file is, semantically speaking,  `read'  without
       actually  being  physically  read. Typically this occurs when a program checks whether the
       file's mtime has changed and only bothers re-reading it if it has; programs which do  this
       include  rsync(1)  and  make(1). Such programs will fail to update the atime of unmodified
       files despite depending on their continued existence; a directory full of such files  will
       be reported as disused by agedu even in situations where deleting them will cause trouble.

       Finally, of course, agedu's normal usage mode depends critically on the OS providing last-
       access times which are at least approximately right. So a file system mounted with Linux's
       `noatime'  option,  or  the  equivalent  on  any  other  OS, will not give useful results!
       (However, the Linux mount option `relatime',  which  distributions  now  tend  to  use  by
       default,  should be fine for all but specialist purposes: it reduces the accuracy of last-
       access times so that they might be wrong by up to 24 hours,  but  if  you're  looking  for
       files that have been unused for months or years, that's not a problem.)
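
       Before trusting a report, it can be worth checking how the file system in question is
       mounted. The sketch below reads a mount table in the format of Linux's /proc/mounts and
       classifies the atime behaviour of one mount point. atime_mode is a hypothetical helper;
       it recognises only the noatime and relatime options by name, and its argument must be a
       mount point rather than an arbitrary path.

```shell
# Hypothetical helper: report the atime-related mount option for mount
# point $1, reading a table in /proc/mounts format ($2, defaulting to
# /proc/mounts). Prints "noatime", "relatime", or "other" (which also
# covers a mount point that is not listed at all).
atime_mode() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }        # keep the last matching entry
        END {
            mode = "other"
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "noatime" || o[i] == "relatime") mode = o[i]
            print mode
        }' "${2:-/proc/mounts}"
}
```

       For example, atime_mode /home prints noatime if the file system mounted at /home will
       not give agedu useful last-access times.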

LICENCE

       agedu is free software, distributed under the MIT licence. Type agedu --licence to see the
       full licence text.