Ubuntu Manpage: atop - Advanced System & Process Monitor

NAME

       atop - Advanced System & Process Monitor

SYNOPSIS

       Interactive Usage:

       atop  [-g|-m|-d|-n|-u|-p|-s|-c|-v|-o|-y] [-C|-M|-D|-N|-A] [-afFG1xR] [-L linelen] [-Plabel[,label]...]  [
       interval [ samples ]]

       Writing and reading raw logfiles:

       atop -w rawfile [-a] [-S] [ interval [ samples ]]
       atop -r [ rawfile ] [-b hh:mm ] [-e hh:mm ] [-g|-m|-d|-n|-u|-p|-s|-c|-v|-o|-y] [-C|-M|-D|-N|-A] [-fFG1xR]
       [-L linelen] [-Plabel[,label]...]

DESCRIPTION

The program atop is an interactive monitor to view the load on a Linux system. It shows the occupation
of the most critical hardware resources (from a performance point of view) on system level, i.e. cpu,
memory, disk and network.
It also shows which processes are responsible for the indicated load with respect to cpu and memory load
on process level. Disk load is shown per process if "storage accounting" is active in the kernel.
Network load is shown per process if the kernel module `netatop' has been installed.

Every interval (default: 10 seconds) information is shown about the resource occupation on system level
(cpu, memory, disks and network layers), followed by a list of processes which have been active during
the last interval. Processes that were unchanged during the last interval are not shown, unless the key
'a' has been pressed or the list is sorted on memory occupation. If the list of active processes does not
entirely fit on the screen, only the top of the list is shown (sorted in order of activity).
The intervals are repeated until the number of samples (specified as command argument) is reached, or
until the key 'q' is pressed in interactive mode.

When atop is started, it checks whether the standard output channel is connected to a screen, or to a
file/pipe. In the first case it produces screen control codes (via the ncurses library) and behaves
interactively; in the second case it produces flat ASCII-output.
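
The check described above can be sketched in Python. This is only an illustration, assuming the decision
is equivalent to a terminal test on standard output; the function name choose_output_mode is hypothetical
and not part of atop.

```python
import io
import sys


def choose_output_mode(stream=None):
    """Return 'interactive' when the stream is a terminal, otherwise 'flat'."""
    if stream is None:
        stream = sys.stdout
    try:
        is_terminal = stream.isatty()
    except (AttributeError, ValueError):
        # Objects without isatty(), or closed streams, count as non-terminals.
        is_terminal = False
    return "interactive" if is_terminal else "flat"


# An in-memory text buffer behaves like a file or pipe, not like a screen.
print(choose_output_mode(io.StringIO()))
```

Called without an argument, the function inspects sys.stdout, so the result changes when the output of the
script is redirected to a file or pipe.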

In interactive mode, the output of atop scales dynamically to the current dimensions of the
screen/window.
If the window is resized horizontally, columns will be added or removed automatically. For this purpose,
every column has a particular weight. The columns with the highest weights that fit within the current
width will be shown.
If the window is resized vertically, lines of the process/thread list will be added or removed
automatically.
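
The weight-based column selection can be sketched as follows. The selection rule shown here (take columns
in descending weight and stop at the first one that no longer fits) is an assumption, and the column
widths and weights are invented for illustration; they are not atop's real values.

```python
def select_columns(columns, width, separator=1):
    """Pick the highest-weight columns that fit within `width`.

    `columns` is a list of (name, column_width, weight) in display order.
    The result keeps the original display order.
    """
    by_weight = sorted(columns, key=lambda col: col[2], reverse=True)
    chosen = set()
    used = 0
    for name, col_width, _weight in by_weight:
        # Every column after the first needs a separator in front of it.
        needed = col_width + (separator if chosen else 0)
        if used + needed > width:
            break
        chosen.add(name)
        used += needed
    return [name for name, _width, _weight in columns if name in chosen]


# Hypothetical widths and weights, for illustration only.
COLUMNS = [
    ("PID", 5, 10),
    ("SYSCPU", 6, 8),
    ("USRCPU", 6, 8),
    ("VGROW", 6, 4),
    ("RGROW", 6, 4),
    ("CMD", 14, 9),
]

for screen_width in (20, 40, 80):
    print(screen_width, select_columns(COLUMNS, screen_width))
```

Widening the window lets lower-weight columns reappear, while the display order of the columns stays the
same.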

Furthermore, in interactive mode the output of atop can be controlled by pressing particular keys.
However, it is also possible to specify such a key as a flag on the command line. In that case atop
switches to the indicated mode beforehand; this mode can be modified again interactively. Specifying such
a key as a flag is especially useful when running atop with output to a pipe or file (non-interactively).
These flags are the same as the keys that can be pressed in interactive mode (see section INTERACTIVE
COMMANDS).
Additional flags are available to support storage of atop-data in raw format (see section RAW DATA
STORAGE).
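
As an illustration of non-interactive use, the following Python sketch builds an atop command line from a
key flag, an interval and a number of samples, following the SYNOPSIS. The helper names build_atop_argv
and run_atop are hypothetical; run_atop assumes that atop is installed and found via PATH.

```python
import subprocess


def build_atop_argv(flag=None, interval=None, samples=None):
    """Build an argument list such as ['atop', '-m', '5', '2']."""
    argv = ["atop"]
    if flag is not None:
        argv.append("-" + flag)
    if interval is not None:
        argv.append(str(interval))
        # A number of samples is only meaningful after an interval.
        if samples is not None:
            argv.append(str(samples))
    return argv


def run_atop(flag, interval, samples):
    """Run atop with its output on a pipe and return the flat ASCII text."""
    argv = build_atop_argv(flag, interval, samples)
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


print(build_atop_argv("m", 5, 2))
```

Because run_atop captures standard output through a pipe, atop is expected to produce flat ASCII output
rather than screen control codes.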

PROCESS ACCOUNTING

With every interval, atop reads the kernel administration to obtain information about all running
processes. However, it is likely that processes have also terminated during the interval. These
processes might have consumed system resources during this interval as well before they terminated.
Therefore, atop tries to read the process accounting records that contain the accounting information of
terminated processes and reports these processes too. The kernel writes such a process accounting record
to a file for every process that terminates, but only when the process accounting mechanism in the
kernel is activated.

There are various ways for atop to get access to the process accounting records (tried in this order):

1. When the environment variable ATOPACCT is set, it specifies the name of the process accounting file.
In that case, process accounting for this file should have been activated beforehand. Before
opening this file for reading, atop drops its root privileges (if any).
When this environment variable is present but its value is empty, process accounting will not be
used at all.

2. This is the preferred way of handling process accounting records!
When the atopacctd daemon is active, it has activated the process accounting mechanism in the kernel
and transfers the original accounting records to shadow files. In that case, atop drops its root
privileges and opens the current shadow file for reading.
This way is preferred, because the atopacctd daemon maintains full control of the sizes of the
original process accounting file (written by the kernel) and the shadow files (read by the atop
processes). For further information, refer to the atopacctd man page.

3. When the atopacctd daemon is not active, atop verifies if the process accounting mechanism has been
switched on via the separate psacct package. In that case, the file /var/account/pacct is in use as
process accounting file and atop opens this file for reading.

4. As a last possibility, atop itself tries to activate the process accounting mechanism (requires root
privileges) using the file /var/cache/atop.d/atop.acct (to be written by the kernel, to be read by
atop itself). Process accounting remains active as long as at least one atop process is alive.
Whenever the last atop process stops (either by pressing `q' or by `kill -15'), it deactivates the
process accounting mechanism again. Therefore you should never terminate atop by `kill -9', because
then it has no chance to stop process accounting. As a result, the accounting file may consume a lot
of disk space after a while.
To avoid that the process accounting file consumes too much disk space, atop verifies at the end of
every sample if the size of the process accounting file exceeds 200 MiB and if this atop process is
the only one that is currently using the file. In that case the file is truncated to a size of zero.

Notice that root-privileges are required to switch on/off process accounting in the kernel. You can
start atop as a root user or specify setuid-root privileges to the executable file. In the latter
case, atop switches on process accounting and drops the root-privileges again.
If atop does not run with root-privileges, it does not show information about finished processes. It
indicates this situation with the message `no procacct' in the top-right corner (instead of
the counter that shows the number of exited processes).
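
The order in which the four possibilities above are tried can be sketched as a small decision function.
This is a simplified model, not atop's code: the function name and its boolean parameters are
hypothetical, and the location of the shadow files is left open because it is not stated in this section.

```python
PSACCT_FILE = "/var/account/pacct"
ATOP_OWN_FILE = "/var/cache/atop.d/atop.acct"


def choose_accounting_source(environ, atopacctd_active, psacct_active, is_root):
    """Return (method, path) following the order described in the text."""
    # 1. ATOPACCT names the file; an empty value disables accounting entirely.
    if "ATOPACCT" in environ:
        path = environ["ATOPACCT"]
        if path == "":
            return ("none", None)
        return ("environment", path)
    # 2. Preferred: the shadow files maintained by the atopacctd daemon.
    if atopacctd_active:
        return ("atopacctd", None)  # shadow file path not given in this section
    # 3. Accounting already switched on via the psacct package.
    if psacct_active:
        return ("psacct", PSACCT_FILE)
    # 4. Last possibility: atop activates accounting itself (requires root).
    if is_root:
        return ("atop", ATOP_OWN_FILE)
    # Without root privileges, finished processes are not reported.
    return ("none", None)


print(choose_accounting_source({}, False, True, False))
```

In practice the environ argument would be os.environ; passing a plain dictionary keeps the sketch easy to
try out.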

When during one interval a lot of processes have finished, atop might grow tremendously in memory when
reading all process accounting records at the end of the interval. To avoid such excessive growth, atop
will never read more than 50 MiB with process information from the process accounting file per interval
(approx. 70000 finished processes). In interactive mode a warning is given whenever processes have been
skipped for this reason.
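
The two size limits mentioned in this section (200 MiB for truncating the accounting file, 50 MiB of
process information per interval) can be expressed in a short sketch. The function names are hypothetical,
and using the quoted figure of 70000 processes as a hard cap is a simplifying assumption.

```python
MIB = 1024 * 1024
TRUNCATE_LIMIT = 200 * MIB     # accounting file is truncated above this size
READ_LIMIT = 50 * MIB          # process information read per interval at most
APPROX_MAX_PROCESSES = 70000   # approximate figure quoted for the read limit


def should_truncate(file_size, other_atop_users):
    """True when the file exceeds 200 MiB and no other atop process uses it."""
    return file_size > TRUNCATE_LIMIT and other_atop_users == 0


def split_finished(finished):
    """Return (reported, skipped) for the processes finished in one interval."""
    reported = min(finished, APPROX_MAX_PROCESSES)
    return reported, finished - reported


# Rough amount of process information per finished process implied by the text.
print(READ_LIMIT // APPROX_MAX_PROCESSES, "bytes per process (approx.)")
print(split_finished(100000))
```

A non-zero number of skipped processes corresponds to the situation in which interactive mode shows a
warning.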

COLORS

For the resource consumption on system level, atop uses colors to indicate that a critical occupation
percentage has been (almost) reached. A critical occupation percentage means that it is likely that this
load causes a noticeable negative performance influence for applications using this resource. The
critical percentage depends on the type of resource: e.g. the performance influence of a disk with a busy
percentage of 80% might be more noticeable for applications/users than a CPU with a busy percentage of
90%.
Currently atop uses the following default values to calculate a weighted percentage per resource:

Processor
A busy percentage of 90% or higher is considered `critical'.

Disk
A busy percentage of 70% or higher is considered `critical'.

Network
A busy percentage of 90% or higher for the load of an interface is considered `critical'.

Memory
An occupation percentage of 90% is considered `critical'. Notice that this occupation percentage is
the accumulated memory consumption of the kernel (including slab) and all processes; the memory for
the page cache (`cache' and `buff' in the MEM-line) and the reclaimable part of the slab (`slrec')
is not included!
If the number of pages swapped out (`swout' in the PAG-line) is larger than 10 per second, the
memory resource is considered `critical'. A value of at least 1 per second is considered `almost
critical'.
If the committed virtual memory exceeds the limit (`vmcom' and `vmlim' in the SWP-line), the SWP-
line is colored due to overcommitting the system.

Swap
An occupation percentage of 80% is considered `critical' because swap space might be completely
exhausted in the near future; it is not critical from a performance point-of-view.

These default values can be modified in the configuration file (see separate man-page of atoprc).

When a resource exceeds its critical occupation percentage, the relevant values in the screen line are
colored red by default.
When a resource exceeds (default) 80% of its critical percentage (so it is almost critical), the
relevant values in the screen line are colored cyan by default. This `almost critical percentage' (one
value for all resources) can be modified in the configuration file (see separate man-page of atoprc).
The default colors red and cyan can be modified in the configuration file as well (see separate man-page
of atoprc).

With the key 'x' (or flag -x), the use of colors can be suppressed.
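
The default thresholds of this section can be summarized in a short Python sketch. It assumes that a value
equal to a threshold already counts as reached (the text says `or higher' for most resources) and ignores
the additional rules for committed virtual memory; the names used are hypothetical and not taken from
atop.

```python
# Default critical percentages per resource, as listed in this section.
CRITICAL_PERCENT = {"cpu": 90, "disk": 70, "network": 90, "memory": 90, "swap": 80}
ALMOST_CRITICAL = 80  # percentage of the critical value, one value for all resources


def classify(resource, busy_percent):
    """Return 'red', 'cyan' or None for a busy/occupation percentage."""
    critical = CRITICAL_PERCENT[resource]
    if busy_percent >= critical:
        return "red"
    if busy_percent >= critical * ALMOST_CRITICAL / 100:
        return "cyan"
    return None


def classify_swapout(pages_per_second):
    """Memory is critical above 10 swap-outs per second, almost critical from 1."""
    if pages_per_second > 10:
        return "red"
    if pages_per_second >= 1:
        return "cyan"
    return None


# With the defaults, a disk becomes almost critical at 56% busy (80% of 70%).
print(classify("disk", 56), classify("cpu", 56))
```

The example shows why the same busy percentage can be colored for a disk and left uncolored for a
processor.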

NETATOP MODULE

Per-process and per-thread network activity can be measured by the netatop kernel module. You can
download this kernel module from the website (mentioned at the end of this manual page) and install it on
your system if the kernel version is 2.6.24 or newer.
When atop gathers counters for a new interval, it verifies if the netatop module is currently active. If
so, atop obtains the relevant network counters from this module and shows the number of sent and received
packets per process/thread in the generic screen. Besides, detailed counters can be requested by pressing
the `n' key.
When the netatopd daemon is running as well, atop also reads the network counters of exited processes
that are logged by this daemon (comparable with process accounting).

More information about the optional netatop kernel module and the netatopd daemon can be found in the
corresponding man pages and on the website mentioned at the end of this manual page.

GPU STATISTICS GATHERING

GPU statistics can be gathered by atopgpud, which is a separate data collection daemon process. It
gathers cumulative utilization counters of every Nvidia GPU in the system, as well as utilization
counters of every process that uses a GPU. When atop notices that the daemon is active, it reads these
GPU utilization counters with every interval.

The atopgpud daemon is written in Python, so a Python interpreter should be installed on the target
system. The Python code of the daemon is compatible with Python version 2 and version 3. For the
gathering of the statistics, the pynvml module is used by the daemon. Be sure that this module is
installed on the target system before activating the daemon, by running the following command as root
(the command pip might have to be replaced by pip3 in case of Python 3):

         pip install nvidia-ml-py

The atopgpud daemon is installed by default as part of the atop package, but it is not automatically
enabled. The daemon can be enabled and started by running the following commands (as root):

         systemctl enable atopgpu
         systemctl start atopgpu
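
Before enabling the daemon, the presence of the pynvml module can be verified from Python. The following
is a minimal sketch for Python 3; the helper name module_available is hypothetical, and the check only
tells whether the module can be found, not whether a GPU or driver is usable.

```python
import importlib.util


def module_available(name):
    """Return True when a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None


if module_available("pynvml"):
    print("pynvml found")
else:
    print("pynvml missing; install nvidia-ml-py first")
```

The check should be run with the same Python interpreter that will run the daemon, because modules are
installed per interpreter.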

A description of the utilization counters can be found in the section OUTPUT DESCRIPTION.

INTERACTIVE COMMANDS

When running atop interactively (no output redirection), keys can be pressed to control the output. In
general, lower case keys can be used to show other information for the active processes and upper case
keys can be used to influence the sort order of the active process/thread list.

g Show generic output (default).

Per process the following fields are shown in case of a window-width of 80 positions: process-id,
cpu consumption during the last interval in system and user mode, the virtual and resident memory
growth of the process.

The subsequent columns depend on the kernel in use:
When the kernel supports "storage accounting" (>= 2.6.20), the data transfer for read/write on disk,
the status and exit code are shown for each process. When the kernel does not support "storage
accounting", the username, number of threads in the thread group, the status and exit code are
shown.
When the kernel module 'netatop' is loaded, the data transfer for send/receive of network packets is
shown for each process.
The last columns contain the state, the occupation percentage for the chosen resource (default: cpu)
and the process name.