Ubuntu Manpage: pcp-atop - Advanced System and Process Monitor

NAME

       pcp-atop - Advanced System and Process Monitor

SYNOPSIS

       Interactive Usage:

       pcp  [pcp options]  atop  [-aABcCdDfFgGHmMnNopRsuvxyY1]  [-L linelen] [-Plabel[,label]... [-Z]] [interval
       [samples]]

       Writing and reading PCP archive folios:

       pcp atop -w folio [-a] [-S] [interval [samples]]
       pcp atop -r folio [-AcCdDfFgGmMnNopRsuvxy1] [-b [yy-mm-dd] hh:mm] [-e [yy-mm-dd] hh:mm] [-L linelen]
       [-Plabel[,label]... [-Z]] [interval [samples]]

DESCRIPTION

       The  program  pcp-atop  is  an  interactive  monitor  to view various aspects of load on a system.  Every
       interval seconds (default: 10 seconds) information is gathered about the resource  occupation  on  system
       level  of  the  most  critical  hardware resources (from a performance point of view), i.e. CPUs, memory,
       disks and network interfaces. Besides, information is gathered about the processes (or threads) that  are
       responsible  for  the  utilization of the CPUs, memory and disks.  Network load per process is shown only
       when the optional pmdabpf(1) or pmdabcc(1) metrics have been installed and configured.

BAR GRAPH MODE

When running pcp-atop you can choose to view the system load in bar graph mode or in text mode. In bar
graph mode the resource utilization of CPUs, memory, disks and network interfaces is shown via
(character-based) bar graphs, but only on system level. When you want to view more detailed information
on system level or when you want to view the resource consumption on process or thread level, you can
switch to text mode by pressing the 'B' key. Pressing the 'B' key again switches from text mode back
to bar graph mode.
By default, pcp-atop starts in text mode unless the -B flag is used or unless 'B' has been configured as
a default flag in the .atoprc file (for further information about default flags, refer to the
pcp-atoprc(5) man page).

In bar graph mode the terminal will be subdivided into four character-based windows, i.e. one window for
each hardware resource:

Processors
The first bar shows the average busy percentage of all CPUs with the bar label 'Avg' (might be
abbreviated to 'Av' or even just 'A'). The subsequent bars show the busy percentages of single
CPUs.
When there is not enough horizontal space to show all CPUs, only the most busy CPUs per sample will
be shown after the width of each bar has been reduced to a minimum.

By default, the categories of CPU consumption are shown by different colors in the bars, marked with
a character 'S' (system mode), 'U' (user mode), 'I' (interrupt handling), 's' (steal) and 'G'
(guest, i.e. consumed by virtual machines).
The top of the bar might consist of an unmarked color representing a 'neutral' category. Suppose
that the scale unit is 5% per line and the total busy percentage is 54% consisting of two categories
of 27%. The two categories will be rounded to 25% (5 lines of 5% each) but the total busy
percentage will be rounded to 55% (11 lines of 5%). Then the top line will represent a 'neutral'
category.
By pressing the 'H' key or by starting pcp-atop with the -H flag, no categories are shown.
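
The rounding example above can be worked through in a few lines of Python. This is only a sketch: the function name bar_lines is hypothetical, and rounding each value to the nearest whole line (halves rounding up) is an assumption that happens to reproduce the numbers in the text.

```python
def bar_lines(categories, unit=5.0):
    """Split a bar into per-category lines plus 'neutral' lines at the top."""
    def to_lines(pct):
        # Assumption: round to the nearest whole line, halves rounding up.
        return int(pct / unit + 0.5)

    per_category = {name: to_lines(pct) for name, pct in categories.items()}
    total = to_lines(sum(categories.values()))
    # Lines that belong to the total but to no single category are 'neutral'.
    # (The opposite case, categories rounding to more lines than the total,
    # is not handled in this sketch.)
    neutral = max(0, total - sum(per_category.values()))
    return per_category, total, neutral


# Two categories of 27% each on a scale of 5% per line.
print(bar_lines({"S": 27.0, "U": 27.0}))  # ({'S': 5, 'U': 5}, 11, 1)
```

Each category occupies 5 lines, the whole bar 11 lines, so one unmarked line remains at the top.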

A red line is drawn in the bar graph as critical threshold. By default this value is 90% and can be
modified by the 'cpucritperc' option in the configuration file (see separate pcp-atoprc(5) man
page). When this value is set to zero, no threshold line will be drawn.

Memory and swap space
Memory is presented as a column in which the specific categories of memory consumption are shown.
These categories are (code, data and stack of) processes/kernel, slab caches (i.e. dynamically
allocated kernel memory), shared memory, tmpfs, static huge pages, page cache and free memory.
Swap space (if present) is also presented as a column in which the categories processes/tmpfs,
shared memory and free space are shown.

At the right side memory-related event counters are shown.
The bottom three counters are colored green when there is no memory pressure. When considerable
activity is noticed, such a counter might be colored orange, and with high activity red.
When memory pressure starts, usually memory page scanning will be activated first. When pressure
increases, memory pages of processes might be swapped out to swap space (if present).
The 'oomkills' counter (Out Of Memory killing) is most serious: it reflects the number of processes
that are killed due to lack of memory (and swap). Therefore this counter shows the absolute number
(not per second) of processes being killed during the last interval and will immediately be colored
red when it is 1 or more. Besides, after pcp-atop has noticed OOM killing, the 'oomkills' counter
remains orange for the next 15 minutes, in case you have missed the OOM killing event
itself.
When there is enough vertical space in the memory window, event counters are shown about the number
of memory pages being swapped in, the number of memory pages paged out to block devices and the
number of memory pages paged in from block devices.
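
The coloring rule for the 'oomkills' counter can be summarized as a small decision function. The sketch below is illustrative only: the function name and its arguments are hypothetical, and returning green when nothing has happened is an assumption based on the description of the bottom counters.

```python
OOM_MEMORY_SECONDS = 15 * 60  # the counter stays orange for 15 minutes


def oomkills_color(kills_in_interval, seconds_since_last_kill=None):
    """Return the color of the 'oomkills' counter for one sample."""
    if kills_in_interval >= 1:
        return "red"      # processes were killed during the last interval
    if (seconds_since_last_kill is not None
            and seconds_since_last_kill <= OOM_MEMORY_SECONDS):
        return "orange"   # a recent OOM kill that might have been missed
    return "green"        # assumption: no memory pressure noticed


print(oomkills_color(2))        # red
print(oomkills_color(0, 300))   # orange
print(oomkills_color(0, 3600))  # green
```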

Memory and swap space consumption will preferably be shown in a character-based window that
vertically uses the entire screen for optimal granularity. However, when there are a lot of disks
and/or network interfaces the memory and swap space consumption will be shown in a character-based
window that only uses the upper half of the screen.

Disks
For each disk the busy percentage is shown as a bar.
When there is not enough horizontal space to show all disks, only the most busy disks per sample
will be shown.

By default, categories of disk consumption are shown by different colors in the bars, marked with a
character 'R' (read) and 'W' (write).
The top of the bar might consist of an unmarked color representing a 'neutral' category. Suppose
that the scale unit is 5% per line and the total busy percentage is 54% consisting of two categories
of 27%. The two categories will be rounded to 25% (5 lines of 5% each) but the total busy
percentage will be rounded to 55% (11 lines of 5%). Then the top line will represent a 'neutral'
category.
By pressing the 'H' key or by starting pcp-atop with the -H flag, no categories are shown.

A red line is drawn in the bar graph as critical threshold. By default this value is 90% and can be
modified by the 'dskcritperc' option in the configuration file (see separate pcp-atoprc(5) man
page). When this value is set to zero, no threshold line will be drawn.

Interfaces
For each non-virtual network interface a double bar graph is shown with a dedicated scale that
reflects the traffic rate. One of the bars shows the transmit rate ('TX') and the other bar the
receive rate ('RX'). The traffic scale of each network interface remains at its highest level. All
interface scales can be reset during the measurement by pressing the 'L' key.
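
A scale that remains at its highest level behaves like a running maximum that is only cleared on request. The following sketch illustrates that idea; the class name PeakScale is hypothetical, and sharing one scale between the TX and RX bars of an interface is an assumption.

```python
class PeakScale:
    """Traffic scale of one interface that never shrinks until it is reset."""

    def __init__(self):
        self.peak = 0.0

    def update(self, tx_rate, rx_rate):
        # Keep the highest rate seen so far as the top of the scale.
        self.peak = max(self.peak, tx_rate, rx_rate)
        return self.peak

    def reset(self):
        # Corresponds to pressing the 'L' key.
        self.peak = 0.0


scale = PeakScale()
scale.update(120.0, 40.0)
scale.update(30.0, 10.0)
print(scale.peak)  # 120.0, the scale did not shrink
scale.reset()
print(scale.peak)  # 0.0
```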

Most often the real speed (maximum bandwidth) of network interfaces is not known, e.g. in case of
the network interfaces of virtual machines. Therefore it is not possible to show the interface
utilization as a percentage. However, when the real speed of an interface is known it will be shown
underneath the concerning bar graph.

When there is not enough horizontal space to show all network interfaces, only the most busy
interfaces per sample will be shown.

Usually the bar graphs will not be sorted on busy percentage when there is enough horizontal space.
However, after switching from text mode to bar graph mode the bar graphs might have been sorted because
this was needed for the presentation in text mode. The next interval in bar graph mode shows the bars
unsorted again unless the window width is insufficient for all bars.
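
The selection of bars when space is limited can be sketched as follows. The function name bars_to_show and the device names are hypothetical; the sketch only assumes that a fixed number of bars fits in the window.

```python
def bars_to_show(busy, max_bars):
    """Return the names of the bars to draw, given a busy percentage per name."""
    names = list(busy)
    if len(names) <= max_bars:
        # Enough horizontal space: keep the original (unsorted) order.
        return names
    # Not enough space: show only the most busy ones for this sample.
    return sorted(names, key=lambda name: busy[name], reverse=True)[:max_bars]


sample = {"sda": 10, "sdb": 80, "sdc": 45}
print(bars_to_show(sample, 3))  # ['sda', 'sdb', 'sdc']
print(bars_to_show(sample, 2))  # ['sdb', 'sdc']
```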

The remaining part of this manual page mainly describes the information shown in text mode. When certain
descriptions also apply to bar graph mode it will be mentioned explicitly.

TEXT MODE IN GENERAL

The initial screen in text mode shows if pcp-atop runs with restricted view (unprivileged user) or
unrestricted view (privileged user). In case of restricted view pcp-atop does not have the privileges
(root identity or necessary capabilities) to retrieve all counter values on system level and on process
level.

With every interval information is shown about the resource occupation on system level (CPU, memory,
disks and network layers), followed by a list of processes which have been active during the last
interval. Notice that all processes that were unchanged during the last interval are not shown, unless
the key 'a' has been pressed or unless sorting on memory occupation is done (then inactive processes are
relevant as well). If the list of active processes does not entirely fit on the screen, only the top of
the list is shown (sorted in order of activity).
The intervals are repeated till the number of samples (specified as command argument) is reached, or till
the key 'q' is pressed in interactive mode.

When invoked via the pcp(1) command, the PCPIntro(1) options -A/--align, -a/--archive, -h/--host,
-O/--origin, -S/--start, -s/--samples, -T/--finish, -t/--interval, -v/--version, -z/--hostzone and
-Z/--timezone become indirectly available. Additionally, the --hotproc option can be used to request that
the per-process PCP metrics be used instead of the default proc metrics from pmdaproc(1).

When pcp-atop is started, it checks whether the standard output channel is connected to a screen, or to a
file/pipe. In the first case it produces screen control codes (via the ncurses library) and behaves
interactively; in the second case it produces flat text output.
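
The same kind of check can be illustrated in Python with the standard isatty() method of a stream. This is a generic sketch of the technique, not the code used by pcp-atop; the function name output_mode is hypothetical.

```python
import io
import sys


def output_mode(stream=None):
    """Return 'interactive' for a terminal and 'flat' for a file or pipe."""
    stream = sys.stdout if stream is None else stream
    return "interactive" if stream.isatty() else "flat"


# An in-memory text stream is never a terminal, so flat text is chosen.
print(output_mode(io.StringIO()))  # flat
```

Called without an argument, the function inspects the standard output channel of the running program, so the result depends on whether the output has been redirected.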

In interactive mode, the output of pcp-atop scales dynamically to the current dimensions of the
screen/window.
If the window is resized horizontally, columns will be added or removed automatically. For this purpose,
every column has a particular weight. The columns with the highest weights that fit within the current
width will be shown.
If the window is resized vertically, lines of the process/thread list will be added or removed
automatically.
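
The weight-based column selection can be sketched as a simple greedy choice. All column names, widths and weights below are hypothetical, and stopping at the first column that no longer fits is an assumption; the real selection may differ in detail.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    width: int   # characters needed (hypothetical value)
    weight: int  # higher means more important (hypothetical value)


def visible_columns(columns, screen_width, gap=1):
    """Return the names of the columns to show, in their original order."""
    chosen = set()
    used = 0
    # Consider the columns with the highest weights first.
    for col in sorted(columns, key=lambda c: c.weight, reverse=True):
        needed = col.width + (gap if chosen else 0)
        if used + needed > screen_width:
            break  # assumption: stop at the first column that does not fit
        chosen.add(col.name)
        used += needed
    return [c.name for c in columns if c.name in chosen]


COLUMNS = [
    Column("PID", 6, 10),
    Column("SYSCPU", 7, 8),
    Column("USRCPU", 7, 7),
    Column("VGROW", 6, 3),
    Column("RGROW", 6, 2),
    Column("CMD", 12, 9),
]

print(visible_columns(COLUMNS, 40))  # ['PID', 'SYSCPU', 'USRCPU', 'CMD']
print(visible_columns(COLUMNS, 20))  # ['PID', 'CMD']
```

Widening the window to 80 positions would make all six columns of this example visible again.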

In interactive mode the output of pcp-atop can be controlled by pressing particular keys. However, it is
also possible to specify such a key as a flag on the command line. In that case pcp-atop switches to the
indicated mode beforehand. This mode can be modified again interactively. Specifying such a key as a flag
is especially useful when running pcp-atop with output to a pipe or file (non-interactively). These
flags are the same as the keys that can be pressed in interactive mode (see section INTERACTIVE
COMMANDS).
Additional flags are available to support storage of pcp-atop data in PCP archive format (see section PCP
DATA STORAGE).

COLORS

For the resource consumption on system level, pcp-atop uses colors in text mode to indicate that a
critical occupation percentage has been (almost) reached. A critical occupation percentage means that it is
likely that this load causes a noticeable negative performance influence for applications using this
resource. The critical percentage depends on the type of resource: e.g. the performance influence of a
disk with a busy percentage of 80% might be more noticeable for applications/users than a CPU with a busy
percentage of 90%.
Currently pcp-atop uses the following default values to calculate a weighted percentage per resource:

Processor
A busy percentage of 90% or higher is considered 'critical' (also in bar graph mode).

Disk
A busy percentage of 90% or higher is considered 'critical'.

Network
A busy percentage of 90% or higher for the load of an interface is considered 'critical'.

Memory
An occupation percentage of 90% is considered 'critical'. Notice that this occupation percentage is
the accumulated memory consumption of the kernel (including slab) and all processes. The memory for
the page cache ('cache' and 'buff' in the MEM-line) and the reclaimable part of the slab ('slrec')
is not included!
If the number of pages swapped out ('swout' in the PAG-line) is larger than 10 per second, the
memory resource is considered 'critical'. A value of at least 1 per second is considered 'almost
critical'.
If the committed virtual memory exceeds the limit ('vmcom' and 'vmlim' in the SWP-line), the SWP-
line is colored due to overcommitting the system.

Swap
An occupation percentage of 80% is considered 'critical' because swap space might be completely
exhausted in the near future. It is not critical from a performance point-of-view.

These default values can be modified in the configuration file (see separate pcp-atoprc(5) man page).
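
As an illustration of the swap-out rule described for the memory resource, the two thresholds can be written as a small function. The function name swapout_state is hypothetical and the thresholds are the default values quoted above.

```python
def swapout_state(swout_per_second):
    """Classify the memory resource by the number of pages swapped out per second."""
    if swout_per_second > 10:
        return "critical"         # larger than 10 per second
    if swout_per_second >= 1:
        return "almost critical"  # at least 1 per second
    return "normal"


print(swapout_state(25))   # critical
print(swapout_state(10))   # almost critical
print(swapout_state(0.2))  # normal
```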

When a resource exceeds its critical occupation percentage, the concerning values in the screen line are
colored red by default.
When a resource exceeds (by default) 80% of its critical percentage (so it is almost critical), the
concerning values in the screen line are colored cyan by default. This 'almost critical percentage' (one
value for all resources) can be modified in the configuration file (see separate pcp-atoprc(5) man page).
The default colors red and cyan can be modified in the configuration file as well (see separate man-page
of pcp-atoprc(5)).
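
One way to read the weighted percentage is to express the measured occupation relative to the critical percentage of the resource and to compare the result with 100% and with the 'almost critical percentage'. The sketch below follows that reading; it is an assumption about the calculation, and the function name color_for and the handling of a zero threshold are hypothetical.

```python
def color_for(busy_pct, critical_pct=90.0, almost_crit_pct=80.0):
    """Return the default color for one resource value."""
    if critical_pct <= 0:
        return "default"  # assumption: a zero threshold disables coloring
    # Express the measured percentage relative to the critical percentage.
    weighted = busy_pct * 100.0 / critical_pct
    if weighted >= 100.0:
        return "red"      # critical
    if weighted >= almost_crit_pct:
        return "cyan"     # almost critical
    return "default"


print(color_for(92.0))                     # red
print(color_for(72.0))                     # cyan, 72% is 80% of 90%
print(color_for(50.0))                     # default
print(color_for(64.0, critical_pct=80.0))  # cyan, e.g. swap space
```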

With the key 'x' (or flag -x), the use of colors can be suppressed in text mode. The use of colors is
however mandatory in case of bar graph mode.

NETATOP BPF MODULE

       Per-process and per-thread network activity can be measured by the netatop BPF module that can be
       separately installed with pmdabpf(1) or pmdabcc(1).
       When pcp-atop gathers counters for a new interval, it verifies if the eBPF module is currently active. If
       so, pcp-atop obtains the relevant network counters from this module and shows the number of sent and
       received packets per process/thread in the generic screen. Besides, detailed counters can be requested by
       pressing the 'n' key.

GPU STATISTICS GATHERING

       GPU statistics can be gathered by pmdanvidia(1) which is a separate data collection daemon process. It
       gathers cumulative utilization counters of every Nvidia GPU in the system, as well as utilization
       counters of every process that uses a GPU. When pcp-atop notices that the daemon is active, it reads these
       GPU utilization counters with every interval.

       Find a description about the utilization counters in the section OUTPUT DESCRIPTION.

INTERACTIVE COMMANDS

When running pcp-atop interactively (no output redirection), keys can be pressed to control the output.
In general, lower case keys can be used to show other information for the active processes while certain
upper case keys can be used to influence the sort order of the active process/thread list. Some of these
keys can also be used to switch from bar graph mode to particular detailed process information in text
mode.

g Show generic output (default).

Per process the following fields are shown in case of a window-width of 80 positions: process-id,
CPU consumption during the last interval in system and user mode, the virtual and resident memory
growth of the process.
The data transfer per process for read/write on disk can only be shown when pcp-atop accesses
metrics with root privileges.
When the optional pmdabpf(1) or pmdabcc(1) module netatop is loaded, the data transfer for
send/receive of network packets is shown for each process.
The last columns contain the state, the occupation percentage for the chosen resource (default: CPU)
and the process name.