Ubuntu Manpage: dggath, dgscat, gscat - convert distributed source graphs to or from centralized ones

Provided by: ptscotch_7.0.5-1ubuntu1_amd64

NAME

       dggath, dgscat, gscat - convert distributed source graphs to or from centralized ones

SYNOPSIS

       dggath [options] [igfile] [ogfile]

       dgscat [options] [igfile] [ogfile]

       gscat [options] [igfile] [ogfile]

DESCRIPTION

The dggath program gathers distributed graphs into centralized graphs. It reads a set of
files igfile representing fragments of a distributed source graph, and writes them back in
the form of a single centralized source graph ogfile.

The dgscat program scatters centralized source graphs into distributed graphs. It reads a
centralized source graph igfile and writes it back in the form of a set of files ogfile
representing fragments of the corresponding distributed source graph.

The gscat program does exactly the same as dgscat, but does not need to be run in a
parallel environment. Since gscat processes the input centralized graph file as a text
stream, it does not need to load the full graph in memory before building the distributed
graph fragment files. It therefore consumes far fewer resources, but cannot check graph
consistency, as it has no global view of the graph structure.

When file names are not specified, data is read from standard input and written to
standard output. Standard streams can also be explicitly represented by a dash '-'.

When the proper libraries have been included at compile time, dggath and dgscat can
directly handle compressed graphs, both as input and output. A stream is treated as
compressed whenever its name ends with a compressed file extension, such as in
'brol.grf.bz2' or '-.gz'. The compression formats that can be supported are the bzip2
format ('.bz2'), the gzip format ('.gz'), and the lzma format ('.lzma').
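
The suffix rule above can be illustrated with a small shell sketch. The function name
stream_format is hypothetical and is not part of Scotch; it only mimics the naming
convention described here and assumes that nothing but the trailing extension matters.

```shell
# Hypothetical helper (not part of Scotch): report which compression format
# a stream name would select, judging only by its trailing extension.
stream_format() {
    case "$1" in
        *.bz2)  echo bzip2 ;;
        *.gz)   echo gzip ;;
        *.lzma) echo lzma ;;
        *)      echo plain ;;   # no recognized extension: uncompressed stream
    esac
}

stream_format brol.grf.bz2   # prints "bzip2"
stream_format -.gz           # prints "gzip" (standard stream, gzip-compressed)
stream_format brol.grf       # prints "plain"
```

Whether a given format is actually usable still depends on the libraries included at
compile time, which this sketch does not check.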

dggath and dgscat rely on an implementation of the MPI interface to spread work across the
processing elements. They are therefore not likely to be run directly, but instead through
some launcher command such as mpirun.

DISTRIBUTED FILE NAMES

In order to tell whether programs should read from, or write to, a single file located on
only one processor, or to multiple instances of the same file on all of the processors, or
else to distinct files on each of the processors, a special grammar has been designed,
which is based on the '%' escape character. Four such escape sequences are defined, which
are interpreted independently on every processor, prior to file opening. By default, when
a filename is provided, it is assumed that the file is to be opened on only one of the
processors, called the root processor, which is usually process 0 of the communicator
within which the program is run. The index of the root processor can be changed by means
of the -r option. Using any of the first three escape sequences below will instruct
programs to open in parallel a file of name equal to the interpreted filename, on every
processor on which they are run.

%p Replaced by the number of processes in the global communicator in which the program
is run. Leads to parallel opening.

%r Replaced on each process running the program by the rank of this process in the
global communicator. Leads to parallel opening.

%- Discarded, but leads to parallel opening. This sequence is mainly used to instruct
programs to open on every processor a file of identical name. Depending on whether the
given path leads to a shared directory or to directories that are local to each
processor, this results either in the opening of multiple instances of the same file,
or in the opening of distinct files which may each have a different content,
respectively (in the latter case, it is strongly recommended to identify files by
means of the '%r' sequence instead).

%% Replaced by a single '%' character. File names using this escape sequence are not
considered for parallel opening, unless one or several of the three other escape
sequences are also present.

For instance, filename 'brol' will lead to the opening of file 'brol' on the root
processor only, filename '%-brol' (or even 'br%-ol') will lead to the parallel opening of
files called 'brol' on every processor, and filename 'brol%p-%r' will lead to the opening
of files 'brol2-0' and 'brol2-1', respectively, on each of the two processors on which the
program is run.
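
As a rough illustration of these rules, the following shell sketch expands a file name
pattern for a given process count and rank, and reports whether the name would be opened
in parallel. The function expand_name is hypothetical and is not part of Scotch; in
particular, copying an unrecognized '%' sequence verbatim is an assumption of this sketch,
not documented behavior.

```shell
# Hypothetical helper (not part of Scotch).
# Usage: expand_name PATTERN NPROCS RANK
# Prints the expanded name, then "yes" or "no" for parallel opening.
expand_name() {
    en_pat=$1 en_nprocs=$2 en_rank=$3
    en_out='' en_par=no
    while [ -n "$en_pat" ]; do
        case "$en_pat" in
            %p*) en_out=$en_out$en_nprocs; en_par=yes; en_pat=${en_pat#"%p"} ;;
            %r*) en_out=$en_out$en_rank;   en_par=yes; en_pat=${en_pat#"%r"} ;;
            %-*) en_par=yes;               en_pat=${en_pat#"%-"} ;;
            %%*) en_out=$en_out%;          en_pat=${en_pat#"%%"} ;;
            *)   # Ordinary character (assumption: a stray '%' is copied as-is).
                 en_rest=${en_pat#?}
                 en_out=$en_out${en_pat%"$en_rest"}
                 en_pat=$en_rest ;;
        esac
    done
    printf '%s %s\n' "$en_out" "$en_par"
}

expand_name 'brol' 2 0        # prints "brol no"
expand_name 'br%-ol' 2 1      # prints "brol yes"
expand_name 'brol%p-%r' 2 0   # prints "brol2-0 yes"
expand_name 'brol%p-%r' 2 1   # prints "brol2-1 yes"
```

Each process would evaluate the pattern with its own rank, which is why '%r' yields a
distinct name on every processor while '%-' yields the same name everywhere.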

OPTIONS

       -c     For dggath and dgscat only. Check the consistency of the input source  graph  after
              loading it into memory.

       -h     Display some help.

       -inbr  For  gscat  only.  Create an imbalanced distributed graph, in which the distributed
              graph files for all processes will receive nbr vertices of the  graph,  save
              for the file for the last process, which will receive all the remaining vertices.

       -rpnum Set root process for centralized files (default is 0).

       -V     Display program version and copyright.
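
       The vertex split produced by the -i option can be worked out by hand. The shell sketch
       below only reproduces that arithmetic; the function imbalanced_counts is hypothetical,
       is not part of Scotch, and assumes that nbr times the number of files minus one does
       not exceed the total number of vertices.

```shell
# Hypothetical helper (not part of Scotch).
# Usage: imbalanced_counts TOTAL_VERTICES NUM_FILES NBR
# Prints the number of vertices each fragment file would receive.
imbalanced_counts() {
    ic_total=$1 ic_files=$2 ic_nbr=$3
    ic_i=0
    # Every file except the last receives exactly NBR vertices.
    while [ "$ic_i" -lt $((ic_files - 1)) ]; do
        printf '%s ' "$ic_nbr"
        ic_i=$((ic_i + 1))
    done
    # The last file receives all the remaining vertices.
    printf '%s\n' $((ic_total - (ic_files - 1) * ic_nbr))
}

imbalanced_counts 10 4 2   # prints "2 2 2 4"
```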

EXAMPLE

       Run  dgscat  on  5  processing  elements to scatter centralized graph file brol.grf into 5
       gzipped file fragments brol5-0.dgr.gz to brol5-4.dgr.gz.

           $ mpirun -np 5 dgscat brol.grf brol%p-%r.dgr.gz

AUTHOR

       Francois Pellegrini <francois.pellegrini@labri.fr>

                                          05 August 2023                                dgscat(1)