oracular (7) orte_snapc.7.gz

Provided by: openmpi-doc_4.1.6-13.3ubuntu2_all bug

NAME

       ORTE_SNAPC  - Open RTE MCA Snapshot Coordination (SnapC) Framework: Overview of Open RTE's
       SnapC framework, and selected modules.  Open MPI 4.1.6

DESCRIPTION

       Open RTE can coordinate the generation of a global snapshot of a parallel  job  from  many
       distributed  local  snapshots. The components in this framework determine how to: Initiate
       the checkpoint of the parallel application, gather together  the  many  distributed  local
       snapshots,  and  provide the user with a global snapshot handle reference that can be used
       to restart the parallel application.

GENERAL PROCESS REQUIREMENTS

       In order for a process to use the Open RTE SnapC  components  it  must  adhear  to  a  few
       programmatic requirements.

       First,  the program must call ORTE_INIT early in its execution. This should only be called
       once, and it is not possible to checkpoint the process without it first having called this
       function.

       The program must call ORTE_FINALIZE before termination.

       A user may initiate a checkpoint of a parallel application by using the orte-checkpoint(1)
       and orte-restart(1) commands.

AVAILABLE COMPONENTS

       Open RTE ships with one SnapC component: full.

       The following MCA parameters apply to all components:

       snapc_base_verbose
           Set the verbosity level for all components. Default is 0, or silent except on error.

   full SnapC Component
       The full component gathers together the local snapshots to the disk local to the Head Node
       Process  (HNP)  before  completing  the checkpoint of the process. This component does not
       currently support replicated HNPs, or timer based gathering of local snapshot  data.  This
       is a 3-tiered hierarchy of coordinators.

       The full component has the following MCA parameters:

       snapc_full_priority
           The  component's  priority  to use when selecting the most appropriate component for a
           run.

       snapc_full_verbose
           Set the verbosity level for this component. Default is 0, or silent except on error.

   none SnapC Component
       The none component simply selects no SnapC component. All  of  the  SnapC  function  calls
       return immediately with ORTE_SUCCESS.

       This component is the last component to be selected by default. This means that if another
       component is available, and the none component was not explicity requested then ORTE  will
       attempt to activate all of the available components before falling back to this component.

SEE ALSO

         orte-checkpoint(1), orte-restart(1), opal-checkpoint(1), opal-restart(1), orte_filem(7),
       opal_crs(7)