xenial (1) condor_checkpoint.1.gz

Provided by: htcondor_8.4.2~dfsg.1-1build1_amd64 bug

Name

       condor_checkpoint send - a checkpoint command to jobs running on specified hosts

Synopsis

       condor_checkpoint [-help -version]

       condor_checkpoint[-debug]   [-pool   centralmanagerhostname[:portnumber]]   [-name  hostnamehostname-addr
       "<a.b.c.d:port>""<a.b.c.d:port>"-constraint expression-all]

Description

       condor_checkpoint sends a checkpoint command to a set of machines within a single pool. This  causes  the
       startd daemon on each of the specified machines to take a checkpoint of any running job that is executing
       under the standard universe. The job is temporarily stopped, a checkpoint is  taken,  and  then  the  job
       continues.  If  no  machine  is  specified,  then  the  command  is  sent  to the machine that issued the
       condor_checkpoint command.

       The command sent is a periodic checkpoint. The job  will  take  a  checkpoint,  but  then  the  job  will
       immediately  continue  running  after the checkpoint is completed. condor_vacate, on the other hand, will
       result in the job exiting (vacating) after it produces a checkpoint.

       If the job being checkpointed is running under the standard universe, the job produces a  checkpoint  and
       then  continues running on the same machine. If the job is running under another universe, or if there is
       currently no HTCondor job running on that host, then condor_checkpointhas no effect.

       There is generally no need for the user or administrator  to  explicitly  run  condor_checkpoint.  Taking
       checkpoints  of  running  HTCondor  jobs  is  handled  automatically following the policies stated in the
       configuration files.

Options

       -help

          Display usage information

       -version

          Display version information

       -debug

          Causes debugging information to be sent to  stderr , based on the value of the configuration  variable
          TOOL_DEBUG

       -pool centralmanagerhostname[:portnumber]

          Specify a pool by giving the central manager's host name and an optional port number

       -name hostname

          Send the command to a machine identified by hostname

       hostname

          Send the command to a machine identified by hostname

       -addr <a.b.c.d:port>

          Send the command to a machine's master located at "<a.b.c.d:port>"

       <a.b.c.d:port>

          Send the command to a machine located at "<a.b.c.d:port>"

       -constraint expression

          Apply this command only to machines matching the given ClassAd expression

       -all

          Send the command to all machines in the pool

Exit Status

       condor_checkpointwill  exit with a status value of 0 (zero) upon success, and it will exit with the value
       1 (one) upon failure.

Examples

       To send a condor_checkpoint command to two named machines:

       % condor_checkpoint   robin cardinal

       To send the condor_checkpointcommand to a machine within a pool of machines other than  the  local  pool,
       use  the -pooloption. The argument is the name of the central manager for the pool. Note that one or more
       machines within the pool must be specified as the targets for the command. This command sends the command
       to a the single machine named cae17within the pool of machines that has condor.cae.wisc.eduas its central
       manager:

       % condor_checkpoint  -pool condor.cae.wisc.edu -name cae17

Author

       Center for High Throughput Computing, University of Wisconsin-Madison

       Copyright (C) 1990-2015 Center for High Throughput Computing, Computer Sciences Department, University of
       Wisconsin-Madison, Madison, WI. All Rights Reserved. Licensed under the Apache License, Version 2.0.

                                                  February 2016                             condor_checkpoint(1)