Provided by: htcondor_8.4.2~dfsg.1-1build1_amd64 

Name
condor_dagman meta - scheduler of the jobs submitted as the nodes of a DAG or DAGs
Synopsis
condor_dagman -f-t-l .-help
condor_dagman-version
condor_dagman-f-l .-csdversion version_string[-debug level] [-maxidle numberOfProcs] [-maxjobs
numberOfJobs] [-maxpre NumberOfPreScripts] [-maxpost NumberOfPostScripts] [-noeventchecks]
[-allowlogerror] [-usedagdir] -lockfile filename[-waitfordebug] [-autorescue 0|1] [-dorescuefrom number]
[-allowversionmismatch] [-DumpRescue] [-verbose] [-force] [-notification value] [-suppress_notification]
[-dont_suppress_notification] [-dagman DagmanExecutable] [-outfile_dir directory] [-update_submit]
[-import_env] [-priority number] [-dont_use_default_node_log] [-DontAlwaysRunPost] [-DoRecovery] -dag
dag_file[-dag dag_file_2... -dag dag_file_n]
Description
condor_dagman is a meta scheduler for the HTCondor jobs within a DAG (directed acyclic graph) (or
multiple DAGs). In typical usage, a submitter of jobs that are organized into a DAG submits the DAG using
condor_submit_dag. condor_submit_dagdoes error checking on aspects of the DAG and then submits
condor_dagman as an HTCondor job. condor_dagman uses log files to coordinate the further submission of
the jobs within the DAG.
All command line arguments to the DaemonCorelibrary functions work for condor_dagman. When invoked from
the command line, condor_dagmanrequires the arguments -f -l .to appear first on the command line, to be
processed by DaemonCore. The csdversionmust also be specified; at start up, condor_dagmanchecks for a
version mismatch with the condor_submit_dagversion in this argument. The -targument must also be present
for the -helpoption, such that output is sent to the terminal.
Arguments to condor_dagmanare either automatically set by condor_submit_dagor they are specified as
command-line arguments to condor_submit_dagand passed on to condor_dagman. The method by which the
arguments are set is given in their description below.
condor_dagmancan run multiple, independent DAGs. This is done by specifying multiple -dag arguments. Pass
multiple DAG input files as command-line arguments to condor_submit_dag.
Debugging output may be obtained by using the -debug leveloption. Level values and what they produce is
described as
* level = 0; never produce output, except for usage info
* level = 1; very quiet, output severe errors
* level = 2; normal output, errors and warnings
* level = 3; output errors, as well as all warnings
* level = 4; internal debugging output
* level = 5; internal debugging output; outer loop debugging
* level = 6; internal debugging output; inner loop debugging; output DAG input file lines as they are
parsed
* level = 7; internal debugging output; rarely used; output DAG input file lines as they are parsed
Options
-help
Display usage information and exit.
-version
Display version information and exit.
-debug level
An integer level of debugging output. levelis an integer, with values of 0-7 inclusive, where 7 is the
most verbose output. This command-line option to condor_submit_dagis passed to condor_dagman or
defaults to the value 3.
-maxidle NumberOfProcs
Sets the maximum number of idle procs allowed before condor_dagman stops submitting more node jobs.
Note that for this argument, each individual proc within a cluster counts as a towards the limit,
which is inconsistent with -maxjobs.Once idle procs start to run, condor_dagman will resume submitting
jobs once the number of idle procs falls below the specified limit. NumberOfProcsis a non-negative
integer. If this option is omitted, the number of idle procs is limited by the configuration variable
DAGMAN_MAX_JOBS_IDLE (see 3.3.24), which defaults to 1000. To disable this limit, set NumberOfProcsto
0. Note that submit description files that queue multiple procs can cause the NumberOfProcslimit to be
exceeded. Setting queue 5000 in the submit description file, where -maxidleis set to 250 will result
in a cluster of 5000 new procs being submitted to the condor_schedd, not 250. In this case,
condor_dagman will resume submitting jobs when the number of idle procs falls below 250.
-maxjobs NumberOfClusters
Sets the maximum number of clusters within the DAG that will be submitted to HTCondor at one time.
Note that for this argument, each cluster counts as one job, no matter how many individual procs are
in the cluster. NumberOfClustersis a non-negative integer. If this option is omitted, the number of
clusters is limited by the configuration variable DAGMAN_MAX_JOBS_SUBMITTED (see 3.3.24), which
defaults to 0 (unlimited).
-maxpre NumberOfPreScripts
Sets the maximum number of PRE scripts within the DAG that may be running at one time.
NumberOfPreScriptsis a non-negative integer. If this option is omitted, the number of PRE scripts is
limited by the configuration variable DAGMAN_MAX_PRE_SCRIPTS (see 3.3.24), which defaults to 20.
-maxpost NumberOfPostScripts
Sets the maximum number of POST scripts within the DAG that may be running at one time.
NumberOfPostScriptsis a non-negative integer. If this option is omitted, the number of POST scripts is
limited by the configuration variable DAGMAN_MAX_POST_SCRIPTS (see 3.3.24), which defaults to 20.
-noeventchecks
This argument is no longer used; it is now ignored. Its functionality is now implemented by the
DAGMAN_ALLOW_EVENTS configuration variable.
-allowlogerror
This optional argument has condor_dagman try to run the specified DAG, even in the case of detected
errors in the job event log specification. As of version 7.3.2, this argument has an effect only on
DAGs containing Stork job nodes.
-usedagdir
This optional argument causes condor_dagman to run each specified DAG as if the directory containing
that DAG file was the current working directory. This option is most useful when running multiple DAGs
in a single condor_dagman .
-lockfile filename
Names the file created and used as a lock file. The lock file prevents execution of two of the same
DAG, as defined by a DAG input file. A default lock file ending with the suffix .dag.lock is passed
to condor_dagman by condor_submit_dag.
-waitfordebug
This optional argument causes condor_dagman to wait at startup until someone attaches to the process
with a debugger and sets the wait_for_debug variable in main_init() to false.
-autorescue 0|1
Whether to automatically run the newest rescue DAG for the given DAG file, if one exists (0 = false ,
1 = true ).
-dorescuefrom number
Forces condor_dagman to run the specified rescue DAG number for the given DAG. A value of 0 is the
same as not specifying this option. Specifying a nonexistent rescue DAG is a fatal error.
-allowversionmismatch
This optional argument causes condor_dagman to allow a version mismatch between condor_dagman itself
and the .condor.sub file produced by condor_submit_dag(or, in other words, between
condor_submit_dagand condor_dagman ). WARNING! This option should be used only if absolutely
necessary. Allowing version mismatches can cause subtle problems when running DAGs. (Note that,
starting with version 7.4.0, condor_dagman no longer requires an exact version match between itself
and the .condor.sub file. Instead, a "minimum compatible version" is defined, and any .condor.sub
file of that version or newer is accepted.)
-DumpRescue
This optional argument causes condor_dagman to immediately dump a Rescue DAG and then exit, as opposed
to actually running the DAG. This feature is mainly intended for testing. The Rescue DAG file is
produced whether or not there are parse errors reading the original DAG input file. The name of the
file differs if there was a parse error.
-verbose
(This argument is included only to be passed to condor_submit_dagif lazy submit file generation is
used for nested DAGs.) Cause condor_submit_dagto give verbose error messages.
-force
(This argument is included only to be passed to condor_submit_dagif lazy submit file generation is
used for nested DAGs.) Require condor_submit_dagto overwrite the files that it produces, if the files
already exist. Note that dagman.out will be appended to, not overwritten. If new-style rescue DAG
mode is in effect, and any new-style rescue DAGs exist, the -forceflag will cause them to be renamed,
and the original DAG will be run. If old-style rescue DAG mode is in effect, any existing old-style
rescue DAGs will be deleted, and the original DAG will be run. See the HTCondor manual section on
Rescue DAGs for more information.
-notification value
This argument is only included to be passed to condor_submit_dagif lazy submit file generation is used
for nested DAGs. Sets the e-mail notification for DAGMan itself. This information will be used within
the HTCondor submit description file for DAGMan. This file is produced by condor_submit_dag. The
notificationoption is described in the condor_submitmanual page.
-suppress_notification
Causes jobs submitted by condor_dagman to not send email notification for events. The same effect can
be achieved by setting the configuration variable DAGMAN_SUPPRESS_NOTIFICATION to True . This
command line option is independent of the -notificationcommand line option, which controls
notification for the condor_dagman job itself. This flag is generally superfluous, as
DAGMAN_SUPPRESS_NOTIFICATION defaults to True .
-dont_suppress_notification
Causes jobs submitted by condor_dagman to defer to content within the submit description file when
deciding to send email notification for events. The same effect can be achieved by setting the
configuration variable DAGMAN_SUPPRESS_NOTIFICATION to False . This command line flag is independent
of the -notificationcommand line option, which controls notification for the condor_dagman job itself.
If both -dont_suppress_notificationand -suppress_notificationare specified within the same command
line, the last argument is used.
-dagman DagmanExecutable
(This argument is included only to be passed to condor_submit_dagif lazy submit file generation is
used for nested DAGs.) Allows the specification of an alternate condor_dagman executable to be used
instead of the one found in the user's path. This must be a fully qualified path.
-outfile_dir directory
(This argument is included only to be passed to condor_submit_dagif lazy submit file generation is
used for nested DAGs.) Specifies the directory in which the .dagman.out file will be written. The
directorymay be specified relative to the current working directory as condor_submit_dagis executed,
or specified with an absolute path. Without this option, the .dagman.out file is placed in the same
directory as the first DAG input file listed on the command line.
-update_submit
(This argument is included only to be passed to condor_submit_dagif lazy submit file generation is
used for nested DAGs.) This optional argument causes an existing .condor.sub file to not be treated
as an error; rather, the .condor.sub file will be overwritten, but the existing values of -maxjobs,
-maxidle, -maxpre, and -maxpostwill be preserved.
-import_env
(This argument is included only to be passed to condor_submit_dagif lazy submit file generation is
used for nested DAGs.) This optional argument causes condor_submit_dagto import the current
environment into the environmentcommand of the .condor.sub file it generates.
-priority number
Sets the minimum job priority of node jobs submitted and running under this condor_dagman job.
-dont_use_default_node_log
This option is disabled as of HTCondor version 8.3.1.Tells condor_dagman to use the file specified by
the job ClassAd attribute UserLog to monitor job status. If this command line argument is used, then
the job event log file cannot be defined with a macro.
-DontAlwaysRunPost
This option causes condor_dagman to observe the exit status of the PRE script when deciding whether or
not to run the POST script. Versions of condor_dagman previous to HTCondor version 7.7.2 would not run
the POST script if the PRE script exited with a nonzero status, but this default has been changed such
that the POST script will run, regardless of the exit status of the PRE script. Using this option
restores the previous behavior, in which condor_dagman will not run the POST script if the PRE script
fails.
-DoRecovery
Causes condor_dagman to start in recovery mode. This means that it reads the relevant job user log(s)
and catches up to the given DAG's previous state before submitting any new jobs.
-dag filename
filenameis the name of the DAG input file that is set as an argument to condor_submit_dag, and passed
to condor_dagman .
Exit Status
condor_dagmanwill exit with a status value of 0 (zero) upon success, and it will exit with the value 1
(one) upon failure.
Examples
condor_dagmanis normally not run directly, but submitted as an HTCondor job by running condor_submit_dag.
See the condor_submit_dag manual page for examples.
Author
Center for High Throughput Computing, University of Wisconsin-Madison
Copyright
Copyright (C) 1990-2015 Center for High Throughput Computing, Computer Sciences Department, University of
Wisconsin-Madison, Madison, WI. All Rights Reserved. Licensed under the Apache License, Version 2.0.
February 2016 condor_dagman(1)