Provided by: pegasus-wms_4.4.0+dfsg-5_amd64 

NAME
pegasus-monitord - tracks a workflow's progress and mines information
SYNOPSIS
pegasus-monitord [--help|-h] [--verbose|-v]
[--adjust|-a i] [--foreground|-N]
[--no-daemon|-n] [--job|-j jobstate.log file]
[--log|-l logfile] [--conf properties file]
[--no-recursive] [--no-database | --no-events]
[--replay|-r] [--no-notifications]
[--notifications-max max_notifications]
[--notifications-timeout timeout]
[--sim|-s millisleep] [--db-stats]
[--skip-stdout] [--force|-f]
[--socket] [--output-dir | -o dir]
[--dest|-d PATH or URL] [--encoding|-e bp | bson]
DAGMan output file
DESCRIPTION
This program follows a workflow, parsing the output of DAGMan’s dagman.out file. In addition to
generating the jobstate.log file, pegasus-monitord can also be used to mine information from the workflow
dag file and jobs' submit and output files, and either populate a database or write a NetLogger events
file with that information. pegasus-monitord can also perform notifications when tracking a workflow’s
progress in real-time.
OPTIONS
-h, --help
Prints a usage summary with all the available command-line options.
-v, --verbose
Sets the log level for pegasus-monitord. If omitted, the default level will be set to WARNING. When
this option is given, the log level is changed to INFO. If this option is repeated, the log level
will be changed to DEBUG.
The log level in pegasus-monitord can also be adjusted interactively, by sending the USR1 and USR2
signals to the process, respectively for incrementing and decrementing the log level.
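As a minimal sketch of the interactive adjustment described above, the following Python helper sends the
signals to a running process. The function name adjust_log_level and its steps parameter are hypothetical
and not part of Pegasus; the sketch assumes a POSIX system and that the caller already knows the process
id of pegasus-monitord.

```python
import os
import signal


def adjust_log_level(pid: int, steps: int) -> None:
    """Send USR1 (steps > 0) or USR2 (steps < 0) to pid, abs(steps) times.

    Per the man page, USR1 increments and USR2 decrements the log level.
    This helper is a hypothetical convenience wrapper, not a Pegasus API.
    """
    sig = signal.SIGUSR1 if steps > 0 else signal.SIGUSR2
    for _ in range(abs(steps)):
        os.kill(pid, sig)
```

For example, adjust_log_level(pid, 1) would request one step more verbosity and adjust_log_level(pid, -1)
would undo it.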
-a i, --adjust i
Adjusts for time zone differences by i seconds. The default is 0.
-N, --foreground
Do not daemonize pegasus-monitord, but otherwise go through the motions as if it were daemonized (for
use under Condor).
-n, --no-daemon
Do not daemonize pegasus-monitord, keep it in the foreground (for debugging).
-j jobstate.log file, --job jobstate.log file
Alternative location for the jobstate.log file. The default is to write a jobstate.log in the
workflow directory. An absolute file name should only be used if the workflow does not have any
sub-workflows, as each sub-workflow will generate its own jobstate.log file. If an alternative,
non-absolute, filename is given with this option, pegasus-monitord will create one file in each
workflow (and sub-workflow) directory with the filename provided by the user with this option. If an
absolute filename is provided and sub-workflows are found, a warning message will be printed and
pegasus-monitord will not track any sub-workflows.
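The rules above for relative and absolute filenames can be illustrated with a short Python sketch. It
only mirrors the documented behavior and is not the pegasus-monitord implementation; the function name
plan_jobstate_logs and its arguments are hypothetical.

```python
import os


def plan_jobstate_logs(main_dir, sub_dirs, job_option=None):
    """Map each tracked workflow directory to its jobstate.log path.

    job_option stands for the value given to --job (None if omitted).
    Hypothetical illustration of the documented rules only.
    """
    name = job_option or "jobstate.log"
    if os.path.isabs(name):
        # Absolute filename: a single file is written and, as documented,
        # sub-workflows are not tracked (the real tool also prints a warning).
        return {main_dir: name}
    # Default or relative filename: one file per workflow and sub-workflow.
    return {d: os.path.join(d, name) for d in [main_dir, *sub_dirs]}
```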
--log logfile, --log-file logfile
Specifies an alternative logfile to use instead of the monitord.log file in the main workflow
directory. Unlike the jobstate.log file above, pegasus-monitord only generates one logfile
per execution (and not one per workflow and sub-workflow it tracks).
--conf properties_file
Specifies an alternative file containing properties in the key=value format, allowing users to override
values read from the braindump.txt file. This option has precedence over the properties file
specified in the braindump.txt file. Please note that these properties will apply not only to the
main workflow, but also to all sub-workflows found.
--no-recursive
This option prevents pegasus-monitord from automatically following any sub-workflows that are found.
--nodatabase, --no-database, --no-events
Turns off generating events (when this option is given, pegasus-monitord will only generate the
jobstate.log file). The default is to automatically log information to a SQLite database (see the
--dest option below for more details). This option overrides any parameter given by the --dest
option.
-r, --replay
This option is used to replay the output of an already finished workflow. It should only be used
after the workflow is finished (not necessarily successfully). If a jobstate.log file is found, it
will be rotated. However, when using a database, all previous references to that workflow (and all
its sub-workflows) will be erased from it. When outputting to a bp file, the file will be deleted.
When running in replay mode, pegasus-monitord will always run with the --no-daemon option, and any
errors will be output directly to the terminal. Also, pegasus-monitord will not process any
notifications while in replay mode.
--no-notifications
This option disables notifications completely, making pegasus-monitord ignore all the .notify files
for all workflows it tracks.
--notifications-max max_notifications
This option sets the maximum number of concurrent notifications that pegasus-monitord will start.
When the max_notifications limit is reached, pegasus-monitord will queue notifications and wait for a
pending notification script to finish before starting a new one. If max_notifications is set to 0,
notifications will be disabled.
--notifications-timeout timeout
Normally, pegasus-monitord will start a notification script and wait indefinitely for it to finish.
This option allows users to set up a maximum timeout that pegasus-monitord will wait for a
notification script to finish before terminating it. If notification scripts do not finish in a
reasonable amount of time, it can cause other notification scripts to be queued due to the maximum
number of concurrent scripts allowed by pegasus-monitord. Additionally, until all notification
scripts finish, pegasus-monitord will not terminate.
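The interaction between --notifications-max and --notifications-timeout can be sketched as a small
scheduler. This is an illustration of the documented semantics under stated assumptions, not the code
pegasus-monitord uses: the function run_notifications, its parameters, and the polling interval are all
hypothetical.

```python
import subprocess
import time
from collections import deque


def run_notifications(commands, max_notifications, timeout=None, poll=0.05):
    """Run commands with at most max_notifications running at once.

    Scripts exceeding timeout seconds are terminated. Returns a list of
    (command, returncode) pairs, with None for terminated scripts.
    Hypothetical sketch; not the pegasus-monitord implementation.
    """
    if max_notifications == 0:
        return []  # documented: a limit of 0 disables notifications
    pending = deque(commands)
    running = []  # entries are (command, process, start time)
    results = []
    while pending or running:
        # Start queued scripts while there is a free slot.
        while pending and len(running) < max_notifications:
            cmd = pending.popleft()
            running.append((cmd, subprocess.Popen(cmd), time.monotonic()))
        still_running = []
        for cmd, proc, started in running:
            if proc.poll() is not None:
                results.append((cmd, proc.returncode))
            elif timeout is not None and time.monotonic() - started > timeout:
                proc.terminate()  # script ran too long
                proc.wait()
                results.append((cmd, None))
            else:
                still_running.append((cmd, proc, started))
        running = still_running
        if running:
            time.sleep(poll)
    return results
```

Note that the function only returns once every script has finished or been terminated, which mirrors the
statement that pegasus-monitord will not terminate until all notification scripts finish.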
-s millisleep, --sim millisleep
This option simulates delays between reads, by sleeping millisleep milliseconds. This option is
mainly used by developers.
--db-stats
This option causes the database module to collect and print database statistics at the end of the
execution. It has no effect if the --no-database option is given.
--skip-stdout
This option causes pegasus-monitord not to populate jobs' stdout and stderr into the BP file or the
Stampede database. It should be used to avoid increasing the database size substantially in cases
where jobs are very verbose in their output.
-f, --force
This option causes pegasus-monitord to skip checking for another instance of itself already running
on the same workflow directory. The default behavior prevents two or more pegasus-monitord instances
from starting and running simultaneously (which would cause the bp file and database to be left in an
unstable state). This option should only be used when the user knows the previous instance of
pegasus-monitord is NOT running anymore.
--socket
This option causes pegasus-monitord to start a socket interface that can be used for advanced
debugging. The port number for connecting to pegasus-monitord can be found in the monitord.sock file
in the workflow directory (the file is deleted when pegasus-monitord finishes). If not already
started, the socket interface is also created when pegasus-monitord receives a USR1 signal.
-o dir, --output-dir dir
When this option is given, pegasus-monitord will create all its output files in the directory
specified by dir. This option is useful for allowing a user to debug a workflow in a directory where the
user does not have write permissions. In this case, all files generated by pegasus-monitord will have
the workflow wf_uuid as a prefix so that files from multiple sub-workflows can be placed in the same
directory. This option is mainly used by pegasus-analyzer. It is important to note that the location
for the output BP file or database is not changed by this option and should be set via the --dest
option.
-d URL params, --dest URL params
This option allows users to specify the destination for the log events generated by pegasus-monitord.
If this option is omitted, pegasus-monitord will create a SQLite database in the workflow’s run
directory with the same name as the workflow, but with a .stampede.db suffix. For an empty scheme,
params are a file path, with - meaning standard output. For an x-tcp scheme, params are
TCP_host[:port=14380]. For a database scheme, params are a SQLAlchemy engine URL with a database
connection string that can be used to specify different database engines. Please see the examples
section below for more information on how to use this option. Note that when using a database engine
other than sqlite, the necessary Python database drivers will need to be installed.
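To make the three kinds of destination concrete, here is a rough Python sketch that classifies a --dest
value by its scheme. The function classify_dest is hypothetical, and the x-tcp://host:port spelling used
for the TCP case is an assumption about how the scheme and params are combined; only the default port
14380 and the meaning of - come from the text above.

```python
from urllib.parse import urlparse

DEFAULT_TCP_PORT = 14380  # default port documented for the x-tcp scheme


def classify_dest(dest):
    """Return a (kind, detail) pair for a --dest value.

    Hypothetical helper for illustration; pegasus-monitord's own parsing
    may differ.
    """
    parsed = urlparse(dest)
    if parsed.scheme == "":
        # Empty scheme: a file path, with "-" meaning standard output.
        return ("stdout", None) if dest == "-" else ("file", dest)
    if parsed.scheme == "x-tcp":
        return ("tcp", (parsed.hostname, parsed.port or DEFAULT_TCP_PORT))
    # Any other scheme is treated as a SQLAlchemy engine URL.
    return ("database", dest)
```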
-e encoding, --encoding encoding
This option specifies how to encode log events. The two available possibilities are bp and bson. If
this option is not specified, events will be generated in the bp format.
DAGMan_output_file
The DAGMan_output_file is the only required command-line argument in pegasus-monitord and must have
the .dag.dagman.out extension.
RETURN VALUE
On success, pegasus-monitord returns with an exit code of 0. In case of error, it returns a non-zero
exit code, and the logfile should contain additional information about the error condition.
ENVIRONMENT VARIABLES
pegasus-monitord does not require that any environmental variables be set. It locates its required Python
modules based on its own location, and therefore should not be moved outside of Pegasus' bin directory.
EXAMPLES
Usually, pegasus-monitord is invoked automatically by pegasus-run and tracks the workflow progress in
real-time, producing the jobstate.log file and a corresponding SQLite database. When a workflow fails,
and is re-submitted with a rescue DAG, pegasus-monitord will automatically pick up where it left off
and continue writing to the jobstate.log file and the database.
If users need to create the jobstate.log file after a workflow is already finished, the --replay | -r
option should be used when running pegasus-monitord manually. For example:
$ pegasus-monitord -r diamond-0.dag.dagman.out
will launch pegasus-monitord in replay mode. In this case, if a jobstate.log file already exists, it will
be rotated and a new file will be created. If a diamond-0.stampede.db SQLite database already exists,
pegasus-monitord will purge all references to the workflow id specified in the braindump.txt file,
including all sub-workflows associated with that workflow id.
$ pegasus-monitord -r --no-database diamond-0.dag.dagman.out
will do the same thing, but without generating any log events.
$ pegasus-monitord -r --dest `pwd`/diamond-0.bp diamond-0.dag.dagman.out
will create the file diamond-0.bp in the current directory, containing NetLogger events with all the
workflow data. This is in addition to the jobstate.log file.
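When replay runs are scripted, the three invocations above can be assembled programmatically. The
following Python sketch builds the argument list without running anything; the function replay_command
is hypothetical, and it uses only the -r, --no-database, and --dest options and the .dag.dagman.out
requirement documented on this page.

```python
def replay_command(dagman_out, dest=None, no_database=False):
    """Build an argument list for running pegasus-monitord in replay mode.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    if not dagman_out.endswith(".dag.dagman.out"):
        raise ValueError("DAGMan output file must end in .dag.dagman.out")
    cmd = ["pegasus-monitord", "-r"]
    if no_database:
        # --no-database overrides --dest, so --dest is not added here.
        cmd.append("--no-database")
    elif dest is not None:
        cmd += ["--dest", dest]
    cmd.append(dagman_out)
    return cmd
```

The resulting list could be passed to subprocess.run on a machine where pegasus-monitord is installed.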
To use a database, users should provide a database connection string in the format:
dialect://username:password@host:port/database
Where dialect is the name of the underlying driver (mysql, sqlite, oracle, postgres) and database is the
name of the database running on the server at the host computer.
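A connection string in this format can be assembled with a few lines of Python. The helper
connection_string below is hypothetical, and percent-encoding the username and password is an assumption
based on general URL conventions; whether it is needed depends on the SQLAlchemy version in use. The port
3306 in the usage note is only an example value.

```python
from urllib.parse import quote


def connection_string(dialect, username, password, host, port, database):
    """Format dialect://username:password@host:port/database.

    Hypothetical helper; special characters in the credentials are
    percent-encoded so they cannot be confused with URL delimiters.
    """
    return "{}://{}:{}@{}:{}/{}".format(
        dialect,
        quote(username, safe=""),
        quote(password, safe=""),
        host,
        port,
        database,
    )
```

For instance, connection_string("mysql", "pegasus-user", "supersecret", "localhost", 3306, "diamond")
yields a value that could be passed to the --dest option.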
If users want to use a different SQLite database, pegasus-monitord requires them to specify the absolute
path of the alternate file. For example:
$ pegasus-monitord -r --dest sqlite:////home/user/diamond_database.db diamond-0.dag.dagman.out
Here are docs with details for all of the supported drivers:
http://www.sqlalchemy.org/docs/05/reference/dialects/index.html
Additional per-database options that work into the connection strings are outlined there.
Note that the appropriate database interface library must be installed. SQLAlchemy is a wrapper around
the database interface library (the MySQL one, for instance); it does not provide a MySQL driver itself.
The Pegasus distribution includes both SQLAlchemy and the SQLite Python driver.
As a final note, unlike with SQLite databases, when using SQLAlchemy with other database servers, e.g.
MySQL or Postgres, the target database needs to exist beforehand. So, if a user wanted to connect to:
mysql://pegasus-user:supersecret@localhost:localport/diamond
they would first need to connect to the server at localhost and issue the appropriate create database
command before running pegasus-monitord. SQLAlchemy will then take care of creating the tables and indexes
if they do not already exist.
SEE ALSO
pegasus-run(1)
AUTHORS
Gaurang Mehta <gmehta at isi dot edu>
Fabio Silva <fabio at isi dot edu>
Karan Vahi <vahi at isi dot edu>
Jens-S. Vöckler <voeckler at isi dot edu>
Pegasus Team http://pegasus.isi.edu
Pegasus 4.4.0 02/12/2016 PEGASUS-MONITORD(1)