Provided by: toil_3.24.0-1_all

NAME

       toil - Toil Documentation

       Toil is an open-source pure-Python workflow engine that lets people write better pipelines.

       Check  out  our website for a comprehensive list of Toil's features and read our paper to learn what Toil
       can do in the real world.  Please subscribe to our low-volume announce mailing list and feel free to also
       join us on GitHub and Gitter.

       If using Toil for your research, please cite
          Vivian,  J.,  Rao,  A.  A., Nothaft, F. A., Ketchum, C., Armstrong, J., Novak, A., … Paten, B. (2017).
          Toil enables reproducible, open source, big biomedical data  analyses.  Nature  Biotechnology,  35(4),
          314–316.  http://doi.org/10.1038/nbt.3772

QUICKSTART EXAMPLES

   Running a basic workflow
       A Toil workflow can be run with just two steps:

       1. Copy and paste the following code block into a new file called helloWorld.py:

          from toil.common import Toil
          from toil.job import Job

          def helloWorld(message, memory="1G", cores=1, disk="1G"):
              return "Hello, world!, here's a message: %s" % message

          if __name__ == "__main__":
              parser = Job.Runner.getDefaultArgumentParser()
              options = parser.parse_args()
              options.clean = "always"
              with Toil(options) as toil:
                  output = toil.start(Job.wrapFn(helloWorld, "You did it!"))
              print(output)

       2. Specify the name of the job store and run the workflow:

             python helloWorld.py file:my-job-store

       Congratulations! You've run your first Toil workflow with the default batch system, singleMachine, and
       the file job store.

       Toil uses batch systems to manage the jobs it creates.

       The singleMachine batch system is primarily used to prepare and debug workflows on a local machine.  Once
       validated,  try  running  them  on a full-fledged batch system (see batchsysteminterface).  Toil supports
       many different batch systems such as Apache Mesos and Grid Engine; its versatility makes it easy  to  run
       your workflow in all kinds of places.

       Toil  is  totally  customizable!  Run  python  helloWorld.py  --help  to see a complete list of available
       options.

       For something beyond a "Hello, world!" example, refer to A (more) real-world example.

   Running a basic CWL workflow
       The Common Workflow Language (CWL) is an emerging standard for writing workflows that are portable across
       multiple workflow engines and platforms.  Running CWL workflows using Toil is easy.

       1. Copy and paste the following code block into example.cwl:

             cwlVersion: v1.0
             class: CommandLineTool
             baseCommand: echo
             stdout: output.txt
             inputs:
               message:
                 type: string
                 inputBinding:
                   position: 1
             outputs:
               output:
                 type: stdout

          and this code into example-job.yaml:

             message: Hello world!

       2. To run the workflow, simply enter:

             $ toil-cwl-runner example.cwl example-job.yaml

          Your output will be in output.txt:

             $ cat output.txt
             Hello world!

       To learn more about CWL, see the CWL User Guide (from where this example was shamelessly borrowed).

       To run this workflow on an AWS cluster have a look at Running a CWL Workflow on AWS.

       For information on using CWL with Toil, see the section cwl.

   Running a basic WDL workflow
       The Workflow Description Language (WDL) is another emerging language for writing workflows that are
       portable across multiple workflow engines and platforms.  Support for running WDL workflows in Toil is
       still in alpha and should be considered experimental.  Toil currently supports basic workflow syntax (see
       wdl for more details and examples).  Here we go over running a basic WDL helloworld workflow.

       1. Copy and paste the following code block into wdl-helloworld.wdl:

             workflow write_simple_file {
               call write_file
             }
             task write_file {
               String message
               command { echo ${message} > wdl-helloworld-output.txt }
               output { File test = "wdl-helloworld-output.txt" }
             }

          and this code into wdl-helloworld.json:

             {
               "write_simple_file.write_file.message": "Hello world!"
             }

       2. To run the workflow, simply enter:

             $ toil-wdl-runner wdl-helloworld.wdl wdl-helloworld.json

          Your output will be in wdl-helloworld-output.txt:

             $ cat wdl-helloworld-output.txt
             Hello world!

       To learn more about WDL, see the main WDL website.

   A (more) real-world example
       For a more detailed example and explanation, we've developed a sample pipeline that merge-sorts a
       temporary file. This is not meant to be an efficient sorting program, but rather a more fully worked
       example of what Toil is capable of.

   Running the example
       1. Download the example code

       2. Run it with the default settings:

             $ python sort.py file:jobStore

          The workflow created a file called sortedFile.txt in your current directory.  Have a look  at  it  and
          notice that it contains a whole lot of sorted lines!

          This  workflow  does  a  smart  merge  sort  on a file it generates, fileToSort.txt. The sort is smart
          because each step of the process---splitting the file into separate chunks, sorting these chunks,  and
          merging  them  back  together---is compartmentalized into a job. Each job can specify its own resource
          requirements and will only be run after the jobs it depends upon have run. Jobs  without  dependencies
          will be run in parallel.

       NOTE:
          Delete  fileToSort.txt before moving on to #3. This example introduces options that specify dimensions
          for fileToSort.txt, if it does not already exist. If it exists, this workflow will  use  the  existing
          file and the results will be the same as #2.

       3. Run with custom options:

             $ python sort.py file:jobStore --numLines=5000 --lineLength=10 --overwriteOutput=True --workDir=/tmp/

          Here  we  see that we can add our own options to a Toil script. As noted above, the first two options,
          --numLines and --lineLength, determine the number of lines and how many characters are in  each  line.
          --overwriteOutput  causes  the  current  contents  of  sortedFile.txt to be overwritten, if it already
          exists.  The last option, --workDir, is an option built into Toil to  specify  where  temporary  files
          unique to a job are kept.
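
       The split, sort, and merge steps described above can be sketched in plain Python without Toil. This is
       only an illustration of the recursion, not the code in sort.py: the name merge_sort_lines and its
       parameter n are made up for this sketch, the threshold counts lines rather than bytes, and everything
       runs in a single process instead of as separate jobs.

```python
import heapq


def merge_sort_lines(lines, n):
    """Sort a list of lines by recursive splitting (illustrative only).

    n is a threshold: lists of at most n lines are sorted directly;
    longer lists are split in two, sorted recursively, and merged.
    """
    if n < 1:
        raise ValueError("Invalid value of n: %s" % n)
    if len(lines) <= n:
        # Base case: small enough to sort in one step.
        return sorted(lines)
    mid = len(lines) // 2
    left = merge_sort_lines(lines[:mid], n)   # one half, sorted on its own
    right = merge_sort_lines(lines[mid:], n)  # the other half, sorted on its own
    # The merge can only happen once both halves are sorted.
    return list(heapq.merge(left, right))


data = ["pear\n", "apple\n", "fig\n", "kiwi\n", "date\n"]
result = merge_sort_lines(data, 2)
print("".join(result), end="")
```

       In the Toil workflow each of the three steps becomes a job, so the two halves can be sorted at the same
       time on different workers.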

   Describing the source code
       To understand the details of what's going on inside, let's start with the main() function. It looks like
       a lot of code, but don't worry---we'll break it down piece by piece.

          def main(options=None):
              if not options:
                  # deal with command line arguments
                  parser = ArgumentParser()
                  Job.Runner.addToilOptions(parser)
                  parser.add_argument('--numLines', default=defaultLines, help='Number of lines in file to sort.', type=int)
                  parser.add_argument('--lineLength', default=defaultLineLen, help='Length of lines in file to sort.', type=int)
                  parser.add_argument("--fileToSort", help="The file you wish to sort")
                  parser.add_argument("--outputFile", help="Where the sorted output will go")
                  parser.add_argument("--overwriteOutput", help="Write over the output file if it already exists.", default=True)
                  parser.add_argument("--N", dest="N",
                                      help="The threshold below which a serial sort function is used to sort file. "
                                           "All lines must be of length less than or equal to N or program will fail",
                                      default=10000)
                  parser.add_argument('--downCheckpoints', action='store_true',
                                      help='If this option is set, the workflow will make checkpoints on its way through '
                                           'the recursive "down" part of the sort')
                  parser.add_argument("--sortMemory", dest="sortMemory",
                                  help="Memory for jobs that sort chunks of the file.",
                                  default=None)

                  parser.add_argument("--mergeMemory", dest="mergeMemory",
                                  help="Memory for jobs that collate results.",
                                  default=None)

                  options = parser.parse_args()
              if not hasattr(options, "sortMemory") or not options.sortMemory:
                  options.sortMemory = sortMemory
              if not hasattr(options, "mergeMemory") or not options.mergeMemory:
                  options.mergeMemory = sortMemory

              # do some input verification
              sortedFileName = options.outputFile or "sortedFile.txt"
              if not options.overwriteOutput and os.path.exists(sortedFileName):
                  print("the output file {} already exists. Delete it to run the sort example again or use --overwriteOutput=True".format(sortedFileName))
                  exit()

              fileName = options.fileToSort
              if options.fileToSort is None:
                  # make the file ourselves
                  fileName = 'fileToSort.txt'
                  if os.path.exists(fileName):
                      print("Sorting existing file: {}".format(fileName))
                  else:
                      print('No sort file specified. Generating one automatically called: {}.'.format(fileName))
                      makeFileToSort(fileName=fileName, lines=options.numLines, lineLen=options.lineLength)
              else:
                  if not os.path.exists(options.fileToSort):
                      raise RuntimeError("File to sort does not exist: %s" % options.fileToSort)

              if int(options.N) <= 0:
                  raise RuntimeError("Invalid value of N: %s" % options.N)

              # Now we are ready to run
              with Toil(options) as workflow:
                  sortedFileURL = 'file://' + os.path.abspath(sortedFileName)
                  if not workflow.options.restart:
                      sortFileURL = 'file://' + os.path.abspath(fileName)
                      sortFileID = workflow.importFile(sortFileURL)
                      sortedFileID = workflow.start(Job.wrapJobFn(setup, sortFileID, int(options.N), options.downCheckpoints, options=options,
                                                              memory=sortMemory))
                  else:
                      sortedFileID = workflow.restart()
                  workflow.exportFile(sortedFileID, sortedFileURL)

       First we make a parser to process command line arguments using the argparse module. It's  important  that
       we  add  the  call  to  Job.Runner.addToilOptions()  to  initialize our parser with all of Toil's default
       options. Then we add the command line arguments unique to this workflow, and parse the  input.  The  help
       message listed with the arguments should give you a pretty good idea of what they can do.
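
       One detail of the parser above deserves a closer look: --overwriteOutput is declared without a type, so
       argparse stores whatever follows the flag as a string, and any non-empty string (including "False") is
       truthy. The sketch below shows that behaviour with the standard argparse module, together with one
       possible converter. str_to_bool is a hypothetical helper for illustration; it is not part of Toil or of
       sort.py.

```python
from argparse import ArgumentParser, ArgumentTypeError


def str_to_bool(text):
    """Hypothetical converter: map common spellings to a real bool."""
    lowered = text.strip().lower()
    if lowered in ("true", "yes", "1"):
        return True
    if lowered in ("false", "no", "0"):
        return False
    raise ArgumentTypeError("expected a boolean, got %r" % text)


# Declared as in main(): no type, so the value stays a string.
plain = ArgumentParser()
plain.add_argument("--overwriteOutput", default=True)

# The same option with an explicit converter.
typed = ArgumentParser()
typed.add_argument("--overwriteOutput", default=True, type=str_to_bool)

plain_value = plain.parse_args(["--overwriteOutput=False"]).overwriteOutput
typed_value = typed.parse_args(["--overwriteOutput=False"]).overwriteOutput

print(repr(plain_value), bool(plain_value))  # a non-empty string is truthy
print(repr(typed_value))
```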

       Next  we  do  a  little bit of verification of the input arguments. The option --fileToSort allows you to
       specify a file that needs to be sorted. If this option isn't given, it's here that we make our  own  file
       with the call to makeFileToSort().

       Finally  we come to the context manager that initializes the workflow. We create a path to the input file
       prepended with 'file://' as per the documentation for toil.common.Toil() when  staging  a  file  that  is
       stored  locally.  Notice that we have to check whether or not the workflow is restarting so that we don't
       import the file more than once.  Finally we can kick off the workflow by calling toil.common.Toil.start()
       on the job setup. When the workflow ends we capture its output (the sorted file's fileID) and use that in
       toil.common.Toil.exportFile() to move the sorted file from the job store back into "userland".
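
       Building the 'file://' URL is plain string handling and can be tried without Toil. The following is a
       small sketch assuming a POSIX-style file system; to_file_url is a made-up name that mirrors the
       expression used in main().

```python
import os
from pathlib import Path


def to_file_url(name):
    """Return a file:// URL for a local path, built the way main() builds it."""
    return "file://" + os.path.abspath(name)


url = to_file_url("sortedFile.txt")
print(url)

# pathlib offers an alternative that also percent-encodes special characters.
print(Path("sortedFile.txt").resolve().as_uri())
```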

       Next let's look at the job that begins the actual workflow, setup.

          def setup(job, inputFile, N, downCheckpoints, options):
              """
              Sets up the sort.
              Returns the FileID of the sorted file
              """
              RealtimeLogger.info("Starting the merge sort")
              return job.addChildJobFn(down,
                                       inputFile, N, 'root',
                                       downCheckpoints,
                                       options = options,
                                       preemptable=True,
                                       memory=sortMemory).rv()

       setup really only does two things. First it writes to the logs using RealtimeLogger.info() and then calls
       addChildJobFn(). Child jobs run directly after the current job. This function turns the 'job function'
       down into an actual job and passes in the inputs including an optional resource requirement, memory.  The
       call to Job.rv() does not run the job; it returns a promise for down's return value. Once the job down
       finishes, the promise is fulfilled and its output is returned here.

       Now we can look at what down does.

          def down(job, inputFileStoreID, N, path, downCheckpoints, options, memory=sortMemory):
              """
              Input is a file, a subdivision size N, and a path in the hierarchy of jobs.
              If the range is larger than a threshold N the range is divided recursively and
              a follow on job is then created which merges back the results else
              the file is sorted and placed in the output.
              """

              RealtimeLogger.info("Down job starting: %s" % path)

              # Read the file
              inputFile = job.fileStore.readGlobalFile(inputFileStoreID, cache=False)
              length = os.path.getsize(inputFile)
              if length > N:
                  # We will subdivide the file
                  RealtimeLogger.critical("Splitting file: %s of size: %s"
                          % (inputFileStoreID, length))
                  # Split the file into two copies
                  midPoint = getMidPoint(inputFile, 0, length)
                  t1 = job.fileStore.getLocalTempFile()
                  with open(t1, 'w') as fH:
                      fH.write(copySubRangeOfFile(inputFile, 0, midPoint+1))
                  t2 = job.fileStore.getLocalTempFile()
                  with open(t2, 'w') as fH:
                      fH.write(copySubRangeOfFile(inputFile, midPoint+1, length))
                  # Call down recursively. By giving the rv() of the two jobs as inputs to the follow-on job, up,
                  # we communicate the dependency without hindering concurrency.
                  result = job.addFollowOnJobFn(up,
                                              job.addChildJobFn(down, job.fileStore.writeGlobalFile(t1), N, path + '/0',
                                                                downCheckpoints, checkpoint=downCheckpoints, options=options,
                                                                preemptable=True, memory=options.sortMemory).rv(),
                                              job.addChildJobFn(down, job.fileStore.writeGlobalFile(t2), N, path + '/1',
                                                                downCheckpoints, checkpoint=downCheckpoints, options=options,
                                                                preemptable=True, memory=options.mergeMemory).rv(),
                                              path + '/up', preemptable=True, options=options, memory=options.sortMemory).rv()
              else:
                  # We can sort this bit of the file
                  RealtimeLogger.critical("Sorting file: %s of size: %s"
                          % (inputFileStoreID, length))
                  # Sort the copy and write back to the fileStore
                  shutil.copyfile(inputFile, inputFile + '.sort')
                  sort(inputFile + '.sort')
                  result = job.fileStore.writeGlobalFile(inputFile + '.sort')

              RealtimeLogger.info("Down job finished: %s" % path)
              return result

       Down is the recursive part of the workflow. First we read the file into the local  filestore  by  calling
       job.fileStore.readGlobalFile().  This  puts  a copy of the file in the temp directory for this particular
       job. This storage will disappear once this job ends. For a detailed explanation  of  the  filestore,  job
       store, and their interfaces have a look at managingFiles.

       Next down checks the base case of the recursion: is the size of the input file no greater than N
       (remember N was an option we added to the workflow in main)? In the base case, we just sort the file and
       return the file ID of this new sorted file.

       If    the    base   case   fails,   then   the   file   is   split   into   two   new   tempFiles   using
       job.fileStore.getLocalTempFile() and the helper function copySubRangeOfFile. Finally we add a  follow  on
       Job  up  with  job.addFollowOnJobFn().  We've already seen child jobs. A follow-on Job is a job that runs
       after the current job and all of its children (and their children and follow-ons) have completed. Using a
       follow-on makes sense because up is responsible for merging the files together and we don't want to merge
       the files together until we know they are sorted. Again,  the  return  value  of  the  follow-on  job  is
       requested using Job.rv().
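
       The ordering rule for children and follow-ons can be illustrated with a toy model. The sketch below is
       not Toil's scheduler: ToyJob and run are invented for this example, and the toy runs everything serially,
       whereas Toil is free to run independent children in parallel. It only shows that a follow-on starts
       after the job's children, and everything below them, have finished.

```python
class ToyJob:
    """Toy stand-in for a job; this is not Toil's Job class."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.follow_ons = []

    def add_child(self, job):
        self.children.append(job)
        return job

    def add_follow_on(self, job):
        self.follow_ons.append(job)
        return job


def run(job, order):
    """Run the job, then its children, then its follow-ons (serially)."""
    order.append(job.name)
    for child in job.children:
        run(child, order)
    for follow_on in job.follow_ons:
        run(follow_on, order)
    return order


root = ToyJob("down")
left = root.add_child(ToyJob("down/0"))
right = root.add_child(ToyJob("down/1"))
root.add_follow_on(ToyJob("up"))
left.add_child(ToyJob("down/0/0"))
left.add_follow_on(ToyJob("down/0/up"))

order = run(root, [])
print(order)
```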

       Looking at up

          def up(job, inputFileID1, inputFileID2, path, options, memory=sortMemory):
              """
              Merges the two files and places them in the output.
              """

              RealtimeLogger.info("Up job starting: %s" % path)

              with job.fileStore.writeGlobalFileStream() as (fileHandle, outputFileStoreID):
                  fileHandle = codecs.getwriter('utf-8')(fileHandle)
                  with job.fileStore.readGlobalFileStream(inputFileID1) as inputFileHandle1:
                      inputFileHandle1 = codecs.getreader('utf-8')(inputFileHandle1)
                      with job.fileStore.readGlobalFileStream(inputFileID2) as inputFileHandle2:
                          inputFileHandle2 = codecs.getreader('utf-8')(inputFileHandle2)
                          RealtimeLogger.info("Merging %s and %s to %s"
                              % (inputFileID1, inputFileID2, outputFileStoreID))
                          merge(inputFileHandle1, inputFileHandle2, fileHandle)
                  # Clean up the input files - these deletes will occur after the completion is successful.
                  job.fileStore.deleteGlobalFile(inputFileID1)
                  job.fileStore.deleteGlobalFile(inputFileID2)

                  RealtimeLogger.info("Up job finished: %s" % path)

                  return outputFileStoreID

       we  see  that  the  two  input  files  are  merged together and the output is written to a new file using
       job.fileStore.writeGlobalFileStream(). After a little cleanup, the output file is returned.
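
       The merge helper itself is not shown in this section. As a rough sketch of what merging two sorted line
       streams involves, the function below reads one line at a time from each input and always writes the
       smaller one. merge_sorted is a hypothetical name, and the helper in sort.py may be written differently.

```python
import io


def merge_sorted(handle1, handle2, out_handle):
    """Write the lines of two already-sorted text streams to out_handle in order."""
    line1 = handle1.readline()
    line2 = handle2.readline()
    while line1 and line2:
        if line1 <= line2:
            out_handle.write(line1)
            line1 = handle1.readline()
        else:
            out_handle.write(line2)
            line2 = handle2.readline()
    # One input is exhausted; copy whatever remains of the other.
    while line1:
        out_handle.write(line1)
        line1 = handle1.readline()
    while line2:
        out_handle.write(line2)
        line2 = handle2.readline()


first = io.StringIO("apple\ncherry\n")
second = io.StringIO("banana\ndate\n")
merged = io.StringIO()
merge_sorted(first, second, merged)
print(merged.getvalue(), end="")
```

       Because only one line from each input is held in memory at a time, the same loop works for inputs that
       are much larger than the available memory.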

       Once the final up finishes and all of the rv() promises are fulfilled, main receives the sorted file's ID
       which it uses in exportFile to send it to the user.

       There are other things in this example that we didn't go over, such as checkpoints and many details of
       the API.

       At the end of the script the lines

          if __name__ == '__main__':
              main()

       are included to ensure that the main function is only run once in the '__main__' process invoked by  you,
       the  user.   In  Toil  terms,  by  invoking the script you created the leader process in which the main()
       function is run. A worker process is a separate process whose sole purpose is to host  the  execution  of
       one  or  more  jobs  defined in that script. In any Toil workflow there is always one leader process, and
       potentially many worker processes.

       When using the single-machine batch system (the default), the worker processes will  be  running  on  the
       same  machine as the leader process. With full-fledged batch systems like Mesos the worker processes will
       typically be started on separate machines. The boilerplate ensures that  the  pipeline  is  only  started
       once---on  the  leader---but  not  when  its  job  functions  are imported and executed on the individual
       workers.

       Typing python sort.py --help will show the complete list of arguments for  the  workflow  which  includes
       both  Toil's  and ones defined inside sort.py. A complete explanation of Toil's arguments can be found in
       commandRef.

   Logging
       By default, Toil logs a lot of information related to the current environment  in  addition  to  messages
       from the batch system and jobs. This can be configured with the --logLevel flag. For example, to only log
       CRITICAL level messages to the screen:

          $ python sort.py file:jobStore --logLevel=critical --overwriteOutput=True

       This hides most of the information we get from the Toil run. For more detail, we  can  run  the  pipeline
       with --logLevel=debug to see a comprehensive output. For more information, see workflowOptions.
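
       The effect of a log level can be tried with Python's standard logging module. This sketch assumes that
       Toil's level names behave like the standard ones, as their names suggest; the logger name and the
       ListHandler class are made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collect log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("sort-example-sketch")
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

logger.setLevel(logging.CRITICAL)
logger.info("Down job starting")    # below CRITICAL, so it is dropped
logger.critical("Splitting file")   # kept

logger.setLevel(logging.DEBUG)
logger.info("Down job starting")    # kept now that the level is lower

print(handler.messages)
```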

   Error Handling and Resuming Pipelines
       With  Toil,  you  can  recover  gracefully  from  a bug in your pipeline without losing any progress from
       successfully completed jobs. To demonstrate this, let's add a bug to our example code  to  see  how  Toil
       handles a failure and how we can resume a pipeline after that happens. Add a bad assertion as the first
       line of down():

          def down(job, inputFileStoreID, N, path, downCheckpoints, options, memory=sortMemory):
              ...
              assert 1 == 2, "Test error!"

       When we run the pipeline, Toil will show a detailed failure log with a traceback:

          $ python sort.py file:jobStore
          ...
          ---TOIL WORKER OUTPUT LOG---
          ...
          m/j/jobonrSMP    Traceback (most recent call last):
          m/j/jobonrSMP      File "toil/src/toil/worker.py", line 340, in main
          m/j/jobonrSMP        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
          m/j/jobonrSMP      File "toil/src/toil/job.py", line 1270, in _runner
          m/j/jobonrSMP        returnValues = self._run(jobGraph, fileStore)
          m/j/jobonrSMP      File "toil/src/toil/job.py", line 1217, in _run
          m/j/jobonrSMP        return self.run(fileStore)
          m/j/jobonrSMP      File "toil/src/toil/job.py", line 1383, in run
          m/j/jobonrSMP        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
          m/j/jobonrSMP      File "toil/example.py", line 30, in down
          m/j/jobonrSMP        assert 1 == 2, "Test error!"
          m/j/jobonrSMP    AssertionError: Test error!

       If we try to run the pipeline again, Toil will give us an error message saying that a job store of the
       same  name  already exists. By default, in the event of a failure, the job store is preserved so that the
       workflow can be restarted, starting from the previously failed jobs.  We  can  restart  the  pipeline  by
       running

          $ python sort.py file:jobStore --restart --overwriteOutput=True

       We can also change the number of times Toil will attempt to retry a failed job:

          $ python sort.py file:jobStore --retryCount 2 --restart --overwriteOutput=True

       You'll  now  see Toil attempt to rerun the failed job until it runs out of tries.  --retryCount is useful
       for non-systemic errors, like downloading a file that may experience a  sporadic  interruption,  or  some
       other non-deterministic failure.
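
       The idea behind a retry count can be sketched in a few lines of plain Python. This is not how Toil
       implements --retryCount; run_with_retries and flaky_download are hypothetical names, and the exact way
       Toil counts attempts is not assumed here.

```python
def run_with_retries(task, retry_count):
    """Call task(); if it raises, retry up to retry_count more times."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            if attempts > retry_count:
                # Out of tries: let the last error propagate.
                raise


calls = {"n": 0}


def flaky_download():
    """Fail twice with a sporadic error, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("connection dropped")
    return "payload"


result, attempts = run_with_retries(flaky_download, retry_count=2)
print(result, attempts)
```

       A deterministic bug such as the bad assertion fails on every attempt, so retrying only helps with
       failures that can go away on their own.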

       To successfully restart our pipeline, we can edit our script to comment out the bad assertion, or remove
       it, and then run

          $ python sort.py file:jobStore --restart --overwriteOutput=True

       The pipeline will run successfully, and the job store will be removed on the pipeline's completion.

   Collecting Statistics
       Please see the cli_status section for more on gathering runtime and resource info on jobs.

   Launching a Toil Workflow in AWS
       After installing the aws extra for Toil (see installation-ref) and setting up AWS (see
       prepareAWS), the user can run the basic helloWorld.py script (Running a basic workflow) on a VM in AWS
       just by modifying the run command.

       Note that when running in AWS, users can either run the workflow on a single instance  or  run  it  on  a
       cluster (which is running across multiple containers on multiple AWS instances).  For more information on
       running Toil workflows on a cluster, see runningAWS.

       Also!  Remember to use the destroyCluster command when finished to destroy the cluster!  Otherwise things
       may not be cleaned up properly.

       1. Launch a cluster in AWS using the launchCluster command:

             $ toil launch-cluster <cluster-name> --keyPairName <AWS-key-pair-name> --leaderNodeType t2.medium --zone us-west-2a

          The arguments keyPairName, leaderNodeType, and zone are required to launch a cluster.

       2. Copy helloWorld.py to the /tmp directory on the leader node using the rsyncCluster command:

             $ toil rsync-cluster --zone us-west-2a <cluster-name> helloWorld.py :/tmp

          Note that the command requires defining the file to copy as well as the target location on the cluster
          leader node.

       3. Log in to the cluster leader node using the sshCluster command:

             $ toil ssh-cluster --zone us-west-2a <cluster-name>

          Note that this command will log you in as the root user.

       4. Run the Toil script in the cluster:

             $ python /tmp/helloWorld.py aws:us-west-2:my-S3-bucket

          In this particular case, we create an S3 bucket called my-S3-bucket in the us-west-2 region
          to store intermediate job results.

          Along  with some other INFO log messages, you should get the following output in your terminal window:
          Hello, world!, here's a message: You did it!.

       5. Exit from the SSH connection:

             $ exit

       6. Use the destroyCluster command to destroy the cluster:

             $ toil destroy-cluster --zone us-west-2a <cluster-name>

          Note that this command will destroy the cluster leader node and any resources created to run the  job,
          including the S3 bucket.
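
       The job store arguments used so far (file:my-job-store and aws:us-west-2:my-S3-bucket) share a simple
       colon-separated shape. The sketch below splits such strings into their parts. parse_locator is a
       hypothetical helper that only mirrors the two shapes seen in this section; Toil's own parsing may accept
       more forms and is not reproduced here.

```python
def parse_locator(locator):
    """Split a job store locator string into its parts (illustrative only)."""
    kind, _, rest = locator.partition(":")
    if not rest:
        raise ValueError("expected '<kind>:<details>', got %r" % locator)
    if kind == "aws":
        # Shape seen above: aws:<region>:<name>
        region, _, name = rest.partition(":")
        if not name:
            raise ValueError("expected 'aws:<region>:<name>', got %r" % locator)
        return {"kind": kind, "region": region, "name": name}
    # Shape seen above: file:<path>
    return {"kind": kind, "path": rest}


print(parse_locator("file:my-job-store"))
print(parse_locator("aws:us-west-2:my-S3-bucket"))
```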

   Running a CWL Workflow on AWS
       After installing the aws and cwl extras for Toil (see installation-ref) and setting up AWS (see
       prepareAWS), the user can run a CWL workflow with Toil on AWS.

       Also!  Remember to use the destroyCluster command when finished to destroy the cluster!  Otherwise things
       may not be cleaned up properly.

       1. First launch a node in AWS using the launchCluster command:

             $ toil launch-cluster <cluster-name> --keyPairName <AWS-key-pair-name> --leaderNodeType t2.medium --zone us-west-2a

       2. Copy example.cwl and example-job.yaml from the CWL example to the node using the rsyncCluster command:

             $ toil rsync-cluster --zone us-west-2a <cluster-name> example.cwl :/tmp
             $ toil rsync-cluster --zone us-west-2a <cluster-name> example-job.yaml :/tmp

       3. SSH into the cluster's leader node using the sshCluster utility:

             $ toil ssh-cluster --zone us-west-2a <cluster-name>

       4. Once on the leader node, it's a good idea to update and install the following:

             sudo apt-get update
             sudo apt-get -y upgrade
             sudo apt-get -y dist-upgrade
             sudo apt-get -y install git
             sudo pip install mesos.cli

       5. Now create a new virtualenv with the --system-site-packages option and activate:

             virtualenv --system-site-packages venv
             source venv/bin/activate

       6. Now run the CWL workflow:

             $ toil-cwl-runner --provisioner aws --jobStore aws:us-west-2:any-name /tmp/example.cwl /tmp/example-job.yaml

          TIP:
             When  running a CWL workflow on AWS, input files can be provided either on the local file system or
             in S3 buckets using s3:// URI references. Final output files will  be  copied  to  the  local  file
             system of the leader node.

       7. Finally, log out of the leader node and, from your local computer, destroy the cluster:

             $ toil destroy-cluster --zone us-west-2a <cluster-name>

   Running a Workflow with Autoscaling - Cactus
       Cactus  is  a reference-free, whole-genome multiple alignment program that can be run on any of the cloud
       platforms Toil supports.

       NOTE:
          Cloud Independence:

          This example provides a "cloud agnostic" view of running Cactus  with  Toil.  Most  options  will  not
          change  between  cloud  providers.  However, each provisioner has unique inputs for  --leaderNodeType,
          --nodeType and --zone.  We recommend the following:

                               ┌─────────────────┬────────────────┬────────────┬───────────────┐
                               │Option           │ Used in        │ AWS        │ Google        │
                               ├─────────────────┼────────────────┼────────────┼───────────────┤
                               │--leaderNodeType │ launch-cluster │ t2.medium  │ n1-standard-1 │
                               ├─────────────────┼────────────────┼────────────┼───────────────┤
                               │--zone           │ launch-cluster │ us-west-2a │ us-west1-a    │
                               ├─────────────────┼────────────────┼────────────┼───────────────┤
                               │--zone           │ cactus         │ us-west-2  │               │
                               ├─────────────────┼────────────────┼────────────┼───────────────┤
                               │--nodeType       │ cactus         │ c3.4xlarge │ n1-standard-8 │
                               └─────────────────┴────────────────┴────────────┴───────────────┘

          When executing toil launch-cluster with gce specified for --provisioner, the  option  --boto  must  be
          specified  and  given  a path to your .boto file. See runningGCE for more information about the --boto
          option.

       Also!  Remember to use the destroyCluster command when finished to destroy the cluster!  Otherwise things
       may not be cleaned up properly.

       1.  Download pestis.tar.gz

       2.  Launch a leader node using the launchCluster command:

              $ toil launch-cluster <cluster-name> --provisioner <aws, gce> --keyPairName <key-pair-name> --leaderNodeType <type> --zone <zone>

           NOTE:
              A Helpful Tip

              When using AWS, setting the environment variable TOIL_AWS_ZONE eliminates having to specify the
              --zone option for each command. This will be supported for GCE in the future.

                  $ export TOIL_AWS_ZONE=us-west-2c

       3.  Create an appropriate directory for uploading files:

              $ toil ssh-cluster --provisioner <aws, gce> <cluster-name>
              $ mkdir /root/cact_ex
              $ exit

       4.  Copy the required files, i.e., seqFile.txt (a  text  file  containing  the  locations  of  the  input
           sequences  as  well  as their phylogenetic tree, see here), organisms' genome sequence files in FASTA
           format, and configuration files (e.g. blockTrim1.xml, if desired), up to the leader node:

              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> pestis-short-aws-seqFile.txt :/root/cact_ex
              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> GCF_000169655.1_ASM16965v1_genomic.fna :/root/cact_ex
              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> GCF_000006645.1_ASM664v1_genomic.fna :/root/cact_ex
              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> GCF_000182485.1_ASM18248v1_genomic.fna :/root/cact_ex
              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> GCF_000013805.1_ASM1380v1_genomic.fna :/root/cact_ex
              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> setup_leaderNode.sh :/root/cact_ex
              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> blockTrim1.xml :/root/cact_ex
              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name> blockTrim3.xml :/root/cact_ex

       5.  Log in to the leader node:

              $ toil ssh-cluster --provisioner <aws, gce> <cluster-name>

       6.  Set up the environment of the leader node to run Cactus:

              $ bash /root/cact_ex/setup_leaderNode.sh
              $ source cact_venv/bin/activate
              (cact_venv) $ cd cactus
              (cact_venv) $ pip install --upgrade .

       7.  Run Cactus as an autoscaling workflow:

              (cact_venv) $ TOIL_APPLIANCE_SELF=quay.io/ucsc_cgl/toil:3.14.0 cactus --provisioner <aws, gce> --nodeType <type> --maxNodes 2 --minNodes 0 --retry 10 --batchSystem mesos --disableCaching --logDebug --logFile /logFile_pestis3 --configFile /root/cact_ex/blockTrim3.xml <aws, google>:<zone>:cactus-pestis /root/cact_ex/pestis-short-aws-seqFile.txt /root/cact_ex/pestis_output3.hal

           NOTE:
              Pieces of the Puzzle:

              TOIL_APPLIANCE_SELF=quay.io/ucsc_cgl/toil:3.14.0 --- specifies the version of Toil being used,
              3.14.0; if the latest one is desired, omit this variable.

              --nodeType  ---  determines  the  instance type used for worker nodes. The instance type specified
              here must be on the same cloud provider as the one specified with --leaderNodeType

              --maxNodes 2 --- creates up to two instances of the type specified with  --nodeType  and  launches
              Mesos worker containers inside them.

              --logDebug --- equivalent to --logLevel DEBUG.

              --logFile /logFile_pestis3 --- writes logs to a file named logFile_pestis3 in the / directory.

              --configFile --- optional; needed only if a specific configuration file is intended for the
              alignment.

              <aws, google>:<zone>:cactus-pestis --- creates a bucket, named cactus-pestis, with  the  specified
              cloud provider to store intermediate job files and metadata.  NOTE: If you want to use a GCE-based
              jobstore, specify google here, not gce.

              The result file, named pestis_output3.hal, is stored in the /root/cact_ex folder on the leader
              node.

              Use cactus --help to see all the Cactus and Toil flags available.

       8.  Log out of the leader node:

              (cact_venv) $ exit

       9.  Download the resulting output to the local machine:

              $ toil rsync-cluster --provisioner <aws, gce> <cluster-name>  :/root/cact_ex/pestis_output3.hal <path-of-folder-on-local-machine>

       10. Destroy the cluster:

              $ toil destroy-cluster --provisioner <aws, gce> <cluster-name>
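
       The rsync-cluster calls in step 4 differ only in the file name, so they can be generated rather than
       typed one by one. The following is a minimal Python sketch, not part of Toil: the helper name
       rsync_commands and the cluster name my-cluster are hypothetical, the file names are the ones listed in
       step 4, and the commands are only printed, not executed.

```python
import shlex

# File names taken from step 4 above.
FILES = [
    "pestis-short-aws-seqFile.txt",
    "GCF_000169655.1_ASM16965v1_genomic.fna",
    "GCF_000006645.1_ASM664v1_genomic.fna",
    "GCF_000182485.1_ASM18248v1_genomic.fna",
    "GCF_000013805.1_ASM1380v1_genomic.fna",
    "setup_leaderNode.sh",
    "blockTrim1.xml",
    "blockTrim3.xml",
]


def rsync_commands(provisioner, cluster, files, dest=":/root/cact_ex"):
    """Return one 'toil rsync-cluster' command line per file (hypothetical helper)."""
    commands = []
    for name in files:
        argv = ["toil", "rsync-cluster", "--provisioner", provisioner,
                cluster, name, dest]
        # Quote each argument so the line can be pasted into a shell as-is.
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


if __name__ == "__main__":
    # "my-cluster" is a placeholder for <cluster-name>.
    for line in rsync_commands("aws", "my-cluster", FILES):
        print(line)
```

       Printing the commands first makes it easy to review them before pasting them into a shell.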

INTRODUCTION

       Toil runs in various environments, including locally and in the cloud (Amazon Web Services and Google
       Compute Engine).  Toil also supports two DSLs: CWL and WDL (experimental).

       Toil  is  built  in a modular way so that it can be used on lots of different systems, and with different
       configurations.  The three configurable pieces are the

          • jobStoreInterface: A filepath or url that can host and centralize all files for a workflow  (e.g.  a
            local folder, or an AWS s3 bucket url).

          • batchSystemInterface:  Specifies  either  a  local  single-machine  or  a  currently  supported  HPC
            environment (lsf, parasol, mesos, slurm, torque, htcondor, or gridengine).  Mesos is a special case,
            and is launched for cloud environments.

          • Provisioner:  For running in the cloud only.  This specifies which cloud provider provides instances
            to do the "work" of your workflow.

   Job Store
       The job store is a storage abstraction which contains all of the information used in  a  Toil  run.  This
       centralizes  all  of  the  files used by jobs in the workflow and also the details of the progress of the
       run. If a workflow crashes or fails, the job store contains all of the information  necessary  to  resume
       with minimal repetition of work.

       Several different job stores are supported, including the file job store and cloud job stores.

   File Job Store
       The  file  job store is for use locally, and keeps the workflow information in a directory on the machine
       where the workflow is launched.  This is the simplest and most convenient job store for  testing  or  for
       small runs.

       For an example that uses the file job store, see quickstart.

   Cloud Job Stores
       Toil currently supports the following cloud storage systems as job stores:

          • awsJobStore: An AWS S3 bucket formatted as "aws:<region>:<bucketname>" where only numbers, letters,
            and dashes are allowed in the bucket name.  Example: aws:us-west-2:my-aws-jobstore-name.

          • googleJobStore: A Google Cloud Storage bucket formatted as "google:<projectID>:<bucketname>" where
            only numbers, letters, and dashes are allowed in the bucket name.  Example:
            google:projectID-here:my-google-jobstore-name.

       These use cloud buckets to house all of the files. This is useful if there are several  different  worker
       machines all running jobs that need to access the job store.
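
       The naming rule for cloud job stores (only numbers, letters, and dashes in the bucket name) can be
       checked before a workflow is launched. The sketch below is illustrative only: cloud_locator is a
       hypothetical helper, not a Toil function, and it assumes the provider prefix is either aws or google.

```python
import re

# Only numbers, letters, and dashes are allowed in the bucket name.
_BUCKET_NAME = re.compile(r"[A-Za-z0-9-]+")


def cloud_locator(provider, location, bucket):
    """Build '<provider>:<location>:<bucket>' after checking the bucket name.

    Hypothetical helper; 'location' is the region or project ID part.
    """
    if provider not in ("aws", "google"):
        raise ValueError("unsupported provider: %r" % (provider,))
    if not _BUCKET_NAME.fullmatch(bucket):
        raise ValueError(
            "bucket name may contain only letters, numbers, and dashes: %r"
            % (bucket,))
    return "%s:%s:%s" % (provider, location, bucket)


if __name__ == "__main__":
    print(cloud_locator("aws", "us-west-2", "my-aws-jobstore-name"))
```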

   Batch System
       A  Toil batch system is either a local single-machine (one computer) or a currently supported HPC cluster
       of computers (lsf, parasol, mesos, slurm, torque, htcondor, or gridengine).  Mesos is a special case, and
       is  launched  for  cloud  environments.  These environments manage individual worker nodes under a leader
       node to process the work required in a workflow.  The leader and its workers all coordinate  their  tasks
       and files through a centralized job store location.

       See batchSystemInterface for a more detailed description of different batch systems.

   Provisioner
       The Toil provisioner provides a tool set for running a Toil workflow on a particular cloud platform.

       The clusterRef are command line tools used to provision nodes in your desired cloud platform.  They
       allow you to launch nodes, ssh to the leader, and rsync files back and forth.

       For detailed instructions for using the provisioner see runningAWS or runningGCE.

COMMANDLINE OPTIONS

       A quick way to see all of Toil's commandline options is by executing the following on a toil script:

          $ python example.py --help

       For a basic toil workflow, Toil has one mandatory argument, the  job  store.   All  other  arguments  are
       optional.

   The Job Store
       Running  toil  scripts  requires a filepath or url to a centralizing location for all of the files of the
       workflow.  This is Toil's one required positional argument: the job store.  Using the quickstart
       example, if you're on a node that has a large /scratch volume, you can specify that the job store be
       created there by executing python helloWorld.py /scratch/my-job-store, or more explicitly, python
       helloWorld.py file:/scratch/my-job-store.

       Syntax for specifying different job stores:
          Local: file:job-store-name

          AWS: aws:region-here:job-store-name

          Google: google:projectID-here:job-store-name

       Different types of job store options can be found below.
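
       To make the three forms concrete, the sketch below splits a job store string into its parts. It is a
       simplified illustration, not Toil's own parser: parse_locator is a hypothetical name, and treating any
       string without a known prefix as a local path is an assumption based on the two equivalent helloWorld.py
       invocations above.

```python
def parse_locator(locator):
    """Split a job store string into (kind, location, name).

    Hypothetical helper. 'location' is the region (AWS) or project ID
    (Google) and is None for a file job store.
    """
    if locator.startswith("aws:") or locator.startswith("google:"):
        # Raises ValueError if one of the three parts is missing.
        kind, location, name = locator.split(":", 2)
        return kind, location, name
    if locator.startswith("file:"):
        return "file", None, locator[len("file:"):]
    # A bare path is treated as a file job store.
    return "file", None, locator


if __name__ == "__main__":
    for text in ("/scratch/my-job-store",
                 "file:/scratch/my-job-store",
                 "aws:us-west-2:my-aws-jobstore-name"):
        print(text, "->", parse_locator(text))
```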

   Commandline Options
       Core Toil Options

          --workDir WORKDIR
                 Absolute  path  to  directory  where  temporary  files  generated during the Toil run should be
                 placed. Temp files and folders, as well as standard output and error  from  batch  system  jobs
                 (unless  --noStdOutErr),  will  be placed in a directory toil-<workflowID> within workDir.  The
                 workflowID is generated by Toil  and  will  be  reported  in  the  workflow  logs.  Default  is
                 determined  by  the variables (TMPDIR, TEMP, TMP) via mkdtemp. This directory needs to exist on
                 all machines running jobs; if capturing standard output and error from  batch  system  jobs  is
                 desired, it will generally need to be on a shared file system.

          --noStdOutErr
                 Do not capture standard output and error from batch system jobs.

          --stats
                 Records statistics about the toil workflow to be used by 'toil stats'.

          --clean=STATE
                 Determines the deletion of the jobStore upon completion of the program. Choices: 'always',
                 'onError', 'never', or 'onSuccess'. The --stats option requires information from the jobStore
                 upon completion, so the jobStore will never be deleted with that flag.  If you wish to be able
                 to restart the run, choose 'never' or 'onSuccess'. Default is 'never' if stats is enabled, and
                 'onSuccess' otherwise.

          --cleanWorkDir STATE
                 Determines  deletion of temporary worker directory upon completion of a job. Choices: 'always',
                 'never', 'onSuccess'. Default = always. WARNING: This option should be  changed  for  debugging
                 only. Running a full pipeline with this option could fill your disk with intermediate data.

          --clusterStats FILEPATH
                 If  enabled, writes out JSON resource usage statistics to a file. The default location for this
                 file is the current working directory, but an absolute path can also be passed to specify where
                 this file should be written. This option only applies when using scalable batch systems.

          --restart
                 If --restart is specified, Toil will attempt to restart the existing workflow at the location
                 pointed to by the --jobStore option. Will raise an exception if the workflow does not exist.
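
       The four --clean states and the stats-dependent default described above can be summarised as a small
       decision function. This is a sketch of the documented behaviour, not Toil's implementation; both
       function names are hypothetical.

```python
def default_clean(stats_enabled):
    """Default --clean state: 'never' with --stats, 'onSuccess' otherwise."""
    return "never" if stats_enabled else "onSuccess"


def should_delete_job_store(clean, succeeded):
    """Return True if the job store would be deleted when the run ends."""
    if clean == "always":
        return True
    if clean == "never":
        return False
    if clean == "onSuccess":
        return succeeded
    if clean == "onError":
        return not succeeded
    raise ValueError("unknown clean state: %r" % (clean,))


if __name__ == "__main__":
    for state in ("always", "onError", "never", "onSuccess"):
        print(state,
              "success:", should_delete_job_store(state, True),
              "failure:", should_delete_job_store(state, False))
```

       A failed run with 'never' or 'onSuccess' keeps the job store, which is what makes --restart possible.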

       Logging Options

       Toil hides stdout and stderr by default except in case of job failure.  Log levels in toil are  based  on
       priority from the logging module:

          --logOff
                 Only CRITICAL log levels are shown.  Equivalent to --logLevel=OFF or --logLevel=CRITICAL.

          --logCritical
                 Only CRITICAL log levels are shown.  Equivalent to --logLevel=OFF or --logLevel=CRITICAL.

          --logError
                 Only ERROR and CRITICAL log levels are shown.  Equivalent to --logLevel=ERROR.

          --logWarning
                 Only WARN, ERROR, and CRITICAL log levels are shown.  Equivalent to --logLevel=WARNING.

          --logInfo
                 All log statements are shown, except DEBUG.  Equivalent to --logLevel=INFO.

          --logDebug
                 All log statements are shown.  Equivalent to --logLevel=DEBUG.

          --logLevel=LOGLEVEL
                 May be set to: OFF (or CRITICAL), ERROR, WARN (or WARNING), INFO, or DEBUG.

          --logFile FILEPATH
                 Specifies a file path to write the logging output to.

          --rotatingLogging
                 Turn  on  rotating  logging,  which  prevents  log  files  from  getting  too  big  (set  using
                 --maxLogFileSize BYTESIZE).

          --maxLogFileSize BYTESIZE
                 Sets the maximum log file size in bytes (--rotatingLogging must be active).
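
       Because the levels are ordered by priority, each setting shows its own level and everything more
       severe. The sketch below illustrates this with Python's standard logging module. The LEVELS table and
       the shown_levels helper are illustrative names, not part of Toil, and mapping OFF to CRITICAL follows
       the equivalence stated above.

```python
import logging

# Names accepted by --logLevel, mapped onto the standard logging module.
# OFF is treated as CRITICAL, as described above.
LEVELS = {
    "OFF": logging.CRITICAL,
    "CRITICAL": logging.CRITICAL,
    "ERROR": logging.ERROR,
    "WARN": logging.WARNING,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def shown_levels(log_level):
    """Return the names of the standard levels displayed at the given setting."""
    threshold = LEVELS[log_level.upper()]
    standard = [logging.DEBUG, logging.INFO, logging.WARNING,
                logging.ERROR, logging.CRITICAL]
    # A message is shown when its priority is at or above the threshold.
    return [logging.getLevelName(level) for level in standard
            if level >= threshold]


if __name__ == "__main__":
    for name in ("OFF", "ERROR", "WARN", "INFO", "DEBUG"):
        print(name, shown_levels(name))
```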

       Batch System Options

          --batchSystem BATCHSYSTEM
                 The type of batch system to run the job(s) with, currently can be one of LSF, Mesos, Slurm,
                 Torque, HTCondor, singleMachine, parasol, gridEngine.  (default: singleMachine)

          --parasolCommand PARASOLCOMMAND
                 The  name  or  path  of  the parasol program. Will be looked up on PATH unless it starts with a
                 slash. (default: parasol)

          --parasolMaxBatches PARASOLMAXBATCHES
                 Maximum number of job batches the Parasol batch is allowed to create. One batch is created  for
                 jobs with a unique set of resource requirements. (default: 1000)

          --scale SCALE
                 A scaling factor applied to the number of cores requested by every submitted task. Used in
                 singleMachine batch system. (default: 1)

          --linkImports
                 When using Toil's importFile function for staging, input files are copied  to  the  job  store.
                 Specifying  this  option  saves  space  by  sym-linking  imported files.  As long as caching is
                 enabled Toil will protect the file automatically by changing the permissions to read-only.

          --mesosMaster MESOSMASTERADDRESS
                 The host and port of the Mesos master separated by a colon. (default: 169.233.147.202:5050)

       Autoscaling Options

          --provisioner CLOUDPROVIDER
                 The provisioner for cluster auto-scaling. The currently supported choices are 'aws'  or  'gce'.
                 The default is None.

          --nodeTypes NODETYPES
                 List  of  node  types  separated  by  commas.  The  syntax  for  each  node type depends on the
                 provisioner used. For the cgcloud and AWS provisioners this is the  name  of  an  EC2  instance
                 type,  optionally  followed  by  a colon and the price in dollars to bid for a spot instance of
                 that type, for example 'c3.8xlarge:0.42'. If no spot bid is specified, nodes of this type  will
                 be  non-preemptable.   It  is  acceptable  to  specify  an  instance  as  both  preemptable and
                 non-preemptable, including it twice in the list. In that case, preemptable nodes of  that  type
                 will be preferred when creating new nodes once the maximum number of preemptable nodes has
                 been reached.

          --nodeOptions NODEOPTIONS
                 Options for provisioning the nodes. The syntax depends on the  provisioner  used.  Neither  the
                 CGCloud nor the AWS provisioner support any node options.

          --minNodes MINNODES
                 Minimum  number  of  nodes  of  each type in the cluster, if using auto-scaling. This should be
                 provided as a comma-separated list of the same length as the list of node types. default=0

          --maxNodes MAXNODES
                 Maximum number of nodes of each type in the  cluster,  if  using  autoscaling,  provided  as  a
                 comma-separated  list. The first value is used as a default if the list length is less than the
                 number of nodeTypes.  default=10

          --preemptableCompensation PREEMPTABLECOMPENSATION
                 The preference of the autoscaler to replace preemptable nodes with non-preemptable nodes,  when
                 preemptable  nodes  cannot  be  started  for  some reason.  Defaults to 0.0. This value must be
                 between 0.0 and 1.0, inclusive. A value of 0.0 disables  such  compensation,  a  value  of  0.5
                 compensates  two  missing preemptable nodes with a non-preemptable one. A value of 1.0 replaces
                 every missing preemptable node with a non-preemptable one.

          --nodeStorage NODESTORAGE
                 Specify the size, in gigabytes, of the root volume of worker nodes when they are launched. You
                 may want to set this if your jobs require a lot of disk space. The default value is 50.

          --metrics
                 Enable  the  prometheus/grafana  dashboard for monitoring CPU/RAM usage, queue size, and issued
                 jobs.

          --defaultMemory INT
                 The default amount of memory to request for a job.  Only applicable to jobs that do not specify
                 an  explicit  value  for  this  requirement.  Standard  suffixes like K, Ki, M, Mi, G or Gi are
                 supported. Default is 2.0G

          --defaultCores FLOAT
                 The default number of CPU cores to dedicate to a job.  Only applicable to jobs that do not specify
                 an  explicit value for this requirement. Fractions of a core (for example 0.1) are supported on
                 some batch systems, namely Mesos and singleMachine. Default is 1.0

          --defaultDisk INT
                 The default amount of disk space to dedicate to a job.  Only applicable to jobs that do not
                 specify  an  explicit  value for this requirement. Standard suffixes like K, Ki, M, Mi, G or Gi
                 are supported. Default is 2.0G

          --maxCores INT
                 The maximum number of CPU cores to request from the batch system  at  any  one  time.  Standard
                 suffixes like K, Ki, M, Mi, G or Gi are supported.

          --maxMemory INT
                 The  maximum  amount  of  memory  to  request  from  the batch system at any one time. Standard
                 suffixes like K, Ki, M, Mi, G or Gi are supported.

          --maxDisk INT
                 The maximum amount of disk space to request from the batch system at  any  one  time.  Standard
                 suffixes like K, Ki, M, Mi, G or Gi are supported.

          --retryCount RETRYCOUNT
                 Number of times to retry a failing job before giving up and labeling job failed. default=1

          --maxJobDuration MAXJOBDURATION
                 Maximum  runtime of a job (in seconds) before we kill it (this is a lower bound, and the actual
                 time before killing the job may be longer).

          --rescueJobsFrequency RESCUEJOBSFREQUENCY
                 Period of time to wait (in seconds) between checking for missing/overlong jobs,  that  is  jobs
                 which get lost by the batch system.

          --maxServiceJobs MAXSERVICEJOBS
                 The maximum number of service jobs that can be run concurrently, excluding service jobs running
                 on preemptable nodes. default=9223372036854775807

          --maxPreemptableServiceJobs MAXPREEMPTABLESERVICEJOBS
                 The  maximum  number  of  service  jobs  that  can  run  concurrently  on  preemptable   nodes.
                 default=9223372036854775807

          --deadlockWait DEADLOCKWAIT
                 The  minimum  number of seconds to observe the cluster stuck running only the same service jobs
                 before throwing a deadlock exception. default=60

          --statePollingWait STATEPOLLINGWAIT
                 Time, in seconds, to wait before doing a scheduler query for job state. Return  cached  results
                 if within the waiting period.
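
       Three of the autoscaling options above carry small pieces of arithmetic: the optional spot bid in
       --nodeTypes, the fallback to the first value in --maxNodes, and the ratio used by
       --preemptableCompensation. The following Python sketch illustrates them under stated assumptions. All
       three function names are hypothetical, padding a short list with its first value is one reading of the
       --maxNodes description, and rounding down in compensating_nodes is an assumption; Toil's own handling
       may differ.

```python
import math


def parse_node_types(spec):
    """Parse '--nodeTypes' text into (instance_type, spot_bid) pairs.

    spot_bid is None when no bid is given, i.e. the nodes are
    non-preemptable.
    """
    parsed = []
    for item in spec.split(","):
        name, sep, bid = item.strip().partition(":")
        parsed.append((name, float(bid) if sep else None))
    return parsed


def per_type_limits(values, node_type_count):
    """Expand a '--maxNodes' style list to one entry per node type.

    Assumption: missing entries are filled with the first value.
    """
    limits = [int(v) for v in values.split(",")]
    if len(limits) < node_type_count:
        limits += [limits[0]] * (node_type_count - len(limits))
    return limits


def compensating_nodes(missing_preemptable, compensation):
    """Non-preemptable nodes started in place of missing preemptable ones.

    Rounding down is an assumption of this sketch.
    """
    if not 0.0 <= compensation <= 1.0:
        raise ValueError("compensation must be between 0.0 and 1.0")
    return math.floor(missing_preemptable * compensation)


if __name__ == "__main__":
    print(parse_node_types("c3.8xlarge:0.42,c3.8xlarge"))
    print(per_type_limits("10", 2))
    print(compensating_nodes(2, 0.5))
```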

       Miscellaneous Options

          --disableCaching
                 Disables  caching  in the file store. This flag must be set to use a batch system that does not
                 support caching such as Grid Engine, Parasol, LSF, or Slurm.

          --disableChaining
                 Disables chaining of jobs (chaining uses one job's resource allocation for its successor job if
                 possible).

          --maxLogFileSize MAXLOGFILESIZE
                 The  maximum  size  of  a  job  log file to keep (in bytes), log files larger than this will be
                 truncated to the last X bytes. Setting this option to zero will prevent any truncation. Setting
                 this option to a negative value will truncate from the beginning. Default=62.5 K

          --writeLogs FILEPATH
                 Write  worker  logs  received  by  the  leader  into their own files at the specified path. Any
                 non-empty standard output and error from failed batch system jobs will  also  be  written  into
                 files  at  this  path.  The  current  working directory will be used if a path is not specified
                 explicitly. Note: By default only the logs of failed jobs are returned to leader. Set log level
                 to  'debug'  to  get logs back from successful jobs, and adjust 'maxLogFileSize' to control the
                 truncation limit for worker logs.

          --writeLogsGzip FILEPATH
                 Identical to --writeLogs except the log files are gzipped on the leader.

          --realTimeLogging
                 Enable real-time logging from workers to masters.

          --sseKey SSEKEY
                 Path to file containing 32 character key to be used for server-side encryption  on  awsJobStore
                 or googleJobStore. SSE will not be used if this flag is not passed.

          --setEnv NAME
                 NAME=VALUE or NAME, -e NAME=VALUE or NAME are also valid.  Set an environment variable early on
                 in the worker. If VALUE  is  omitted,  it  will  be  looked  up  in  the  current  environment.
                 Independently  of  this  option, the worker will try to emulate the leader's environment before
                 running a job. Using this option, a variable can be injected into  the  worker  process  itself
                 before it is started.

          --servicePollingInterval SERVICEPOLLINGINTERVAL
                 Interval  of  time  service  jobs wait between polling for the existence of the keep-alive flag
                 (default=60)

          --debugWorker
                 Experimental no-forking mode for local debugging.  Specifically, workers are not forked and
                 stderr/stdout are not redirected to the log. (default=False)
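
       The two accepted forms of --setEnv (NAME=VALUE, or NAME alone with the value taken from the current
       environment) can be illustrated as follows. This is a sketch of the documented behaviour rather than
       Toil's code; parse_set_env is a hypothetical name, and raising KeyError for a variable missing from the
       environment is a choice made for this example.

```python
import os


def parse_set_env(spec, environ=None):
    """Resolve a '--setEnv' argument of the form NAME=VALUE or NAME.

    With a bare NAME the value is looked up in the given environment
    (os.environ by default). A missing variable raises KeyError here,
    which is a choice made for this sketch.
    """
    if environ is None:
        environ = os.environ
    # Split on the first '=' only, so values may themselves contain '='.
    name, sep, value = spec.partition("=")
    if not name:
        raise ValueError("empty variable name in %r" % (spec,))
    if not sep:
        value = environ[name]
    return name, value


if __name__ == "__main__":
    print(parse_set_env("FOO=bar"))
    print(parse_set_env("EXAMPLE_VAR", {"EXAMPLE_VAR": "from-environment"}))
```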

   Restart Option
       In  the event of failure, Toil can resume the pipeline by adding the argument --restart and rerunning the
       python script. Toil pipelines can even  be  edited  and  resumed  which  is  useful  for  development  or
       troubleshooting.

   Running Workflows with Services
       Toil supports jobs, or clusters of jobs, that run as services to other accessor jobs. Example services
       include server databases or Apache Spark clusters. Because service jobs exist to provide services to
       accessor jobs, their runtime depends on the concurrent running of their accessor jobs. The dependencies
       between services and their accessor jobs can create potential deadlock scenarios, in which the workflow
       hangs because only service jobs are running and their accessor jobs cannot be scheduled for lack of
       resources to run both simultaneously. Toil attempts to schedule services and accessors intelligently,
       but to avoid a deadlock in workflows running service jobs it is advisable to use the following
       parameters:

       • --maxServiceJobs: The maximum number of service jobs that can be run  concurrently,  excluding  service
         jobs running on preemptable nodes.

       • --maxPreemptableServiceJobs:  The  maximum  number  of  service  jobs  that  can  run  concurrently  on
         preemptable nodes.

       Specifying these parameters so that at a maximum cluster size there will be sufficient resources to run
       accessors in addition to services will ensure that such a deadlock cannot occur.

       If too low a limit is specified, a deadlock can occur in which Toil cannot schedule sufficient
       service jobs concurrently to complete the workflow.  Toil will detect this situation if it occurs and
       throw a toil.DeadlockException exception. Increasing the cluster size and these limits will resolve the
       issue.
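
       One way to choose a value for --maxServiceJobs is to reserve room for accessors first and divide what
       remains among services. The sketch below does this for CPU cores only. The function name and all of its
       parameters are hypothetical inputs that describe your own workflow, and memory and disk would need the
       same treatment.

```python
def max_service_jobs(total_cores, cores_per_service, cores_per_accessor,
                     reserved_accessors=1):
    """Largest service-job count that still leaves room for accessors.

    Hypothetical helper: 'total_cores' is the core count of the cluster
    at its maximum size, and 'reserved_accessors' is how many accessor
    jobs must always be able to run next to the services.
    """
    reserved = reserved_accessors * cores_per_accessor
    available = total_cores - reserved
    if available < cores_per_service:
        return 0
    return int(available // cores_per_service)


if __name__ == "__main__":
    # 16 cores, 4 per service, 2 reserved for one accessor.
    print(max_service_jobs(16, 4, 2))
```

       A result of 0 means the cluster cannot hold even one service alongside the reserved accessors, so the
       cluster size has to grow before the limit can be set usefully.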

   Setting Options directly with the Toil Script
       It's good to remember that commandline options can be overridden in the Toil script itself.  For example,
       toil.job.Job.Runner.getDefaultOptions()  can  be  used  to run toil with all default options, and in this
       example, it will override  commandline  args  to  run  the  default  options  and  always  run  with  the
       "./toilWorkflow" directory specified as the jobstore:

          from toil.common import Toil
          from toil.job import Job

          options = Job.Runner.getDefaultOptions("./toilWorkflow") # Get the options object

          with Toil(options) as toil:
              toil.start(Job())  # Run the script

       However, each option can be explicitly set within the script by assigning to the options object. In this
       example, we set logLevel = "DEBUG" (all log statements are shown) and clean = "always" (always delete the
       jobstore), like so:

          options = Job.Runner.getDefaultOptions("./toilWorkflow") # Get the options object
          options.logLevel = "DEBUG" # Set the log level to the debug level.
          options.clean = "always" # Always delete the jobStore after a run

          with Toil(options) as toil:
              toil.start(Job())  # Run the script

       However, the usual incantation is to accept commandline args from the user with the following:

          parser = Job.Runner.getDefaultArgumentParser() # Get the parser
          options = parser.parse_args() # Parse user args to create the options object

          with Toil(options) as toil:
              toil.start(Job())  # Run the script

       Which can also, of course, then accept script supplied arguments as before (which will overwrite any user
       supplied args):

          parser = Job.Runner.getDefaultArgumentParser() # Get the parser
          options = parser.parse_args() # Parse user args to create the options object
          options.logLevel = "DEBUG" # Set the log level to the debug level.
          options.clean = "always" # Always delete the jobStore after a run

          with Toil(options) as toil:
              toil.start(Job())  # Run the script
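
       The order of operations is what makes the script win over the user: values assigned after parse_args()
       replace whatever was given on the command line. The sketch below shows this with a plain argparse parser
       so that it runs without Toil installed. The parser is a stand-in for
       Job.Runner.getDefaultArgumentParser(), and the INFO default for --logLevel is an assumption made for the
       example.

```python
import argparse


def build_parser():
    """Stand-in parser with a positional job store and two options."""
    parser = argparse.ArgumentParser()
    parser.add_argument("jobStore")
    # The INFO default is an assumption made for this sketch.
    parser.add_argument("--logLevel", default="INFO")
    parser.add_argument("--clean", default="onSuccess",
                        choices=["always", "onError", "never", "onSuccess"])
    return parser


def script_options(argv):
    """Parse user arguments, then apply the script's own settings."""
    options = build_parser().parse_args(argv)
    # Values assigned after parsing replace whatever the user supplied.
    options.logLevel = "DEBUG"
    options.clean = "always"
    return options


if __name__ == "__main__":
    user_args = ["file:my-job-store", "--logLevel", "ERROR", "--clean", "never"]
    print(script_options(user_args))
```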

TOIL DEBUGGING

       Toil has a number of tools to assist in debugging.  Here we provide help  in  working  through  potential
       problems that a user might encounter in attempting to run a workflow.

   Introspecting the Jobstore
       Note:  Currently  these  features  are  only  implemented  for  use  locally  (single  machine)  with the
       fileJobStore.

       To view what files currently reside in the jobstore, run the following command:

          $ toil debug-file file:path-to-jobstore-directory --listFilesInJobStore

       When run from the commandline, this should generate a file containing the contents of the job  store  (in
       addition   to   displaying   a   series   of   log  messages  to  the  terminal).   This  file  is  named
       "jobstore_files.txt" by default and will be generated in the current working directory.

       If one wishes to copy any of these files to a local directory, one can run for example:

          $ toil debug-file file:path-to-jobstore --fetch overview.txt "*.bam" "*.fastq" --localFilePath=/home/user/localpath

       This fetches overview.txt and all .bam and .fastq files.  The patterns are quoted so that the shell
       does not expand them against the current directory.  This can be used to recover previously used input
       and output files for debugging or reuse in other workflows, or in general debugging to ensure that
       certain outputs were imported into the jobStore.
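
       To preview which entries a set of --fetch patterns would select, the listing in jobstore_files.txt can
       be filtered with the same kind of shell-style matching. The sketch below assumes that --fetch patterns
       behave like standard glob patterns, as the example suggests; select_files is a hypothetical helper and
       the file names are made up for illustration.

```python
import fnmatch


def select_files(names, patterns):
    """Return the names matching at least one shell-style pattern, in order."""
    return [name for name in names
            if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)]


if __name__ == "__main__":
    # Made-up file names standing in for the contents of a job store.
    listing = ["overview.txt", "sample1.bam", "reads.fastq", "worker.log"]
    print(select_files(listing, ["overview.txt", "*.bam", "*.fastq"]))
```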

   Stats and Status
       See cli_status for more about gathering statistics about job success, runtime, and  resource  usage  from
       workflows.

   Using a Python debugger
       If  you  execute  a workflow using the --debugWorker flag, Toil will not fork in order to run jobs, which
       means you can either use pdb, or an IDE that supports debugging Python as you would normally.  Note  that
       the  --debugWorker  flag will only work with the singleMachine batch system (the default), and not any of
       the custom job schedulers.

RUNNING IN THE CLOUD

       Toil supports Amazon Web Services (AWS) and Google Compute Engine (GCE) in the cloud and has  autoscaling
       capabilities  that can adapt to the size of your workflow, whether your workflow requires 10 instances or
       20,000.

       Toil does this by creating a virtual cluster with Apache Mesos.  Apache Mesos requires a leader  node  to
       coordinate  the  workflow,  and  worker  nodes  to execute the various tasks within the workflow.  As the
       workflow runs, Toil will "autoscale", creating and terminating workers as needed to meet the  demands  of
       the workflow.

       Once  a user is familiar with the basics of running toil locally (specifying a jobStore, and how to write
       a toil script), they can move on to the guides below to learn how to translate these workflows into cloud
       ready workflows.

   Managing a Cluster of Virtual Machines (Provisioning)
       Toil can use the provisioner to launch and manage a cluster of virtual machines and run a workflow
       distributed over several nodes. The provisioner also has the ability to automatically scale up or down
       the  size  of  the cluster to handle dynamic changes in computational demand (autoscaling).  Currently we
       have working provisioners with AWS and GCE (Azure support has been deprecated).

       Toil uses Apache Mesos as the batch system (see batchSystemOverview).

       See runningAWS for instructions for AWS.

       See runningGCE for instructions for GCE.

   Storage (Toil jobStore)
       Toil can make use of cloud storage such as AWS or Google buckets to take care of storage needs.

       This is useful when running Toil in single machine mode on any cloud platform since it allows you to make
       use of their integrated storage systems.

       For an overview of the job store see jobStoreOverview.

       For instructions configuring a particular job store see:

       • awsJobStore

       • googleJobStore

CLOUD PLATFORMS

   Running in AWS
       Toil  jobs  can  be run on a variety of cloud platforms. Of these, Amazon Web Services (AWS) is currently
       the best-supported solution. Toil provides the clusterRef to conveniently create AWS clusters, connect to
       the  leader of the cluster, and then launch a workflow. The leader handles distributing the jobs over the
       worker nodes and autoscaling to optimize costs.

       The Running a Workflow with Autoscaling section details how to create a cluster and run a  workflow  that
       will dynamically scale depending on the workflow's needs.

       The  Static  Provisioning  section  explains how a static cluster (one that won't automatically change in
       size) can be created and provisioned (grown, shrunk, destroyed, etc.).

   Preparing your AWS environment
       To use Amazon Web Services (AWS) to run Toil or to just use S3 to host the files during  the  computation
       of a workflow, first set up and configure an account with AWS:

       1.  If necessary, create and activate an AWS account

       2.  Subscribe to the Container Linux by CoreOS AMI. AWS requires this of every user, but only once.
           You will encounter errors if this is not done.

       3.  Next, generate a key pair for AWS with the command (do NOT generate your key  pair  with  the  Amazon
           browser):

              $ ssh-keygen -t rsa

       4.  This should prompt you to save your key.  Please save it in

              ~/.ssh/id_rsa

       5.  Now move this to where your OS can see it as an authorized key:

              $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
              $ eval `ssh-agent -s`
              $ ssh-add

       6.  You'll also need to restrict the permissions on your private key (good practice, and also enforced
           by AWS):

              $ chmod 400 ~/.ssh/id_rsa

       7.  Now you'll need to add the key to AWS via the browser. For example, for us-west-1, the key pair
           page would be accessible at:

              https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#KeyPairs:sort=keyName

       8.  Now click on the "Import Key Pair" button to add your key:
              Adding an Amazon Key Pair.

           9.  Next, you need to create an AWS access key. First go to the IAM dashboard; for "us-west-1", the
               example link would be:

                  https://console.aws.amazon.com/iam/home?region=us-west-1#/home

           10. The directions (transcribed from
               https://docs.aws.amazon.com/general/latest/gr/managing-aws-access-keys.html ) are now:

                  1. On the IAM Dashboard page, choose your account name in the navigation bar, and then  choose
                     My Security Credentials.

                  2. Expand the Access keys (access key ID and secret access key) section.

                  3. Choose  Create  New Access Key. Then choose Download Key File to save the access key ID and
                     secret access key to a file on your computer. After you close the  dialog  box,  you  can't
                     retrieve this secret access key again.

           11. Now  you  should  have a newly generated "AWS Access Key ID" and "AWS Secret Access Key".  We can
               now install the AWS CLI and make sure that it has the proper credentials:

                  $ pip install awscli --upgrade --user

           12. Now configure your AWS credentials with:

                  $ aws configure

           13. Add your "AWS Access Key ID" and "AWS Secret Access Key" from earlier and your region and  output
               format:

                  " AWS Access Key ID [****************Q65Q]: "
                  " AWS Secret Access Key [****************G0ys]: "
                  " Default region name [us-west-1]: "
                  " Default output format [json]: "

           14. Toil  also  relies  on boto, and you'll need to create a boto file containing your credentials as
               well.  To do this, run:

                  $ nano ~/.boto

           15. Paste in the following (with your actual "AWS Access Key ID" and "AWS Secret Access Key"):

                  [Credentials]
                  aws_access_key_id = ****************Q65Q
                  aws_secret_access_key = ****************G0ys

           16. If not done already, install toil (example uses version  3.12.0,  but  we  recommend  the  latest
               release):

                  $ virtualenv venv
                  $ source venv/bin/activate
                  $ pip install toil[all]==3.12.0

           17. Now that toil is installed and you are running in a virtualenv, you can launch a toil leader
               node as follows (again, this example sets TOIL_APPLIANCE_SELF to toil version 3.12.0; set it to
               match the version you have installed):

                  $ TOIL_APPLIANCE_SELF=quay.io/ucsc_cgl/toil:3.12.0 toil launch-cluster clustername --leaderNodeType t2.medium --zone us-west-1a --keyPairName id_rsa

           To further break down each part of this command:
          TOIL_APPLIANCE_SELF=quay.io/ucsc_cgl/toil:latest --- This is optional.  It specifies  a  mesos  docker
          image  that  we  maintain  with  the  latest  version  of  toil installed on it.  If you want to use a
          different    version    of    toil,    please    specify    the    image    tag    you    need    from
          https://quay.io/repository/ucsc_cgl/toil?tag=latest&tab=tags.

          toil launch-cluster --- Base command in toil to launch a cluster.

          clustername --- Just choose a name for your cluster.

          --leaderNodeType t2.medium --- Specify the leader node type. A t2.medium has 2 CPUs and 4 GB of RAM
          ($0.0464/hour). List of available AWS instances: https://aws.amazon.com/ec2/pricing/on-demand/

          --zone us-west-1a --- Specify the AWS zone you want to launch the instance in.   Must  have  the  same
          prefix  as  the  zone  in  your  awscli  credentials  (which,  in  the  example  of  this tutorial is:
          "us-west-1").

          --keyPairName id_rsa --- The name of your key pair, which should be "id_rsa" if you've  followed  this
          tutorial.
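
       The pieces broken down above can also be assembled programmatically. The following is a minimal
       Python 3 sketch, not part of Toil: the helper names build_launch_command and as_shell_line are
       hypothetical, and only the flags and the TOIL_APPLIANCE_SELF variable described above are used.

```python
import shlex


def build_launch_command(cluster_name, leader_node_type, zone, key_pair_name,
                         appliance=None):
    """Return (env, argv) for a `toil launch-cluster` invocation.

    `appliance` is optional and maps to the TOIL_APPLIANCE_SELF variable.
    This helper is illustrative only; it is not a Toil API.
    """
    env = {}
    if appliance is not None:
        env["TOIL_APPLIANCE_SELF"] = appliance
    argv = [
        "toil", "launch-cluster", cluster_name,
        "--leaderNodeType", leader_node_type,
        "--zone", zone,
        "--keyPairName", key_pair_name,
    ]
    return env, argv


def as_shell_line(env, argv):
    """Render the environment and arguments as one shell command line."""
    parts = ["%s=%s" % (key, shlex.quote(value))
             for key, value in sorted(env.items())]
    parts += [shlex.quote(arg) for arg in argv]
    return " ".join(parts)


env, argv = build_launch_command(
    "clustername", "t2.medium", "us-west-1a", "id_rsa",
    appliance="quay.io/ucsc_cgl/toil:3.12.0",
)
print(as_shell_line(env, argv))
```

       To actually run the command rather than print it, argv could be passed to subprocess.run together
       with a copy of os.environ updated with env.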

   AWS Job Store
       Using  the AWS job store is straightforward after you've finished Preparing your AWS environment; all you
       need to do is specify the prefix for the job store name.

       To run the sort example with the AWS job store you would type

          $ python sort.py aws:us-west-2:my-aws-sort-jobstore
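
       The job store argument in this example has three colon-separated parts: the aws prefix, a region,
       and a name. The sketch below only illustrates that shape. The function parse_aws_job_store is
       hypothetical and is not how Toil itself parses the argument.

```python
def parse_aws_job_store(locator):
    """Split 'aws:<region>:<name>' into a (region, name) tuple.

    Illustrative helper only; raises ValueError for anything that does not
    have the three-part shape used in the example above.
    """
    prefix, sep, rest = locator.partition(":")
    if prefix != "aws" or not sep:
        raise ValueError("expected a locator starting with 'aws:': %r" % locator)
    region, sep, name = rest.partition(":")
    if not sep or not region or not name:
        raise ValueError("expected 'aws:<region>:<name>': %r" % locator)
    return region, name


region, name = parse_aws_job_store("aws:us-west-2:my-aws-sort-jobstore")
print(region, name)
```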

   Toil Provisioner
       The Toil provisioner is included in Toil alongside the [aws] extra and allows us to spin up a cluster.

       Getting started with the provisioner is simple:

       1. Make sure you have Toil installed with the AWS extras. For detailed instructions see extras.

       2. You will need an AWS account, and your AWS credentials must be saved on your local machine. Both
          steps are covered in Preparing your AWS environment.

       The  Toil  provisioner  is  built around the Toil Appliance, a Docker image that bundles Toil and all its
       requirements (e.g. Mesos). This makes deployment simple across platforms, and you  can  even  simulate  a
       cluster locally (see appliance_dev for details).

          Choosing Toil Appliance Image

                 When  using the Toil provisioner, the appliance image will be automatically chosen based on the
                 pip-installed version of Toil on your system. That choice can  be  overridden  by  setting  the
                 environment  variables  TOIL_DOCKER_REGISTRY  and  TOIL_DOCKER_NAME or TOIL_APPLIANCE_SELF. See
                 envars for more information on these variables. If you are developing with autoscaling and want
                 to test and build your own appliance have a look at appliance_dev.

       For information on using the Toil Provisioner have a look at Running a Workflow with Autoscaling.

   Details about Launching a Cluster in AWS
       Launching a Toil leader instance with the provisioner is simple using the launch-cluster command. For
       example, to launch a cluster named "my-cluster" with a t2.medium leader in the us-west-2a zone, run

          (venv) $ toil launch-cluster my-cluster --leaderNodeType t2.medium --zone us-west-2a --keyPairName <your-AWS-key-pair-name>

       The cluster name is used to uniquely identify your cluster and will be used to  populate  the  instance's
       Name  tag.  Also,  the  Toil  provisioner  will  automatically  tag  your  cluster with an Owner tag that
       corresponds to your keypair name to facilitate cost tracking. In addition, the ToilNodeType  tag  can  be
       used to filter "leader" vs. "worker" nodes in your cluster.

       The leaderNodeType is an EC2 instance type. This only affects the leader node.

       The  --zone parameter specifies which EC2 availability zone to launch the cluster in.  Alternatively, you
       can specify this option via the TOIL_AWS_ZONE environment variable.  Note: the zone is different from  an
       EC2  region.  A region corresponds to a geographical area like us-west-2 (Oregon), and availability zones
       are partitions of this area like us-west-2a.
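
       The relationship between a zone and its region can be sketched in a few lines of Python. This
       assumes the AWS naming convention visible in the examples above, where a zone name is the region
       name followed by a single lowercase letter. The helper name is hypothetical, and GCE zone names
       such as us-west1-a follow a different pattern that this sketch does not handle.

```python
import re

# Region name (e.g. "us-west-2") followed by one zone letter (e.g. "a").
_AWS_ZONE = re.compile(r"^([a-z]{2}(?:-[a-z]+)+-\d+)([a-z])$")


def aws_region_of_zone(zone):
    """Return the region part of an AWS availability zone name."""
    match = _AWS_ZONE.match(zone)
    if match is None:
        raise ValueError("not an AWS availability zone name: %r" % zone)
    return match.group(1)


print(aws_region_of_zone("us-west-2a"))
```

       A check like this can catch the common mistake of passing a region where a zone is expected.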

       By default, Toil creates an IAM role for each cluster with  sufficient  permissions  to  perform  cluster
       operations  (e.g.  full  S3, EC2, and SDB access). If the default permissions are not sufficient for your
       use case (e.g. if you need access to  ECR),  you  may  create  a  custom  IAM  role  with  all  necessary
       permissions  and  set  the --awsEc2ProfileArn parameter when launching the cluster. Note that your custom
       role must at least have these permissions in order for the Toil cluster to function properly.

       For more information on options try:

          (venv) $ toil launch-cluster --help

   Static Provisioning
       Toil can be used to manage a cluster in the cloud by using the Cluster Utilities (clusterRef). The
       cluster utilities also make it easy to run a toil workflow directly on this cluster. We call this
       static provisioning because the size of the cluster does not change. This is in contrast with Running
       a Workflow with Autoscaling.

       To launch worker nodes alongside the leader we use the -w option:

          (venv) $ toil launch-cluster my-cluster --leaderNodeType t2.small -z us-west-2a --keyPairName your-AWS-key-pair-name --nodeTypes m3.large,t2.micro -w 1,4

       This will spin up a leader node of type t2.small with five additional workers --- one  m3.large  instance
       and four t2.micro.
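
       The two comma-separated lists are matched up position by position. A short Python sketch of that
       pairing, using a hypothetical helper that is not part of Toil:

```python
def pair_node_types(node_types, workers):
    """Pair each entry of --nodeTypes with the matching entry of -w."""
    types = node_types.split(",")
    counts = [int(count) for count in workers.split(",")]
    if len(types) != len(counts):
        raise ValueError("need exactly one worker count per node type")
    return list(zip(types, counts))


plan = pair_node_types("m3.large,t2.micro", "1,4")
total = sum(count for _node_type, count in plan)
print(plan, total)
```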

       Currently  static  provisioning  is  only possible during the cluster's creation.  The ability to add new
       nodes and remove existing nodes via the native provisioner is in development. Of course the  cluster  can
       always be deleted with the destroyCluster utility.

   Uploading Workflows
       Now that our cluster is launched, we use the rsyncCluster utility to copy the workflow to the leader. For
       a simple workflow in a single file this might look like

          (venv) $ toil rsync-cluster -z us-west-2a my-cluster toil-workflow.py :/

       NOTE:
          If your toil workflow has dependencies, have a look at the autoDeploying section for a detailed
          explanation of how to include them.

   Running a Workflow with Autoscaling
       Autoscaling  is  a  feature of running Toil in a cloud whereby additional cloud instances are launched to
       run the workflow.  Autoscaling leverages Mesos containers to provide an execution environment  for  these
       workflows.

       NOTE:
          Make sure you've done the AWS setup in Preparing your AWS environment.

       1. Download sort.py

       2. Launch the leader node in AWS using the launchCluster command:

             (venv) $ toil launch-cluster <cluster-name> --keyPairName <AWS-key-pair-name> --leaderNodeType t2.medium --zone us-west-2a

       3. Copy the sort.py script up to the leader node:

             (venv) $ toil rsync-cluster <cluster-name> sort.py :/root

       4. Login to the leader node:

             (venv) $ toil ssh-cluster <cluster-name>

       5. Run the script as an autoscaling workflow:

             $ python /root/sort.py aws:us-west-2:<my-jobstore-name> --provisioner aws --nodeTypes c3.large --maxNodes 2 --batchSystem mesos

       NOTE:
          In  this  example, the autoscaling Toil code creates up to two instances of type c3.large and launches
          Mesos slave containers inside them. The containers are then available  to  run  jobs  defined  by  the
          sort.py script. Toil also creates a bucket in S3 for the job store named on the command line, to
          store intermediate job results. The Toil autoscaler can also provision multiple different node types,
          which  is  useful  for  workflows  that have jobs with varying resource requirements. For example, one
          could execute the script with --nodeTypes c3.large,r3.xlarge --maxNodes 5,1,  which  would  allow  the
          provisioner  to  create up to five c3.large nodes and one r3.xlarge node for memory-intensive jobs. In
          this situation, the autoscaler would avoid creating the more expensive r3.xlarge  node  until  needed,
          running most jobs on the c3.large nodes.

       6. View the generated file to sort:

             $ head fileToSort.txt

       7. View the sorted file:

             $ head sortedFile.txt

       For more information on other autoscaling (and other) options have a look at workflowOptions and/or run

          $ python my-toil-script.py --help

       IMPORTANT:
          Some  important  caveats  about  starting  a  toil  run  through  an  ssh session are explained in the
          sshCluster section.

   Preemptability
       Toil can run on a heterogeneous cluster of both preemptable and non-preemptable nodes. Being a
       preemptable node simply means that the node may be shut down at any time, even while jobs are running.
       Those jobs can then be restarted later somewhere else.

       A node type can be specified as preemptable by adding a spot bid to its entry in the list of node types
       provided with the --nodeTypes flag. If spot instance prices rise above your bid, the preemptable node
       will be shut down.
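
       For the AWS provisioner, the launch-cluster option list gives the entry syntax as an instance type
       followed by a colon and the bid in dollars, for example 'c3.8xlarge:0.42'. A minimal Python sketch
       of reading one such entry follows. The helper name and the returned dictionary layout are
       assumptions made for illustration.

```python
def parse_node_type(entry):
    """Split one --nodeTypes entry into its type and optional spot bid.

    An entry with a ':<bid>' suffix is treated as preemptable; an entry
    without one is treated as non-preemptable.
    """
    name, sep, bid = entry.partition(":")
    if not name:
        raise ValueError("missing instance type in %r" % entry)
    if sep:
        return {"type": name, "preemptable": True, "spot_bid": float(bid)}
    return {"type": name, "preemptable": False, "spot_bid": None}


for entry in "c3.8xlarge:0.42,c3.large".split(","):
    print(parse_node_type(entry))
```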

       While individual jobs can each explicitly specify whether or not they should be run on preemptable  nodes
       via the boolean preemptable resource requirement, the --defaultPreemptable flag will allow jobs without a
       preemptable requirement to run on preemptable machines.

          Specify Preemptability Carefully

                 Ensure that your choices for --nodeTypes and --maxNodes make sense for your workflow and
                 won't  cause  it  to  hang.  You should make sure the provisioner is able to create nodes large
                 enough to run the largest job in the workflow, and that non-preemptable node types are  allowed
                 if there are non-preemptable jobs in the workflow.
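
       One way to reason about this before launching is to compare each job's requirements against the
       node types you plan to allow. The Python sketch below is purely illustrative: the Shape and JobReq
       tuples, the helper name, and all numbers are made up and are not real instance specifications.
       It also assumes that a preemptable job may run on either kind of node, while a non-preemptable
       job needs a non-preemptable node.

```python
from collections import namedtuple

# Hypothetical descriptions of node types and job requirements.
Shape = namedtuple("Shape", "cores memory_gib preemptable")
JobReq = namedtuple("JobReq", "name cores memory_gib preemptable")


def unplaceable_jobs(node_shapes, jobs):
    """Return the names of jobs that none of the node shapes could run."""
    stuck = []
    for job in jobs:
        fits = any(
            node.cores >= job.cores
            and node.memory_gib >= job.memory_gib
            # Non-preemptable jobs must not land on preemptable nodes.
            and (job.preemptable or not node.preemptable)
            for node in node_shapes
        )
        if not fits:
            stuck.append(job.name)
    return stuck


nodes = [Shape(2, 4, True), Shape(4, 16, False)]
jobs = [
    JobReq("small", 1, 2, True),
    JobReq("big", 8, 8, False),
    JobReq("steady", 2, 4, False),
]
print(unplaceable_jobs(nodes, jobs))
```

       A non-empty result means the workflow could hang waiting for a node that the provisioner is never
       allowed to create.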

       Finally,  the  --preemptableCompensation flag can be used to handle cases where preemptable nodes may not
       be available but are required for your workflow. With this flag enabled, the autoscaler will  attempt  to
       compensate  for  a  shortage  of preemptable nodes of a certain type by creating non-preemptable nodes of
       that type, if non-preemptable nodes of that type were specified in --nodeTypes.

   Dashboard
       Toil provides a dashboard for viewing the RAM and CPU usage of each node, the number of  issued  jobs  of
       each type, the number of failed jobs, and the size of the jobs queue. To launch this dashboard for a toil
       workflow, include the --metrics flag in the toil script command. The dashboard can then be viewed in your
       browser  at  localhost:3000  while  connected  to  the leader node through toil ssh-cluster.  On AWS, the
       dashboard keeps track of every node in the cluster to monitor CPU and RAM usage, but it can also be  used
       while  running a workflow on a single machine. The dashboard uses Grafana as the front end for displaying
       real-time plots, and Prometheus for tracking metrics exported by toil. In order to use the dashboard  for
       a  non-released  toil  version, you will have to build the containers locally with make docker, since the
       prometheus, grafana, and mtail containers used in the dashboard are tied to a specific toil version.

   Running in Google Compute Engine (GCE)
       Toil supports a provisioner with Google, and a Google Job Store. To get started, follow instructions  for
       Preparing your Google environment.

   Preparing your Google environment
       Toil supports using the Google Cloud Platform. Setting this up is easy!

       1. Make sure that the google extra (extras) is installed

       2. Follow  Google's  Instructions  to  download  credentials  and  set the GOOGLE_APPLICATION_CREDENTIALS
          environment variable

       3. Create a new ssh key with the proper format.  To create a new ssh key run the command

             $ ssh-keygen -t rsa -f ~/.ssh/id_rsa -C [USERNAME]

          where [USERNAME] is something like jane@example.com. Make sure to leave your password blank.

          WARNING:
             This command could overwrite an old ssh key you may be using.  If you have an existing ssh key  you
             would like to use, it will need to be called id_rsa and it needs to have no password set.

          Make sure only you can read the SSH keys:

             $ chmod 400 ~/.ssh/id_rsa ~/.ssh/id_rsa.pub

       4. Add your newly formatted public key to Google. To do this, log into your Google Cloud account and go
          to the metadata section under the Compute tab.

          Near the top of the screen click on 'SSH Keys', then edit, add item, and paste the key. Then save.

       For more details look at Google's instructions for adding SSH keys.

   Google Job Store
       To  use the Google Job Store you will need to set the GOOGLE_APPLICATION_CREDENTIALS environment variable
       by following Google's instructions.

       Then to run the sort example with the Google job store you would type

          $ python sort.py google:my-project-id:my-google-sort-jobstore

   Running a Workflow with Autoscaling
       WARNING:
          Google Autoscaling is in beta!

       The steps to run a GCE workflow are similar to those of  AWS  (Autoscaling),  except  you  will  need  to
       explicitly specify the --provisioner gce option which otherwise defaults to aws.

       1. Download sort.py

       2. Launch the leader node in GCE using the launchCluster command:

             (venv) $ toil launch-cluster <CLUSTER-NAME> --provisioner gce --leaderNodeType n1-standard-1 --keyPairName <SSH-KEYNAME> --zone us-west1-a

          The --keyPairName option is for an SSH key that was added to the Google account. <SSH-KEYNAME> is the
          first part of the [USERNAME] used when setting up your ssh key. For example, if [USERNAME] was
          jane@example.com, <SSH-KEYNAME> should be jane.

       3. Upload the sort example and ssh into the leader:

             (venv) $ toil rsync-cluster --provisioner gce <CLUSTER-NAME> sort.py :/root
             (venv) $ toil ssh-cluster --provisioner gce <CLUSTER-NAME>

       4. Run the workflow:

             $ python /root/sort.py google:<PROJECT-ID>:<JOBSTORE-NAME> --provisioner gce --batchSystem mesos --nodeTypes n1-standard-2 --maxNodes 2

       5. Clean up:

             $ exit  # this exits the ssh from the leader node
             (venv) $ toil destroy-cluster --provisioner gce <CLUSTER-NAME>
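
       The rule for deriving the key pair name in step 2 can be written down as a small Python sketch. The
       helper names are hypothetical, and the argument list only uses the flags shown in the steps above.

```python
def gce_key_pair_name(username):
    """Return the part of the ssh key comment before the '@'."""
    name, _sep, _domain = username.partition("@")
    if not name:
        raise ValueError("empty user name in %r" % username)
    return name


def gce_launch_argv(cluster_name, username, leader_node_type, zone):
    """Build the argument list for launching a GCE leader (step 2)."""
    return [
        "toil", "launch-cluster", cluster_name,
        "--provisioner", "gce",
        "--leaderNodeType", leader_node_type,
        "--keyPairName", gce_key_pair_name(username),
        "--zone", zone,
    ]


print(" ".join(gce_launch_argv(
    "my-cluster", "jane@example.com", "n1-standard-1", "us-west1-a")))
```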

   Cluster Utilities
       There are several utilities used for starting and managing a Toil cluster using the AWS or GCE
       provisioner. They are installed via the [aws] or [google] extra. For installation details see
       installProvisioner. The cluster utilities are used for runningAWS and consist of the toil
       launch-cluster, toil rsync-cluster, toil ssh-cluster, and toil destroy-cluster entry points.

       Cluster commands specific to toil are:
          stats --- Reports runtime and resource usage for all jobs in a specified jobstore (workflow must have
          originally been run using the --stats option).

          status --- Inspects a job store to see which jobs have failed, run successfully, etc.

          destroy-cluster --- For autoscaling.  Terminates the specified cluster and associated resources.

          launch-cluster  --- For autoscaling.  This is used to launch a toil leader instance with the specified
          provisioner.

          rsync-cluster ---  For  autoscaling.   Used  to  transfer  files  to  a  cluster  launched  with  toil
          launch-cluster.

          ssh-cluster --- SSHs into the toil appliance container running on the leader of the cluster.

          clean --- Delete the job store used by a previous Toil workflow invocation.

          kill --- Kills any running jobs in a rogue toil.

       For information on a specific utility run:

          toil launch-cluster --help

       for a full list of its options and functionality.

       The cluster utilities can be used for runningGCE and runningAWS.

       TIP:
          By default, all of the cluster utilities expect to be running on AWS. To run with Google you will need
          to specify the --provisioner gce option for each utility.

       NOTE:
          Boto must be configured with AWS credentials before using cluster utilities.

          runningGCE contains instructions for
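
       The boto credentials file described in Preparing your AWS environment is a small INI file with a
       single [Credentials] section. As a minimal sketch, it can be rendered with Python's standard
       configparser module. The function name and the placeholder values are assumptions for illustration,
       and the sketch deliberately returns the text instead of writing to ~/.boto.

```python
import configparser
import io


def render_boto_config(access_key_id, secret_access_key):
    """Return the text of a boto file holding the given credentials."""
    # Interpolation is disabled so that values are stored verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config["Credentials"] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholders only; substitute your real key ID and secret key.
text = render_boto_config("<your-access-key-id>", "<your-secret-access-key>")
print(text)
```

       If you write the result to disk, consider restricting the file to your own user, since it holds
       your secret key.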

   Stats Command
       To use the stats command, a workflow must first be run using the --stats option. Using this option
       makes certain that toil does not delete the job store, no matter what other options are specified (i.e.
       normally the option --clean=always would delete the job store, but --stats will override this).

       An example of this would be running the following:

          python discoverfiles.py file:my-jobstore --stats

       Where discoverfiles.py is the following:

          import subprocess
          import os
          from toil.common import Toil
          from toil.job import Job

          class discoverFiles(Job):
              """Views files at a specified path using ls."""
              def __init__(self, path, *args, **kwargs):
                  self.path = path
                  super(discoverFiles, self).__init__(*args, **kwargs)

              def run(self, fileStore):
                  if os.path.exists(self.path):
                      subprocess.check_call(["ls", self.path])

          def main():
              options = Job.Runner.getDefaultArgumentParser().parse_args()
              options.clean = "always"

              job1 = discoverFiles(path="/sys/", displayName='sysFiles')
              job2 = discoverFiles(path=os.path.expanduser("~"), displayName='userFiles')
              job3 = discoverFiles(path="/tmp/")

              job1.addChild(job2)
              job2.addChild(job3)

              with Toil(options) as toil:
                  if not toil.options.restart:
                      toil.start(job1)
                  else:
                      toil.restart()

          if __name__ == '__main__':
              main()

       Notice the displayName key, which can rename a job, giving it an alias when it is  finally  displayed  in
       stats.   Running this workflow file should record three job names: sysFiles (job1), userFiles (job2), and
       discoverFiles (job3).  To see the runtime and resources used for each job when it was run, type

          toil stats file:my-jobstore

       This should output the following:

          Batch System: singleMachine
          Default Cores: 1  Default Memory: 2097152K
          Max Cores: 9.22337e+18
          Total Clock: 0.56  Total Runtime: 1.01
          Worker
              Count |                                    Time* |                                    Clock |                                     Wait |                                   Memory
                  n |      min    med*     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total
                  1 |     0.14    0.14    0.14    0.14    0.14 |     0.13    0.13    0.13    0.13    0.13 |     0.01    0.01    0.01    0.01    0.01 |      76K     76K     76K     76K     76K
          Job
           Worker Jobs  |     min    med    ave    max
                        |       3      3      3      3
              Count |                                    Time* |                                    Clock |                                     Wait |                                   Memory
                  n |      min    med*     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total
                  3 |     0.01    0.06    0.05    0.07    0.14 |     0.00    0.06    0.04    0.07    0.12 |     0.00    0.01    0.00    0.01    0.01 |      76K     76K     76K     76K    229K
           sysFiles
              Count |                                    Time* |                                    Clock |                                     Wait |                                   Memory
                  n |      min    med*     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total
                  1 |     0.01    0.01    0.01    0.01    0.01 |     0.00    0.00    0.00    0.00    0.00 |     0.01    0.01    0.01    0.01    0.01 |      76K     76K     76K     76K     76K
           userFiles
              Count |                                    Time* |                                    Clock |                                     Wait |                                   Memory
                  n |      min    med*     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total
                  1 |     0.06    0.06    0.06    0.06    0.06 |     0.06    0.06    0.06    0.06    0.06 |     0.01    0.01    0.01    0.01    0.01 |      76K     76K     76K     76K     76K
           discoverFiles
              Count |                                    Time* |                                    Clock |                                     Wait |                                   Memory
                  n |      min    med*     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total |      min     med     ave     max   total
                  1 |     0.07    0.07    0.07    0.07    0.07 |     0.07    0.07    0.07    0.07    0.07 |     0.00    0.00    0.00    0.00    0.00 |      76K     76K     76K     76K     76K
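
       Each data row in this report has a count followed by four groups (Time, Clock, Wait, Memory) of five
       values (min, med, ave, max, total), separated by | characters. The Python sketch below reads one
       such row. It assumes exactly the layout of the sample output above, which is not a documented
       interface, and the helper name is hypothetical.

```python
COLUMNS = ("min", "med", "ave", "max", "total")
GROUPS = ("time", "clock", "wait", "memory")


def parse_stats_row(row):
    """Turn one data row of the stats report into a nested dictionary."""
    cells = [cell.split() for cell in row.split("|")]
    if len(cells) != len(GROUPS) + 1:
        raise ValueError("unexpected number of column groups")
    parsed = {"count": int(cells[0][0])}
    for group, values in zip(GROUPS, cells[1:]):
        if len(values) != len(COLUMNS):
            raise ValueError("unexpected number of values in %s" % group)
        if group != "memory":
            # Memory values keep their unit suffix (e.g. "76K") as text.
            values = [float(value) for value in values]
        parsed[group] = dict(zip(COLUMNS, values))
    return parsed


# The aggregate "Job" row from the sample output above.
row = ("3 | 0.01 0.06 0.05 0.07 0.14 | 0.00 0.06 0.04 0.07 0.12 "
       "| 0.00 0.01 0.00 0.01 0.01 | 76K 76K 76K 76K 229K")
parsed = parse_stats_row(row)
print(parsed["time"]["total"], parsed["memory"]["total"])
```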

       Once we're done, we can clean up the job store by running

          toil clean file:my-jobstore

   Status Command
       Continuing the example from the stats section above, if we ran our workflow with the command

          python discoverfiles.py file:my-jobstore --stats

       we could interrogate our jobstore with the status command, for example:

          toil status file:my-jobstore

       If the run was successful, this would not return much valuable information, something like

          2018-01-11 19:31:29,739 - toil.lib.bioio - INFO - Root logger is at level 'INFO', 'toil' logger at level 'INFO'.
          2018-01-11 19:31:29,740 - toil.utils.toilStatus - INFO - Parsed arguments
          2018-01-11 19:31:29,740 - toil.utils.toilStatus - INFO - Checking if we have files for Toil
          The root job of the job store is absent, the workflow completed successfully.

       Otherwise, the status command should return a message of the following form, where the letters stand
       for the actual values:

          There are x unfinished jobs, y parent jobs with children, z jobs with services, a services, and b
          totally failed jobs currently in c.

   Clean Command
       If a Toil pipeline didn't finish successfully, or was run using --clean=never or --stats, the job store
       will exist until it is deleted. toil clean <jobStore> ensures that all artifacts associated with a job
       store are removed. This is particularly useful for deleting AWS job stores, which reserve an SDB domain
       as well as an S3 bucket.

       The deletion of the job store can be modified by the --clean argument, and may be set to always, onError,
       never, or onSuccess (default).
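
       The four values can be read as a small decision table. The Python sketch below encodes that reading,
       together with the --stats override described in the Stats Command section. It is an interpretation
       of the option names rather than Toil's own code, and the function name is hypothetical.

```python
def job_store_is_deleted(clean, succeeded, stats=False):
    """Say whether the job store is removed when the workflow ends.

    clean     -- one of "always", "onError", "never", "onSuccess"
    succeeded -- whether the workflow finished successfully
    stats     -- whether the workflow was run with --stats
    """
    if clean not in ("always", "onError", "never", "onSuccess"):
        raise ValueError("unknown --clean value: %r" % clean)
    if stats:
        # --stats needs the job store afterwards, so it is always kept.
        return False
    if clean == "always":
        return True
    if clean == "never":
        return False
    if clean == "onSuccess":
        return succeeded
    return not succeeded  # "onError"


for policy in ("always", "onError", "never", "onSuccess"):
    print(policy, job_store_is_deleted(policy, succeeded=True))
```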

       Temporary directories where jobs are running can also be saved from deletion using the --cleanWorkDir
       option, which takes the same values as --clean. Keep work directories only when debugging, as the
       intermediate files of jobs will fill up disk space.

   Launch-Cluster Command
       Running toil launch-cluster starts up a leader for a cluster. Workers can be added to the initial cluster
       by specifying the -w option.  An example would be

          $ toil launch-cluster my-cluster --leaderNodeType t2.small -z us-west-2a --keyPairName your-AWS-key-pair-name --nodeTypes m3.large,t2.micro -w 1,4

       Options are listed below.  These can also be displayed by running

          $ toil launch-cluster --help

       launch-cluster's main positional argument is the clusterName.  This is simply the name of  your  cluster.
       If it does not exist yet, Toil will create it for you.

       Launch-Cluster Options

          --help
                 -h also accepted.  Displays this help menu.

          --tempDirRoot TEMPDIRROOT
                 Path  to  the temporary directory where all temp files are created, by default uses the current
                 working directory as the base.

          --version
                 Display version.

          --provisioner CLOUDPROVIDER
                 -p CLOUDPROVIDER also accepted.  The provisioner for cluster auto-scaling.  Both  AWS  and  GCE
                 are currently supported.

          --zone ZONE
                 -z ZONE also accepted.  The availability zone of the leader. This parameter can also be set via
                 the TOIL_AWS_ZONE or TOIL_GCE_ZONE environment variables, or by the  ec2_region_name  parameter
                 in your .boto file if using AWS, or derived from the instance metadata if using this utility on
                 an existing EC2 instance.

          --leaderNodeType LEADERNODETYPE
                 Non-preemptable node type to use for the cluster leader.

          --keyPairName KEYPAIRNAME
                 The name of the AWS or ssh key pair to include on the instance.

          --boto BOTOPATH
                 The path to the boto credentials directory. This is transferred to all nodes in order to access
                 the AWS jobStore from non-AWS instances.

          --tag KEYVALUE
                 KEYVALUE  is  specified  as  KEY=VALUE.  -t KEY=VALUE also accepted.  Tags are added to the AWS
                 cluster for this node and all of its children.  Tags are of  the  form:  -t  key1=value1  --tag
                 key2=value2.  Multiple tags are allowed and each tag needs its own flag. By default the cluster
                 is tagged with: { "Name": clusterName, "Owner": IAM username }.

          --vpcSubnet VPCSUBNET
                 VPC subnet ID to launch cluster in. Uses default subnet if not specified. This subnet needs  to
                 have auto assign IPs turned on.

          --nodeTypes NODETYPES
                 Comma-separated  list  of  node types to create while launching the leader. The syntax for each
                 node type depends on the provisioner used. For the AWS provisioner this is the name of  an  EC2
                 instance  type  followed  by  a  colon and the price in dollars to bid for a spot instance, for
                 example 'c3.8xlarge:0.42'. Must also provide the --workers argument to specify how many workers
                 of each node type to create.

          --workers WORKERS
                 -w  WORKERS  also accepted.  Comma-separated list of the number of workers of each node type to
                 launch alongside the leader when the cluster is created. This can be useful when running Toil
                 without autoscaling but more hardware is needed.

          --leaderStorage LEADERSTORAGE
                 Specify  the  size  (in  gigabytes)  of the root volume for the leader instance. This is an EBS
                 volume.

          --nodeStorage NODESTORAGE
                 Specify the size (in gigabytes) of the root volume for any worker instances created when  using
                 the -w flag.  This is an EBS volume.
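
       As a rough illustration of how the --nodeTypes and --workers lists line up, the following Python sketch
       pairs each node type with its worker count. The helper pair_node_types is hypothetical and is not part
       of Toil, and treating an entry without a colon as having no spot bid is an assumption.

```python
def pair_node_types(node_types, workers):
    """Pair each entry of --nodeTypes with the matching entry of --workers.

    Hypothetical helper for illustration only; not Toil's own parser.
    """
    types = node_types.split(",")
    counts = [int(n) for n in workers.split(",")]
    if len(types) != len(counts):
        raise ValueError("need exactly one worker count per node type")
    result = []
    for node_type, count in zip(types, counts):
        # 'c3.8xlarge:0.42' -> instance type plus spot bid in dollars.
        # Assumption: an entry without a colon carries no spot bid.
        name, _, bid = node_type.partition(":")
        result.append((name, float(bid) if bid else None, count))
    return result


print(pair_node_types("c3.8xlarge:0.42,t2.medium", "2,1"))
```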

       Logging Options

          --logOff
                 Same as --logCritical.

          --logCritical
                 Turn on logging at level CRITICAL and above. (default is INFO)

          --logError
                 Turn on logging at level ERROR and above. (default is INFO)

          --logWarning
                 Turn on logging at level WARNING and above. (default is INFO)

          --logInfo
                 Turn on logging at level INFO and above. (default is INFO)

          --logDebug
                 Turn on logging at level DEBUG and above. (default is INFO)

          --logLevel LOGLEVEL
                 Log  at given level (may be either OFF (or CRITICAL), ERROR, WARN (or WARNING), INFO or DEBUG).
                 (default is INFO)

          --logFile LOGFILE
                 File to log in.

          --rotatingLogging
                 Turn on rotating logging, which prevents log files getting too big.
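
       The level names accepted by --logLevel correspond naturally to the constants of Python's standard
       logging module. The sketch below shows one possible mapping; the names LEVELS and resolve_level are
       hypothetical, and this is not necessarily how Toil resolves the option internally.

```python
import logging

# Level names accepted by --logLevel, per the option text above.
# OFF is treated as CRITICAL because --logOff is "same as --logCritical".
LEVELS = {
    "OFF": logging.CRITICAL,
    "CRITICAL": logging.CRITICAL,
    "ERROR": logging.ERROR,
    "WARN": logging.WARNING,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def resolve_level(name="INFO"):
    """Return the logging constant for a level name (hypothetical helper)."""
    return LEVELS[name.upper()]


# Apply the resolved level to an example logger.
logging.getLogger("example").setLevel(resolve_level("WARN"))
```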

   Ssh-Cluster Command
       Toil provides the ability to ssh into the leader of the cluster. This can be done as follows:

          $ toil ssh-cluster CLUSTER-NAME-HERE

       This opens a shell on the Toil leader and is used to start an Autoscaling run. Issues with Docker
       prevent using screen and tmux when sshing into the cluster (the shell doesn't know that it is a TTY,
       which prevents it from allocating a new screen session). This can be worked around via

          $ script
          $ screen

       Simply running screen within script will get things working properly again.

       Finally, you can execute remote commands with the following syntax:

          $ toil ssh-cluster CLUSTER-NAME-HERE remoteCommand

       It is not advised that you run your Toil workflow using remote execution like this  unless  a  tool  like
       nohup is used to ensure the process does not die if the SSH connection is interrupted.

       For an example usage, see Autoscaling.

   Rsync-Cluster Command
       The  most  frequent  use  case  for  the  rsync-cluster utility is deploying your Toil script to the Toil
       leader. Note that the syntax is the same as traditional rsync with the exception of the  hostname  before
       the  colon.  This  is  not needed in toil rsync-cluster since the hostname is automatically determined by
       Toil.

       Here is an example of its usage:

          $ toil rsync-cluster CLUSTER-NAME-HERE \
             ~/localFile :/remoteDestination

   Destroy-Cluster Command
       The destroy-cluster command is the advised way to get rid of any Toil cluster launched using the
       Launch-Cluster Command. It ensures that all attached nodes, volumes, security groups, etc. are
       deleted. If a node or cluster is shut down using Amazon's online portal, residual resources may still
       be in use in the background. To delete a cluster run

          $ toil destroy-cluster CLUSTER-NAME-HERE

   Kill Command
       To kill all currently running jobs for a given jobstore, use the command

          toil kill file:my-jobstore

HPC ENVIRONMENTS

       Toil   is  a  flexible  framework  that  can  be  leveraged  in  a  variety  of  environments,  including
       high-performance computing (HPC) environments.  Toil provides support for  a  number  of  batch  systems,
       including  Grid  Engine,  Slurm, Torque and LSF, which are popular schedulers used in these environments.
       Toil also supports HTCondor, which is a popular scheduler for high-throughput computing  (HTC).   To  use
       one of these batch systems, specify the "--batchSystem" argument to the toil script.

       Due to the cost and complexity of maintaining support for these schedulers we currently consider them to
       be "community supported", that is, the core development team does not regularly test or develop support
       for  these  systems.  However,  there  are  members of the Toil community currently deploying Toil in HPC
       environments and we welcome external contributions.

       Developing the support of a new or existing batch system involves extending  the  abstract  batch  system
       class toil.batchSystems.abstractBatchSystem.AbstractBatchSystem.

   Standard Output/Error from Batch System Jobs
       Standard  output  and  error  from batch system jobs (except for the Parasol and Mesos batch systems) are
       redirected to files in the toil-<workflowID> directory created within the temporary  directory  specified
       by   the  --workDir  option;  see  optionsRef.   Each  file  is  named  as  follows:  toil_job_<Toil  job
       ID>_batch_<name  of  batch  system>_<job  ID  from  batch  system>_<file  description>.log,  where  <file
       description>  is  std_output  for  standard output, and std_error for standard error.  HTCondor will also
       write job event log files with <file description> = job_events.
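
       The naming scheme above can be sketched in Python as follows. The helper batch_log_name is hypothetical,
       and the job IDs and batch system name passed to it are made-up values used only to show the shape of
       the resulting file names.

```python
def batch_log_name(toil_job_id, batch_system, batch_job_id, description):
    """Build a log file name following the pattern described above.

    Hypothetical helper; the argument values used below are made up.
    """
    return "toil_job_{}_batch_{}_{}_{}.log".format(
        toil_job_id, batch_system, batch_job_id, description
    )


# One file for standard output and one for standard error.
for description in ("std_output", "std_error"):
    print(batch_log_name("42", "slurm", "1234567", description))
```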

       If capturing standard output and error is desired, --workDir will generally need to be on a  shared  file
       system; otherwise, if these are written to local temporary directories on each node (e.g. /tmp), Toil will
       not be able to retrieve them.  Alternatively,  the  --noStdOutErr  option  forces  Toil  to  discard  all
       standard output and error from batch system jobs.

CWL IN TOIL

       The Common Workflow Language (CWL) is an emerging standard for writing workflows that are portable across
       multiple workflow engines and platforms.  Toil has full support for the CWL v1.0.1 specification.

   Running CWL Locally
       The  toil-cwl-runner  command  provides  cwl-parsing  functionality  using  cwltool,  and  leverages  the
       job-scheduling and batch system support of Toil.

       To run in local batch mode, provide the CWL file and the input object file:

          $ toil-cwl-runner example.cwl example-job.yml

       For a simple example of CWL with Toil see cwlquickstart.

   Note for macOS + Docker + Toil
       When invoking CWL documents that make use of Docker containers, if you see errors that look like

          docker: Error response from daemon: Mounts denied:
          The paths /var/...tmp are not shared from OS X and are not known to Docker.

       you may need to add

          export TMPDIR=/tmp/docker_tmp

       either in your startup file (.bashrc) or manually in your shell before invoking toil.
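
       When toil is launched from a Python wrapper rather than from an interactive shell, the same override
       can be applied to the child process environment. This is only a sketch: docker_friendly_env is a
       hypothetical helper, and the directory named by TMPDIR is assumed to exist already.

```python
import os


def docker_friendly_env(tmpdir="/tmp/docker_tmp"):
    """Return a copy of the current environment with TMPDIR overridden."""
    env = dict(os.environ)
    env["TMPDIR"] = tmpdir
    return env


env = docker_friendly_env()
print(env["TMPDIR"])
# The environment could then be handed to the runner, for example:
# subprocess.check_call(["toil-cwl-runner", "example.cwl", "example-job.yml"], env=env)
```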

   Detailed Usage Instructions
       Help information can be found by using this toil command:

          $ toil-cwl-runner -h

       A more detailed example shows how we can specify both Toil and cwltool arguments for our workflow:

          $ toil-cwl-runner \
              --singularity \
              --jobStore my_jobStore \
              --batchSystem lsf \
              --workDir `pwd` \
              --outdir `pwd` \
              --logFile cwltoil.log \
              --writeLogs `pwd` \
              --logLevel DEBUG \
              --retryCount 2 \
              --disableCaching \
              --maxLogFileSize 20000000000 \
              --stats \
              standard_bam_processing.cwl \
              inputs.yaml

       In this example, we set the following options, which are all passed to Toil:

       --singularity: Specifies that all jobs with Docker format containers specified should be run using the
       Singularity container engine instead of the Docker container engine.

       --jobStore: Path to a folder that already exists, which will contain the Toil jobstore  and  all  related
       job-tracking information.

       --batchSystem: Use the specified HPC or Cloud-based cluster platform.

       --workDir:  The  directory  where all temporary files will be created for the workflow. A subdirectory of
       this will be set as the $TMPDIR environment variable and this subdirectory can be  referenced  using  the
       CWL parameter reference $(runtime.tmpdir) in CWL tools and workflows.

       --outdir: Directory where final File and Directory outputs will be written. References to these and other
       output types will be in the JSON object printed to the stdout stream after workflow execution.

       --logFile: Path to the main logfile with logs from all jobs.

       --writeLogs: Directory where all job logs will be stored.

       --retryCount: How many times to retry each Toil job.

       --disableCaching: Currently required for batch systems (LSF, slurm, gridengine, htcondor, torque)

       --maxLogFileSize: Logs that get larger than this value will be truncated.

       --stats: Save resource usage in JSON files that can be collected with the toil stats command after the
       workflow is done.
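
       When a command like this is generated by a script, it can help to assemble the argument list
       programmatically. The sketch below uses only flags and values from the example above; build_command is
       a hypothetical helper, and shlex.join requires Python 3.8 or later.

```python
import shlex

# Flags and values taken from the example above; flags mapped to None
# are switches that take no value.
options = {
    "--singularity": None,
    "--jobStore": "my_jobStore",
    "--batchSystem": "lsf",
    "--logFile": "cwltoil.log",
    "--logLevel": "DEBUG",
    "--retryCount": "2",
    "--disableCaching": None,
    "--stats": None,
}


def build_command(options, workflow, inputs):
    """Assemble an argument list for toil-cwl-runner (hypothetical helper)."""
    argv = ["toil-cwl-runner"]
    for flag, value in options.items():
        argv.append(flag)
        if value is not None:
            argv.append(value)
    # The CWL file and the input object file come last.
    return argv + [workflow, inputs]


argv = build_command(options, "standard_bam_processing.cwl", "inputs.yaml")
print(shlex.join(argv))
```

       The resulting list can be passed directly to subprocess.check_call without going through a shell.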

   Running CWL in the Cloud
       To  run  in  cloud  and HPC configurations, you may need to provide additional command line parameters to
       select and configure the batch system to use.

       To run a CWL workflow in AWS with toil see awscwl.

   Running CWL within Toil Scripts
       A CWL workflow can be run indirectly in a native Toil script. However, this is not the  standard  way  to
       run CWL workflows with Toil and doing so comes at the cost of job efficiency. For some use cases, such as
       running one process on multiple files, it may be useful. For example, if you want to run a  CWL  workflow
       with 3 YML files specifying different sample inputs, it could look something like:

          from toil.job import Job
          from toil.common import Toil
          import subprocess
          import os

          def initialize_jobs(job):
              job.fileStore.logToMaster('initialize_jobs')

          def runQC(job, cwl_file, cwl_filename, yml_file, yml_filename, outputs_dir, output_num):
              job.fileStore.logToMaster("runQC")
              tempDir = job.fileStore.getLocalTempDir()

              cwl = job.fileStore.readGlobalFile(cwl_file, userPath=os.path.join(tempDir, cwl_filename))
              yml = job.fileStore.readGlobalFile(yml_file, userPath=os.path.join(tempDir, yml_filename))

              subprocess.check_call(["cwltoil", cwl, yml])

              output_filename = "output.txt"
              output_file = job.fileStore.writeGlobalFile(output_filename)
              job.fileStore.readGlobalFile(output_file, userPath=os.path.join(outputs_dir, "sample_" + output_num + "_" + output_filename))
              return output_file

          if __name__ == "__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"
              with Toil(options) as toil:

                  # specify the folder where the cwl and yml files live
                  inputs_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "cwlExampleFiles")
                  # specify where you wish the outputs to be written
                  outputs_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "cwlExampleFiles")

                  job0 = Job.wrapJobFn(initialize_jobs)

                  cwl_filename = "hello.cwl"
                  cwl_file = toil.importFile("file://" + os.path.abspath(os.path.join(inputs_dir, cwl_filename)))

                  # add list of yml config inputs here or import and construct from file
                  yml_files = ["hello1.yml", "hello2.yml", "hello3.yml"]
                  i = 0
                  for yml in yml_files:
                      i = i + 1
                      yml_file = toil.importFile("file://" + os.path.abspath(os.path.join(inputs_dir, yml)))
                      yml_filename = yml
                      job = Job.wrapJobFn(runQC, cwl_file, cwl_filename, yml_file, yml_filename, outputs_dir, output_num=str(i))
                      job0.addChild(job)

                  toil.start(job0)

   Toil & CWL Tips
       See logs for just one job by using the full log file

       This requires knowing the job's toil-generated ID, which can be found in the log files.

          cat cwltoil.log | grep jobVM1fIs

       Grep for full tool commands from toil logs

       This  gives  you  a  more  concise  view  of  the  commands being run (note that this information is only
       available from Toil when running with --logDebug).

          pcregrep -M "\[job .*\.cwl.*$\n(.*        .*$\n)*" cwltoil.log
          #         ^allows for multiline matching

       Find BAMs that have been generated for a specific step while the pipeline is running:

          find . | grep -P '^./out_tmpdir.*_MD\.bam$'

       See what jobs have been run

          cat log/cwltoil.log | grep -oP "\[job .*.cwl\]" | sort | uniq

       or:

          cat log/cwltoil.log | grep -i "issued job"
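
       The first grep pipeline above can also be expressed in Python, which may be convenient when
       post-processing logs in a script. The log lines below are hypothetical stand-ins rather than real Toil
       output, and the pattern is a slightly stricter variant of the grep one (non-greedy, with the dot before
       cwl escaped).

```python
import re

# Hypothetical log lines; real Toil log lines carry more detail.
log_lines = [
    "worker-1 INFO [job bwa-mem.cwl] /tmp/out$ bwa mem ref.fa reads.fq",
    "worker-2 INFO [job consolidate_files.cwl] /tmp/out$ cp a b",
    "worker-3 INFO [job bwa-mem.cwl] /tmp/out$ bwa mem ref.fa more.fq",
]

# Non-greedy and with the dot escaped, unlike the grep pattern above.
JOB_TAG = re.compile(r"\[job .*?\.cwl\]")


def unique_job_tags(lines):
    """Return the sorted, de-duplicated job tags found in the lines."""
    return sorted({tag for line in lines for tag in JOB_TAG.findall(line)})


print(unique_job_tags(log_lines))
```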

       Get status of a workflow

          $ toil status /home/johnsoni/TEST_RUNS_3/TEST_run/tmp/jobstore-09ae0acc-c800-11e8-9d09-70106fb1697e
          <hostname> 2018-10-04 15:01:44,184 MainThread INFO toil.lib.bioio: Root logger is at level 'INFO', 'toil' logger at level 'INFO'.
          <hostname> 2018-10-04 15:01:44,185 MainThread INFO toil.utils.toilStatus: Parsed arguments
          <hostname> 2018-10-04 15:01:47,081 MainThread INFO toil.utils.toilStatus: Traversing the job graph gathering jobs. This may take a couple of minutes.

          Of the 286 jobs considered, there are 179 jobs with children, 107 jobs ready to run, 0 zombie jobs, 0 jobs with services, 0 services, and 0 jobs with log files currently in file:/home/user/jobstore-09ae0acc-c800-11e8-9d09-70106fb1697e.

       Toil Stats

       You can get run statistics broken down by CWL file. This only works once the workflow is finished:

          $ toil stats /path/to/jobstore

       The output will contain CPU, memory, and walltime information for all CWL job types:

          <hostname> 2018-10-15 12:06:19,003 MainThread INFO toil.lib.bioio: Root logger is at level 'INFO', 'toil' logger at level 'INFO'.
          <hostname> 2018-10-15 12:06:19,004 MainThread INFO toil.utils.toilStats: Parsed arguments
          <hostname> 2018-10-15 12:06:19,004 MainThread INFO toil.utils.toilStats: Checking if we have files for toil
          <hostname> 2018-10-15 12:06:19,004 MainThread INFO toil.utils.toilStats: Checked arguments
          Batch System: lsf
          Default Cores: 1  Default Memory: 10485760K
          Max Cores: 9.22337e+18
          Total Clock: 106608.01  Total Runtime: 86634.11
          Worker
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
               1659 |     0.00    0.80  264.87 12595.59 439424.40 |     0.00    0.46   449.05 42240.74 744968.80 |  -35336.69     0.16   -184.17  4230.65 -305544.39 |      48K    223K   1020K  40235K 1692300K
          Job
           Worker Jobs  |     min    med    ave    max
                        |    1077   1077   1077   1077
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
               1077 |     0.04    1.18  407.06 12593.43 438404.73 |     0.01    0.28   691.17 42240.35 744394.14 |  -35336.83     0.27   -284.11  4230.49 -305989.41 |     135K    268K   1633K  40235K 1759734K
           ResolveIndirect
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
                205 |     0.04    0.07    0.16     2.29     31.95 |     0.01    0.02     0.02     0.14      3.60 |       0.02     0.05      0.14     2.28      28.35 |     190K    266K    256K    314K   52487K
           CWLGather
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
                 40 |     0.05    0.17    0.29     1.90     11.62 |     0.01    0.02     0.02     0.05      0.80 |       0.03     0.14      0.27     1.88      10.82 |     188K    265K    250K    316K   10039K
           CWLWorkflow
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
                205 |     0.09    0.40    0.98    13.70    200.82 |     0.04    0.15     0.16     1.08     31.78 |       0.04     0.26      0.82    12.62     169.04 |     190K    270K    257K    316K   52826K
           file:///home/johnsoni/pipeline_0.0.39/ACCESS-Pipeline/cwl_tools/expression_tools/group_waltz_files.cwl
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
                 99 |     0.29    0.49    0.59     2.50     58.11 |     0.14    0.26     0.29     1.04     28.95 |       0.14     0.22      0.29     1.48      29.16 |     135K    135K    135K    136K   13459K
           file:///home/johnsoni/pipeline_0.0.39/ACCESS-Pipeline/cwl_tools/expression_tools/make_sample_output_dirs.cwl
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
                 11 |     0.34    0.52    0.74     2.63      8.18 |     0.20    0.30     0.41     1.17      4.54 |       0.14     0.20      0.33     1.45       3.65 |     136K    136K    136K    136K    1496K
           file:///home/johnsoni/pipeline_0.0.39/ACCESS-Pipeline/cwl_tools/expression_tools/consolidate_files.cwl
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
                  8 |     0.31    0.59    0.71     1.80      5.69 |     0.18    0.35     0.37     0.63      2.94 |       0.13     0.27      0.34     1.17       2.75 |     136K    136K    136K    136K    1091K
           file:///home/johnsoni/pipeline_0.0.39/ACCESS-Pipeline/cwl_tools/bwa-mem/bwa-mem.cwl
              Count |                                       Time* |                                        Clock |                                              Wait |                                    Memory
                  n |      min    med*     ave      max     total |      min     med      ave      max     total |        min      med       ave      max      total |      min     med     ave     max    total
                 22 |   895.76 3098.13 3587.34 12593.43  78921.51 |  2127.02 7910.31  8123.06 16959.13 178707.34 |  -11049.84 -3827.96  -4535.72    19.49  -99785.83 |    5659K   5950K   5854K   6128K  128807K

       Understanding toil log files

       There is a worker_log.txt file for each job. This file is written to while the job is running and
       deleted after the job finishes. The contents are printed to the main log file and transferred to a log
       file in the --logDir folder once the job is completed successfully.

       The new log file will be named something like:

          file:<path to cwl tool>.cwl_<job ID>.log

          file:---home-johnsoni-pipeline_1.1.14-ACCESS--Pipeline-cwl_tools-marianas-ProcessLoopUMIFastq.cwl_I-O-jobfGsQQw000.log

       This is the toil job command with spaces replaced by dashes.

WDL IN TOIL

       WDL support is still in the alpha phase and should be able to handle basic WDL files.  See the  specification
       below for more details.

   How to Run a WDL file in Toil
       Recommended  best  practice  when  running  wdl  files  is  to  first  use the Broad's wdltool for syntax
       validation and generating the needed json input file.  Full documentation can be found on the repository,
       and a precompiled jar binary can be downloaded here: wdltool (this requires java7).

       That means two steps.  First, make sure your wdl file is valid and devoid of syntax errors by running

          java -jar wdltool.jar validate example_wdlfile.wdl

       Second,  generate  a complementary json file if your wdl file needs one.  This json will contain keys for
       every necessary input that your wdl file needs to run:

          java -jar wdltool.jar inputs example_wdlfile.wdl

       When this json template is generated, open the file, and fill in values as necessary by hand.  WDL  files
       all  require json files to accompany them.  If no variable inputs are needed, a json file containing only
       '{}' may be required.

       Once a wdl file is validated and has an appropriate json file, workflows can be run in toil using:

          toil-wdl-runner example_wdlfile.wdl example_jsonfile.json

       See options below for more parameters.

   ENCODE Example from ENCODE-DCC
       To follow this example, you will need docker  installed.   The  original  workflow  can  be  found  here:
       https://github.com/ENCODE-DCC/pipeline-container

       We've  included  the  wdl  file and data files in the toil repository needed to run this example.  First,
       download the example code and unzip.  The file needed is "testENCODE/encode_mapping_workflow.wdl".

       Next, use wdltool (this requires java7) to validate this file:

          java -jar wdltool.jar validate encode_mapping_workflow.wdl

       Next, use wdltool to generate a json file for this wdl file:

          java -jar wdltool.jar inputs encode_mapping_workflow.wdl

       This json file once opened should look like this:

          {
          "encode_mapping_workflow.fastqs": "Array[File]",
          "encode_mapping_workflow.trimming_parameter": "String",
          "encode_mapping_workflow.reference": "File"
          }

       The trimming_parameter should be set to 'native'.  Download the example code and unzip.  Inside  are  two
       data files required for the run:

          ENCODE_data/reference/GRCh38_chr21_bwa.tar.gz
          ENCODE_data/ENCFF000VOL_chr21.fq.gz

       Editing the json to include these as inputs, the json should now look something like this:

          {
          "encode_mapping_workflow.fastqs": ["/path/to/unzipped/ENCODE_data/ENCFF000VOL_chr21.fq.gz"],
          "encode_mapping_workflow.trimming_parameter": "native",
          "encode_mapping_workflow.reference": "/path/to/unzipped/ENCODE_data/reference/GRCh38_chr21_bwa.tar.gz"
          }
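
       Filling in the template can also be scripted rather than done by hand. The following sketch uses
       Python's json module; fill_inputs is a hypothetical helper, and /path/to/unzipped is the same
       placeholder used above, not a real location.

```python
import json

# Template as produced by "wdltool inputs" in the example above.
template = {
    "encode_mapping_workflow.fastqs": "Array[File]",
    "encode_mapping_workflow.trimming_parameter": "String",
    "encode_mapping_workflow.reference": "File",
}


def fill_inputs(template, data_dir):
    """Replace the type placeholders with concrete values (hypothetical helper)."""
    filled = dict(template)  # copy, so the template itself is left untouched
    filled["encode_mapping_workflow.fastqs"] = [
        data_dir + "/ENCODE_data/ENCFF000VOL_chr21.fq.gz"
    ]
    filled["encode_mapping_workflow.trimming_parameter"] = "native"
    filled["encode_mapping_workflow.reference"] = (
        data_dir + "/ENCODE_data/reference/GRCh38_chr21_bwa.tar.gz"
    )
    return filled


inputs = fill_inputs(template, "/path/to/unzipped")
print(json.dumps(inputs, indent=2))
```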

       The wdl and json files can now be run using the command

          toil-wdl-runner encode_mapping_workflow.wdl encode_mapping_workflow.json

       This  should  deposit the output files in the user's current working directory (to change this, specify a
       new directory with the '-o' option).

   GATK Examples from the Broad
       Simple   examples   of   WDL    can    be    found    on    the    Broad's    website    as    tutorials:
       https://software.broadinstitute.org/wdl/documentation/topic?name=wdl-tutorials.

       One  can  follow  along  with these tutorials, write their own wdl files following the directions and run
       them using either cromwell or toil.  For example, in tutorial 1, if you've followed along and named  your
       wdl file 'helloHaplotypeCaller.wdl', then once you've validated your wdl file using wdltool (this requires
       java7) using

          java -jar wdltool.jar validate helloHaplotypeCaller.wdl

       and generated a json file (and subsequently typed in appropriate filepaths* and variables) using

          java -jar wdltool.jar inputs helloHaplotypeCaller.wdl

       • Absolute filepath inputs are recommended for local testing.

       then the wdl script can be run using

          toil-wdl-runner helloHaplotypeCaller.wdl helloHaplotypeCaller_inputs.json

   toilwdl.py Options
       '-o' or '--output_directory': Specifies the output folder, and defaults to the current working directory
       if not specified by the user.

       '--gen_parse_files':  Creates  "AST.out",  which holds a printed AST of the wdl file and "mappings.out",
       which holds the printed task, workflow, csv, and tsv dictionaries generated by the parser.

       '--dont_delete_compiled': Saves the compiled toil python workflow file for debugging.

       Any number of arbitrary options may also be specified.  These options will not be parsed immediately, but
       passed  down  as  toil  options  once  the wdl/json files are processed.  For valid toil options, see the
       documentation: http://toil.readthedocs.io/en/latest/running/cliOptions.html
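
       One common way to implement this kind of pass-through in Python is argparse's parse_known_args(), which
       returns unrecognized arguments instead of rejecting them. The sketch below illustrates the idea only;
       split_options is hypothetical and is not the actual toilwdl.py implementation.

```python
import argparse


def split_options(argv):
    """Separate runner options from options to be passed through.

    Hypothetical sketch; not the actual toilwdl.py implementation.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("wdl_file")
    parser.add_argument("json_file")
    # "." stands in for the current working directory default.
    parser.add_argument("-o", "--output_directory", default=".")
    parser.add_argument("--gen_parse_files", action="store_true")
    parser.add_argument("--dont_delete_compiled", action="store_true")
    # parse_known_args() returns unrecognized arguments instead of failing.
    return parser.parse_known_args(argv)


known, passthrough = split_options(
    ["example_wdlfile.wdl", "example_jsonfile.json", "-o", "out", "--logDebug"]
)
print(known.output_directory, passthrough)
```

       Here --logDebug is not known to the parser, so it ends up in the passthrough list and could be handed
       on as a toil option.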

   Running WDL within Toil Scripts
       NOTE:
          A cromwell.jar file is needed in order to run a WDL workflow.

       A WDL workflow can be run indirectly in a native Toil script. However, this is not the  standard  way  to
       run WDL workflows with Toil and doing so comes at the cost of job efficiency. For some use cases, such as
       running one process on multiple files, it may be useful. For example, if you want to run a  WDL  workflow
       with 3 JSON files specifying different sample inputs, it could look something like:

          from toil.job import Job
          from toil.common import Toil
          import subprocess
          import os

          def initialize_jobs(job):
              job.fileStore.logToMaster("initialize_jobs")

          def runQC(job, wdl_file, wdl_filename, json_file, json_filename, outputs_dir, jar_loc,output_num):
              job.fileStore.logToMaster("runQC")
              tempDir = job.fileStore.getLocalTempDir()

              wdl = job.fileStore.readGlobalFile(wdl_file, userPath=os.path.join(tempDir, wdl_filename))
              json = job.fileStore.readGlobalFile(json_file, userPath=os.path.join(tempDir, json_filename))

              subprocess.check_call(["java","-jar",jar_loc,"run",wdl,"--inputs",json])

              output_filename = "output.txt"
              output_file = job.fileStore.writeGlobalFile(outputs_dir + output_filename)
              job.fileStore.readGlobalFile(output_file, userPath=os.path.join(outputs_dir, "sample_" + output_num + "_" + output_filename))
              return output_file

          if __name__ == "__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:

                  # specify the folder where the wdl and json files live
                  inputs_dir = "wdlExampleFiles/"
                  # specify where you wish the outputs to be written
                  outputs_dir = "wdlExampleFiles/"
                  # specify the location of your cromwell jar
                  jar_loc = os.path.abspath("wdlExampleFiles/cromwell-35.jar")

                  job0 = Job.wrapJobFn(initialize_jobs)

                  wdl_filename = "hello.wdl"
                  wdl_file = toil.importFile("file://" + os.path.abspath(os.path.join(inputs_dir, wdl_filename)))

                  # add list of json config inputs here or import and construct from file
                  json_files = ["hello1.json", "hello2.json", "hello3.json"]
                  i = 0
                  for json in json_files:
                      i = i + 1
                      json_file = toil.importFile("file://" + os.path.abspath(os.path.join(inputs_dir, json)))
                      json_filename = json
                      job = Job.wrapJobFn(runQC, wdl_file, wdl_filename, json_file, json_filename, outputs_dir, jar_loc, output_num=str(i))
                      job0.addChild(job)

                  toil.start(job0)

   WDL Specifications
       WDL language specifications can be found here: https://github.com/broadinstitute/wdl/blob/develop/SPEC.md

       Implementing support for more features is currently underway, but a basic roadmap so far is:

       CURRENTLY IMPLEMENTED:

              • Scatter

              • Many Built-In Functions

              • Docker Calls

              • Handles Priority, and Output File Wrangling

              • Currently Handles Primitives and Arrays

       TO BE IMPLEMENTED:

              • Integrate Cloud Autoscaling Capacity More Robustly

              • WDL Files That "Import" Other WDL Files (Including URI Handling for 'http://' and 'https://')

DEVELOPING A WORKFLOW

       This  tutorial  walks  through  the  features  of Toil necessary for developing a workflow using the Toil
       Python API.

       NOTE:
          "script" and "workflow" will be used interchangeably

   Scripting Quick Start
       To begin, consider this short toil script which illustrates defining a workflow:

          from toil.common import Toil
          from toil.job import Job

          def helloWorld(message, memory="2G", cores=2, disk="3G"):
              return "Hello, world!, here's a message: %s" % message

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "OFF"
              options.clean = "always"

              hello_job = Job.wrapFn(helloWorld, "Woot")

              with Toil(options) as toil:
                  print(toil.start(hello_job)) #Prints Hello, world!, ...

       The workflow consists of a single job. The resource requirements for that job are (optionally) specified
       by keyword arguments (memory, cores, disk). The options for the run are created with
       toil.job.Job.Runner.getDefaultOptions(), and the workflow is started with toil.common.Toil.start().
       Below we explain the components of this code in detail.

   Job Basics
       The atomic unit of work in a Toil workflow is a Job.  User scripts inherit from this base class to define
       units  of work. For example, here is a more long-winded class-based version of the job in the quick start
       example:

          from toil.job import Job

          class HelloWorld(Job):
              def __init__(self, message):
                  Job.__init__(self,  memory="2G", cores=2, disk="3G")
                  self.message = message

              def run(self, fileStore):
                  return "Hello, world!, here's a message: %s" % self.message

       In the example a class, HelloWorld, is defined. The constructor requests 2 gigabytes of memory,  2  cores
       and 3 gigabytes of local disk to complete the work.

       The toil.job.Job.run() method is the function the user overrides to get work done. Here it just returns
       a message string, which becomes the return value of the job.

   Invoking a Workflow
       We  can  add to the previous example to turn it into a complete workflow by adding the necessary function
       calls to create an instance of HelloWorld and to run this as a workflow containing  a  single  job.  This
       uses the toil.job.Job.Runner class, which is used to start and resume Toil workflows. For example:

          from toil.common import Toil
          from toil.job import Job

          class HelloWorld(Job):
              def __init__(self, message):
                  Job.__init__(self,  memory="2G", cores=2, disk="3G")
                  self.message = message

              def run(self, fileStore):
                  return "Hello, world!, here's a message: %s" % self.message

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "OFF"
              options.clean = "always"

              hello_job = HelloWorld("Woot")

              with Toil(options) as toil:
                  print(toil.start(hello_job))

       NOTE:
          Do  not include a . in the name of your python script (besides .py at the end).  This is to allow toil
          to import the types and  functions defined in your file while starting a new process.

       Alternatively, the more powerful toil.common.Toil class can be used to run and resume  workflows.  It  is
       used  as  a context manager and allows for preliminary setup, such as staging of files into the job store
       on the leader node. An instance of the class is initialized by specifying an options object.  The  actual
       workflow  is  then  invoked  by  calling the toil.common.Toil.start() method, passing the root job of the
       workflow, or, if a workflow is being restarted, toil.common.Toil.restart() should be used. Note that the
       context manager should have explicit if/else branches addressing the restart and non-restart cases. The
       boolean value that selects between these branches is toil.options.restart.

       For example:

          from toil.job import Job
          from toil.common import Toil

          class HelloWorld(Job):
              def __init__(self, message):
                  Job.__init__(self,  memory="2G", cores=2, disk="3G")
                  self.message = message

              def run(self, fileStore):
                  self.log("Hello, world!, I have a message: {}".format(self.message))

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  if not toil.options.restart:
                      job = HelloWorld("Woot!")
                      toil.start(job)
                  else:
                      toil.restart()

       The call to toil.job.Job.Runner.getDefaultOptions() creates a set of default options  for  the  workflow.
       The only argument is a description of how to store the workflow's state in what we call a job-store. Here
       the job-store is contained in a directory within the current working directory called  "toilWorkflowRun".
       Alternatively  this  string  can encode other ways to store the necessary state, e.g. an S3 bucket object
       store location. By default the job-store is deleted if the workflow completes successfully.

       The workflow is executed in the final line, which creates an instance of HelloWorld  and  runs  it  as  a
       workflow.  Note  all  Toil  workflows  start from a single starting job, referred to as the root job. The
       return value of the root job is returned as the result of the completed workflow (see promises  below  to
       see how this is a useful feature!).

   Specifying Commandline Arguments
       To allow command line control of the options we can use the
       toil.job.Job.Runner.getDefaultArgumentParser() method to create an argparse.ArgumentParser object which
       can be used to parse command line options for a Toil script. For example:

          from toil.common import Toil
          from toil.job import Job

          class HelloWorld(Job):
              def __init__(self, message):
                  Job.__init__(self,  memory="2G", cores=2, disk="3G")
                  self.message = message

              def run(self, fileStore):
                  return "Hello, world!, here's a message: %s" % self.message

          if __name__=="__main__":
              parser = Job.Runner.getDefaultArgumentParser()
              options = parser.parse_args()
              options.logLevel = "OFF"
              options.clean = "always"

              hello_job = HelloWorld("Woot")

              with Toil(options) as toil:
                  print(toil.start(hello_job))

       This creates a fully fledged script with all of the Toil options exposed as command line arguments.
       Running this script with "--help" will print the full list of options.

       Alternatively an existing argparse.ArgumentParser or optparse.OptionParser object can  have  Toil  script
       command line options added to it with the toil.job.Job.Runner.addToilOptions() method.

   Resuming a Workflow
       In the event that a workflow fails, either because of programmatic error within the jobs being run, or
       because of node failure, the workflow can be resumed. The only case in which a workflow cannot be
       reliably resumed is when the job-store itself becomes corrupt.

       Critical  to  resumption  is that jobs can be rerun, even if they have apparently completed successfully.
       Put succinctly, a user defined job should not corrupt its input arguments. That way, regardless of  node,
       network or leader failure the job can be restarted and the workflow resumed.
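
       The rerun requirement can be illustrated in plain Python, without any Toil API. The function names
       below are hypothetical; the sketch only shows that a job body which copies its input before modifying
       it gives the same result on every rerun, while one that mutates its input does not.

```python
def unsafe_job(values):
    # Mutates its input in place: a rerun sees already-modified data.
    values.append(sum(values))
    return values


def safe_job(values):
    # Works on a copy, so the input is unchanged and a rerun is harmless.
    result = list(values)
    result.append(sum(result))
    return result


inputs = [1, 2, 3]
first = safe_job(inputs)
second = safe_job(inputs)  # simulated rerun after a failure
```

       Here first and second are equal and inputs is left untouched, which is the property a resumable job
       needs.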

       To resume a workflow, specify the "restart" option in the options object passed to the toil.common.Toil
       context manager and call toil.common.Toil.restart() instead of toil.common.Toil.start(). If node
       failures are expected it can also be useful to use the integer "retryCount" option, which will attempt
       to rerun a job retryCount number of times before marking it fully failed.

       In the common scenario that a small subset of jobs fail (including retry attempts) within a workflow Toil
       will  continue  to  run  other jobs until it can do no more, at which point toil.common.Toil.start() will
       raise a toil.leader.FailedJobsException exception. Typically at this point the user can decide to fix the
       script and resume the workflow or delete the job-store manually and rerun the complete workflow.
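
       The retry behaviour described above can be modelled in plain Python. This is only a sketch:
       run_with_retries and make_flaky_job are hypothetical helpers, not Toil functions, and the assumption
       that retryCount counts reruns after the first attempt is ours rather than something the text confirms.

```python
def run_with_retries(job_fn, retry_count):
    """Run job_fn once, then rerun it up to retry_count times on failure."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return job_fn(), attempts
        except Exception:
            if attempts > retry_count:
                raise  # retries exhausted: the job counts as failed


def make_flaky_job(failures_before_success):
    # Hypothetical job that fails a fixed number of times, then succeeds.
    state = {"calls": 0}

    def job():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("simulated node failure")
        return "done"

    return job
```

       A job that fails twice succeeds on its third attempt when retry_count is 3, while a job that keeps
       failing re-raises its error once the retries are used up.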

   Functions and Job Functions
       Defining jobs by creating class definitions generally involves the boilerplate of creating a constructor.
       To avoid this, the classes toil.job.FunctionWrappingJob and toil.job.JobFunctionWrappingJob allow
       functions to be directly converted to jobs. For example, the quick start example (repeated here):

          from toil.common import Toil
          from toil.job import Job

          def helloWorld(message, memory="2G", cores=2, disk="3G"):
              return "Hello, world!, here's a message: %s" % message

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "OFF"
              options.clean = "always"

              hello_job = Job.wrapFn(helloWorld, "Woot")

              with Toil(options) as toil:
                  print(toil.start(hello_job)) #Prints Hello, world!, ...

       This is equivalent to the previous example, but uses a function to define the job.

       The function call:

          Job.wrapFn(helloWorld, "Woot")

       creates an instance of toil.job.FunctionWrappingJob that wraps the function.

       The  keyword arguments memory, cores and disk allow resource requirements to be specified as before. Even
       if they are not included as keyword arguments within a function header they can be  passed  as  arguments
       when wrapping a function as a job and will be used to specify resource requirements.

       We can also apply the function wrapping syntax to a job function, a function whose first argument is a
       reference to the wrapping job. Just like a self argument in a class, this allows access to the methods of
       the wrapping job, see toil.job.JobFunctionWrappingJob. For example:

          from toil.common import Toil
          from toil.job import Job

          def helloWorld(job, message):
              job.log("Hello world, I have a message: {}".format(message))

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              hello_job = Job.wrapJobFn(helloWorld, "Woot!")

              with Toil(options) as toil:
                  toil.start(hello_job)

       Here helloWorld() is a job function. It uses toil.job.Job.log() to log a message that will be printed
       to the output console. The only subtle difference to note is the line:

          hello_job = Job.wrapJobFn(helloWorld, "Woot!")

       which uses the function toil.job.Job.wrapJobFn() to wrap the job function instead of
       toil.job.Job.wrapFn(), which wraps a vanilla function.

   Workflows with Multiple Jobs
       A  parent job can have child jobs and follow-on jobs. These relationships are specified by methods of the
       job class, e.g. toil.job.Job.addChild() and toil.job.Job.addFollowOn().

       If we consider a set of jobs to be the nodes of a job graph, and the child and follow-on relationships
       to be its directed edges, then a job B that lies on a directed path of child/follow-on edges from a job
       A is a successor of A. Similarly, A is a predecessor of B.
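
       The successor relation is ordinary graph reachability. The sketch below is plain Python rather than
       Toil code; the graph, the job names A to D and the successors helper are all hypothetical and only
       illustrate the definition.

```python
def successors(job, edges):
    """Return every job reachable from `job` via child/follow-on edges."""
    found = set()
    stack = [job]
    while stack:
        current = stack.pop()
        for nxt in edges.get(current, []):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


# Hypothetical graph: A -> B (child), A -> C (follow-on), B -> D (child).
edges = {"A": ["B", "C"], "B": ["D"]}
```

       In this graph D is a successor of both A and B, and A is therefore a predecessor of D.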

       A parent job's child jobs are run directly after the parent job  has  completed,  and  in  parallel.  The
       follow-on  jobs  of a job are run after its child jobs and their successors have completed. They are also
       run in parallel. Follow-ons allow the easy specification of cleanup tasks that  happen  after  a  set  of
       parallel  child  tasks.  The  following  shows  a  simple  example that uses the earlier helloWorld() job
       function:

          from toil.common import Toil
          from toil.job import Job

          def helloWorld(job, message, memory="2G", cores=2, disk="3G"):
              job.log("Hello world, I have a message: {}".format(message))

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              j1 = Job.wrapJobFn(helloWorld, "first")
              j2 = Job.wrapJobFn(helloWorld, "second or third")
              j3 = Job.wrapJobFn(helloWorld, "second or third")
              j4 = Job.wrapJobFn(helloWorld, "last")
              j1.addChild(j2)
              j1.addChild(j3)
              j1.addFollowOn(j4)

              with Toil(options) as toil:
                  toil.start(j1)

       In the example four jobs are created: first j1 is run, then j2 and j3 are run in parallel as children of
       j1, and finally j4 is run as a follow-on of j1.
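
       The ordering rule (a job, then its children and their successors, then its follow-ons) can be sketched
       for a tree of jobs in plain Python. ModelJob and run_order are hypothetical stand-ins, not Toil
       classes, and the sketch is sequential: where it lists j2 before j3, Toil is free to run them in
       parallel.

```python
class ModelJob:
    """Hypothetical stand-in for a job with child and follow-on lists."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.follow_ons = []


def run_order(job):
    """Return job names in one valid sequential order for a tree of jobs."""
    order = [job.name]
    for child in job.children:        # children and their successors first
        order.extend(run_order(child))
    for follow_on in job.follow_ons:  # then the follow-ons
        order.extend(run_order(follow_on))
    return order


j1, j2, j3, j4 = (ModelJob(n) for n in ("j1", "j2", "j3", "j4"))
j1.children += [j2, j3]
j1.follow_ons.append(j4)
```

       Calling run_order(j1) places j1 first and j4 last, matching the description of the example.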

       There are multiple shorthand functions to achieve the same workflow, for example:

          from toil.common import Toil
          from toil.job import Job

          def helloWorld(job, message, memory="2G", cores=2, disk="3G"):
              job.log("Hello world, I have a message: {}".format(message))

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              j1 = Job.wrapJobFn(helloWorld, "first")
              j2 = j1.addChildJobFn(helloWorld, "second or third")
              j3 = j1.addChildJobFn(helloWorld, "second or third")
              j4 = j1.addFollowOnJobFn(helloWorld, "last")

              with Toil(options) as toil:
                  toil.start(j1)

       This equivalently defines the workflow, where the functions toil.job.Job.addChildJobFn() and
       toil.job.Job.addFollowOnJobFn() are used to create job functions as children or follow-ons of an earlier
       job.

       Job graphs are not limited to trees, and can express arbitrary directed acyclic graphs. For a precise
       definition of legal graphs see toil.job.Job.checkJobGraphForDeadlocks(). The previous  example  could  be
       specified as a DAG as follows:

          from toil.common import Toil
          from toil.job import Job

          def helloWorld(job, message, memory="2G", cores=2, disk="3G"):
              job.log("Hello world, I have a message: {}".format(message))

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              j1 = Job.wrapJobFn(helloWorld, "first")
              j2 = j1.addChildJobFn(helloWorld, "second or third")
              j3 = j1.addChildJobFn(helloWorld, "second or third")
              j4 = j2.addChildJobFn(helloWorld, "last")
              j3.addChild(j4)

              with Toil(options) as toil:
                  toil.start(j1)

       Note the use of an extra child edge to make j4 a child of both j2 and j3.
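
       One way to see why this graph is legal, and in which order its jobs may run, is a topological sort. The
       sketch below uses Python's standard graphlib module (Python 3.9 or later) on a hand-written copy of the
       graph; it does not call Toil and says nothing about how Toil itself checks job graphs.

```python
from graphlib import TopologicalSorter

# Map each job to the set of jobs that must finish before it starts.
predecessors = {
    "j1": set(),
    "j2": {"j1"},
    "j3": {"j1"},
    "j4": {"j2", "j3"},
}

sorter = TopologicalSorter(predecessors)
sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
stages = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # jobs whose predecessors are done
    stages.append(ready)
    sorter.done(*ready)
```

       Each entry of stages is a group of jobs that could run in parallel; j4 appears only after both j2 and
       j3 have completed.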

   Dynamic Job Creation
       The  previous  examples show a workflow being defined outside of a job. However, Toil also allows jobs to
       be created dynamically within jobs. For example:

          from toil.common import Toil
          from toil.job import Job

          def binaryStringFn(job, depth, message=""):
              if depth > 0:
                  job.addChildJobFn(binaryStringFn, depth-1, message + "0")
                  job.addChildJobFn(binaryStringFn, depth-1, message + "1")
              else:
                  job.log("Binary string: {}".format(message))

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  toil.start(Job.wrapJobFn(binaryStringFn, depth=5))

       The job function binaryStringFn logs all possible binary strings of length n (here n=5), creating a total
       of 2^(n+1) - 1 jobs dynamically and recursively (63 jobs for n=5). Static and dynamic creation of jobs
       can be mixed in a Toil workflow, with jobs defined within a job or job function being created at run
       time.
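
       The number of jobs created by this recursion can be checked with a few lines of plain Python. The
       helper count_jobs is hypothetical; it mirrors the structure of binaryStringFn (one job per call, two
       children per level until the depth reaches 0) without running Toil.

```python
def count_jobs(depth):
    """Count the jobs created by a job that spawns two children per level."""
    if depth == 0:
        return 1  # a leaf job that only logs a string
    return 1 + 2 * count_jobs(depth - 1)


total_for_depth_5 = count_jobs(5)
```

       For depth 5 the count is 63: one root job, 30 further intermediate jobs and 32 leaf jobs, one leaf per
       binary string.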

   Promises
       The previous example of dynamic job creation shows variables from a parent job being passed  to  a  child
       job.  Such  forward  variable  passing  is  naturally specified by recursive invocation of successor jobs
       within parent jobs. This can also be achieved statically by  passing  around  references  to  the  return
       variables of jobs. In Toil this is achieved with promises, as illustrated in the following example:

          from toil.common import Toil
          from toil.job import Job

          def fn(job, i):
              job.log("i is: %s" % i, level=100)
              return i+1

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              j1 = Job.wrapJobFn(fn, 1)
              j2 = j1.addChildJobFn(fn, j1.rv())
              j3 = j1.addFollowOnJobFn(fn, j2.rv())

              with Toil(options) as toil:
                  toil.start(j1)

       Running this workflow results in three log messages from the jobs: "i is: 1" from j1, "i is: 2" from j2
       and "i is: 3" from j3.

       The return value from the first job is promised to the second job by the call to toil.job.Job.rv() in the
       following line:

          j2 = j1.addChildJobFn(fn, j1.rv())

       The value of j1.rv() is a promise, rather than the actual return value of the function, because j1 for
       the given input has at that point not been evaluated. A promise (toil.job.Promise) is essentially a
       pointer to the return value that is replaced by the actual return value once it has been evaluated.
       Therefore, when j2 is run the promise becomes 2.
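
       The substitution of a promise by a real value can be modelled in plain Python. ModelPromise and
       run_workflow below are hypothetical and much simpler than toil.job.Promise; they only show a
       placeholder being replaced once the job it refers to has produced a result.

```python
class ModelPromise:
    """Placeholder for the return value of a job that has not run yet."""

    def __init__(self, job_name):
        self.job_name = job_name


def run_workflow(jobs, fn):
    """Run (name, argument) pairs in order, resolving promises as we go."""
    results = {}
    for name, arg in jobs:
        if isinstance(arg, ModelPromise):
            arg = results[arg.job_name]  # replace the promise by the value
        results[name] = fn(arg)
    return results


seen = []


def fn(i):
    seen.append(i)  # stands in for the log message
    return i + 1


results = run_workflow(
    [("j1", 1), ("j2", ModelPromise("j1")), ("j3", ModelPromise("j2"))], fn
)
```

       The three calls receive 1, 2 and 3 in turn, in the same way that the three jobs in the example log
       increasing values of i.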

       Promises also support indexing of return values:

          def parent(job):
              indexable = Job.wrapJobFn(fn)
              job.addChild(indexable)
              job.addFollowOnFn(raiseWrap, indexable.rv(2))

          def raiseWrap(arg):
              raise RuntimeError(arg) # raises "2"

          def fn(job):
              return (0, 1, 2, 3)

       Promises can be quite useful. For example, we can combine dynamic job creation with promises to achieve a
       job creation process that mimics the functional patterns possible in many programming languages:

          from toil.common import Toil
          from toil.job import Job

          def binaryStrings(job, depth, message=""):
              if depth > 0:
                  s = [ job.addChildJobFn(binaryStrings, depth-1, message + "0").rv(),
                        job.addChildJobFn(binaryStrings, depth-1, message + "1").rv() ]
                  return job.addFollowOnFn(merge, s).rv()
              return [message]

          def merge(strings):
              return strings[0] + strings[1]

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "OFF"
              options.clean = "always"

              with Toil(options) as toil:
                  print(toil.start(Job.wrapJobFn(binaryStrings, depth=5)))

       The return value of the workflow is a list of all 32 binary strings of length 5, computed recursively.
       Although a toy example, it demonstrates how closely Toil workflows can mimic typical programming
       patterns.
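
       For comparison, the same computation written as an ordinary recursive Python function looks as
       follows. This is a sketch without Toil; binary_strings is a hypothetical name, and the list
       concatenation plays the role of the merge follow-on.

```python
def binary_strings(depth, message=""):
    """Return all strings obtained by extending message by depth bits."""
    if depth > 0:
        return (binary_strings(depth - 1, message + "0")
                + binary_strings(depth - 1, message + "1"))
    return [message]


strings = binary_strings(5)
```

       With depth 5 the result holds 32 strings of length 5, from "00000" to "11111".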

   Promised Requirements
       Promised  requirements  are  a  special  case  of  Promises that allow a job's return value to be used as
       another job's resource requirements.

       This is useful when, for example, a job's storage requirement is determined by a file staged to  the  job
       store by an earlier job:

          from toil.common import Toil
          from toil.job import Job, PromisedRequirement
          import os

          def parentJob(job):
              downloadJob = Job.wrapJobFn(stageFn, "File://"+os.path.realpath(__file__), cores=0.1, memory='32M', disk='1M')
              job.addChild(downloadJob)

              analysis = Job.wrapJobFn(analysisJob, fileStoreID=downloadJob.rv(0),
                                       disk=PromisedRequirement(downloadJob.rv(1)))
              job.addFollowOn(analysis)

          def stageFn(job, url, cores=1):
              importedFile = job.fileStore.importFile(url)
              return importedFile, importedFile.size

          def analysisJob(job, fileStoreID, cores=2):
              # now do some analysis on the file
              pass

          if __name__ == "__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  toil.start(Job.wrapJobFn(parentJob))

       Note  that  this  also  makes use of the size attribute of the FileID object.  This promised requirements
       mechanism can also be used in combination with an aggregator for multiple jobs' output values:

          def parentJob(job):
              aggregator = []
              for fileNum in range(0,10):
                  downloadJob = Job.wrapJobFn(stageFn, "File://"+os.path.realpath(__file__), cores=0.1, memory='32M', disk='1M')
                  job.addChild(downloadJob)
                  aggregator.append(downloadJob)

              analysis = Job.wrapJobFn(analysisJob, fileStoreID=downloadJob.rv(0),
                                       disk=PromisedRequirement(lambda xs: sum(xs), [j.rv(1) for j in aggregator]))
              job.addFollowOn(analysis)

          Limitations

                 Just like regular promises, the return value must be determined prior to scheduling any job
                 that depends on the return value. In our example above, notice how the dependent jobs were
                 follow-ons to the parent while promising jobs are children of the parent. This ordering ensures
                 that all promises are properly fulfilled.
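
       The limitation can be made concrete with a plain-Python model. DeferredRequirement is a hypothetical
       stand-in, not the PromisedRequirement class: it computes a resource value from earlier results and
       refuses to do so while any of those results is still missing.

```python
class DeferredRequirement:
    """Hypothetical stand-in: a requirement computed from earlier results."""

    def __init__(self, fn, job_names):
        self.fn = fn
        self.job_names = job_names

    def resolve(self, results):
        missing = [n for n in self.job_names if n not in results]
        if missing:
            # The promising jobs must finish before the dependent job
            # can be scheduled.
            raise RuntimeError("unfulfilled promises: " + ", ".join(missing))
        return self.fn([results[n] for n in self.job_names])


disk = DeferredRequirement(sum, ["download0", "download1"])
```

       Resolving disk with both download sizes present returns their sum; resolving it while one download is
       still outstanding raises an error, which is the situation the child/follow-on ordering avoids.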

   FileID
       The  toil.fileStore.FileID  class  is a small wrapper around Python's builtin string class. It is used to
       represent a file's ID in the file store, and has a size attribute that is the file's size in bytes.  This
       object is returned by importFile and writeGlobalFile.
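
       A string subclass that carries a size can be sketched in a few lines. ModelFileID below is a
       hypothetical illustration of the idea, not Toil's implementation, and the ID value is made up.

```python
class ModelFileID(str):
    """A string that also carries the size, in bytes, of the file it names."""

    def __new__(cls, file_store_id, size):
        # str is immutable, so the text is fixed in __new__.
        return super().__new__(cls, file_store_id)

    def __init__(self, file_store_id, size):
        super().__init__()
        self.size = size


file_id = ModelFileID("some-file-store-id", 1024)
```

       Because the object is still a string it can be passed wherever a plain file ID is expected, while code
       that needs the size can read file_id.size.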

   Managing files within a workflow
       It  is  frequently  the  case  that  a workflow will want to create files, both persistent and temporary,
       during its run. The toil.fileStores.abstractFileStore.AbstractFileStore class is used by jobs  to  manage
       these files in a manner that guarantees cleanup and resumption on failure.

       The toil.job.Job.run() method has a file store instance as an argument. The following example shows how
       this can be used to create temporary files that persist for the length of the job, are placed on a
       specified local disk of the node, and are cleaned up, regardless of failure, when the job finishes:

          from toil.common import Toil
          from toil.job import Job

          class LocalFileStoreJob(Job):
              def run(self, fileStore):
                  # self.tempDir always contains the name of a directory within the disk space reserved for the job
                  scratchDir = self.tempDir

                  # Similarly create a temporary file.
                  scratchFile = fileStore.getLocalTempFile()

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              # Create an instance of LocalFileStoreJob which will have at least 2 gigabytes of storage space.
              j = LocalFileStoreJob(disk="2G")

              #Run the workflow
              with Toil(options) as toil:
                  toil.start(j)

       Job functions can also access the file store for the job. The equivalent of the LocalFileStoreJob class
       is:

          def localFileStoreJobFn(job):
              scratchDir = job.tempDir
              scratchFile = job.fileStore.getLocalTempFile()

       Note that the fileStore attribute is accessed as an attribute of the job argument.

       In addition to temporary files that exist for the duration of a job, the file store allows the creation
       of files in a global store, which persist for the duration of the workflow and are globally accessible
       (hence the name) between jobs. For example:

          from toil.common import Toil
          from toil.job import Job
          import os
          import sys

          def globalFileStoreJobFn(job):
              job.log("The following example exercises all the methods provided"
                      " by the toil.fileStores.abstractFileStore.AbstractFileStore class")

              # Create a local temporary file.
              scratchFile = job.fileStore.getLocalTempFile()

              # Write something in the scratch file.
              with open(scratchFile, 'w') as fH:
                  fH.write("What a tangled web we weave")

              # Write a copy of the file into the file-store; fileID is the key that can be used to retrieve the file.
              # This write is asynchronous by default
              fileID = job.fileStore.writeGlobalFile(scratchFile)

              # Write another file using a stream; fileID2 is the
              # key for this second file.
              with job.fileStore.writeGlobalFileStream(cleanup=True) as (fH, fileID2):
                  if sys.version_info >= (3, 0):
                      # if python 3
                      fH.write(b"Out brief candle")
                  else:
                      # if python 2
                      fH.write("Out brief candle")

              # Now read the first file; scratchFile2 is a local copy of the file that is read-only by default.
              scratchFile2 = job.fileStore.readGlobalFile(fileID)

              # Read the second file to a desired location: scratchFile3.
              scratchFile3 = os.path.join(job.tempDir, "foo.txt")
              job.fileStore.readGlobalFile(fileID2, userPath=scratchFile3)

              # Read the second file again using a stream.
              with job.fileStore.readGlobalFileStream(fileID2) as fH:
                  print(fH.read()) #This prints "Out brief candle"

              # Delete the first file from the global file-store.
              job.fileStore.deleteGlobalFile(fileID)

              # It is unnecessary to delete the file keyed by fileID2 because we used the cleanup flag,
              # which removes the file after this job and all its successors have run (if the file still exists)

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  toil.start(Job.wrapJobFn(globalFileStoreJobFn))

       The  example  demonstrates  the global read, write and delete functionality of the file-store, using both
       local copies of the files and streams to read and write the files. It covers all the methods provided  by
       the file store interface.

       What  is  obvious  is  that the file-store provides no functionality to update an existing "global" file,
       meaning that files are, barring deletion, immutable.  Also worth noting is that there is no  file  system
       hierarchy  for  files  in  the  global  file  store.  These limitations allow us to fairly easily support
       different object stores and to use caching to limit the amount of network file transfer between jobs.
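
       These two properties (no in-place update, no hierarchy) amount to a flat, write-once key-value store.
       The class below is a hypothetical in-memory model of that idea, not the Toil file store; it only shows
       why such an interface is easy to map onto an object store.

```python
import uuid


class ModelGlobalStore:
    """Flat, write-once store: files get opaque IDs and are never updated."""

    def __init__(self):
        self._files = {}

    def write(self, data):
        file_id = uuid.uuid4().hex  # opaque key, no directory hierarchy
        self._files[file_id] = bytes(data)
        return file_id

    def read(self, file_id):
        return self._files[file_id]

    def delete(self, file_id):
        del self._files[file_id]


store = ModelGlobalStore()
first_id = store.write(b"What a tangled web we weave")
second_id = store.write(b"Out brief candle")  # a "change" is a new file
```

       Since contents never change under a given ID, a cached local copy of a file stays valid for as long as
       the file exists.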

   Staging of Files into the Job Store
       External files can be imported into or exported out of the job store prior to running a workflow when the
       toil.common.Toil   context  manager  is  used  on  the  leader.  The  context  manager  provides  methods
       toil.common.Toil.importFile(), and toil.common.Toil.exportFile() for this purpose.  The  destination  and
       source locations of such files are described with URLs passed to the two methods. A list of the currently
       supported URLs can be found at toil.jobStores.abstractJobStore.AbstractJobStore.importFile().  To  import
       an  external file into the job store as a shared file, pass the optional sharedFileName parameter to that
       method.

       If a workflow fails for any reason, an imported file acts as any other file in the job store. If the
       workflow was configured such that it is not cleaned up on a failed run, the file will persist in the job
       store and need not be staged again when the workflow is resumed.

       Example:

          import os
          from toil.common import Toil
          from toil.job import Job

          class HelloWorld(Job):
              def __init__(self, id):
                  Job.__init__(self,  memory="2G", cores=2, disk="3G")
                  self.inputFileID = id

              def run(self, fileStore):
                  # Use the fileStore argument passed to run(); the streams are binary.
                  with fileStore.readGlobalFileStream(self.inputFileID) as fi:
                      with fileStore.writeGlobalFileStream() as (fo, outputFileID):
                          fo.write(fi.read() + b'World!')
                  return outputFileID

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              # Define the staging directory up front so that it is available in both branches.
              ioFileDirectory = os.path.join(os.path.dirname(os.path.abspath(__file__)), "stagingExampleFiles")

              with Toil(options) as toil:
                  if not toil.options.restart:
                      inputFileID = toil.importFile("file://" + os.path.abspath(os.path.join(ioFileDirectory, "in.txt")))
                      outputFileID = toil.start(HelloWorld(inputFileID))
                  else:
                      outputFileID = toil.restart()

                  toil.exportFile(outputFileID, "file://" + os.path.abspath(os.path.join(ioFileDirectory, "out.txt")))

   Using Docker Containers in Toil
       Docker containers are commonly used with Toil. The combination of Toil and Docker allows for pipelines to
       be  fully  portable  between  any platform that has both Toil and Docker installed. Docker eliminates the
       need for the user to do any other tool installation or environment setup.

       In order to use Docker containers with Toil, Docker must be installed on  all  workers  of  the  cluster.
       Instructions for installing Docker can be found on the Docker website.

       When  using Toil-based autoscaling, Docker will be automatically set up on the cluster's worker nodes, so
       no additional installation steps are necessary.  Further information on using Toil-based autoscaling  can
       be found in the Autoscaling documentation.

       In order to use docker containers in a Toil workflow, the container can be built locally or downloaded in
       real time from an online docker repository like Quay. If the  container  is  not  in  a  repository,  the
       container's layers must be accessible on each node of the cluster.

       When  invoking  docker  containers  from  within a Toil workflow, it is strongly recommended that you use
       dockerCall(), a toil job function provided in toil.lib.docker. dockerCall leverages docker's  own  python
       API,  and provides container cleanup on job failure. When docker containers are run without this feature,
       failed jobs can result in resource leaks.  Docker's API can be found at docker-py.

       In order to use dockerCall, your installation of Docker must be set up to run without sudo.  Instructions
       for setting this up can be found here.

       An example of a basic dockerCall is below:

          dockerCall(job=job,
                      tool='quay.io/ucsc_cgl/bwa',
                      workDir=job.tempDir,
                      parameters=['index', '/data/reference.fa'])

       Note the assumption that the reference.fa file is located in /data. This is Toil's standard convention
       for a mount location, which reduces boilerplate when calling dockerCall. Users can choose their own
       mount locations by supplying a volumes kwarg to dockerCall, such as: volumes={working_dir: {'bind':
       '/data', 'mode': 'rw'}}, where working_dir is an absolute path on the user's filesystem.
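
       The volumes mapping quoted above can be built as in the following minimal sketch. It only constructs
       the dictionary; the helper name make_data_volume is hypothetical and is not part of Toil.

```python
import os


def make_data_volume(host_dir, container_dir="/data", mode="rw"):
    """Build a volumes mapping in the format quoted in the text.

    make_data_volume is a hypothetical helper, not part of Toil.
    """
    # The host side of the mapping must be an absolute path.
    host_dir = os.path.abspath(host_dir)
    return {host_dir: {"bind": container_dir, "mode": mode}}


# Map the current directory to /data inside the container, read-write.
volumes = make_data_volume(".")
print(volumes)
```

       The resulting dictionary would then be supplied as the volumes keyword argument described above.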

       dockerCall can also be added to workflows like any other job function:

          from toil.common import Toil
          from toil.job import Job
          from toil.lib.docker import apiDockerCall
          import os

          align = Job.wrapJobFn(apiDockerCall,
                                image='ubuntu',
                                working_dir=os.getcwd(),
                                parameters=['ls', '-lha'])

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  toil.start(align)

       cgl-docker-lib contains dockerCall-compatible Dockerized tools that are commonly used  in  bioinformatics
       analysis.

       The  documentation  provides  guidelines  for developing your own Docker containers that can be used with
       Toil and dockerCall. In order for a  container  to  be  compatible  with  dockerCall,  it  must  have  an
       ENTRYPOINT  set to a wrapper script, as described in cgl-docker-lib containerization standards.  This can
       be set by passing in the optional keyword argument, 'entrypoint'.  Example:
          entrypoint=["/bin/bash","-c"]

       dockerCall currently supports the 75 keyword arguments found in the python Docker API, under the 'run'
       command.

   Services
       It  is  sometimes  desirable to run services, such as a database or server, concurrently with a workflow.
       The toil.job.Job.Service class provides a simple mechanism for spawning such  a  service  within  a  Toil
       workflow, allowing precise specification of the start and end time of the service, and providing start
       and stop methods to use for initialization and cleanup. The following simple, conceptual example
       illustrates how services work:

          from toil.common import Toil
          from toil.job import Job

          class DemoService(Job.Service):

              def start(self, fileStore):
                  # Start up a database/service here
                  # Return a value that enables another process to connect to the database
                  return "loginCredentials"

              def check(self):
                  # A function that if it returns False causes the service to quit
                  # If it raises an exception the service is killed and an error is reported
                  return True

              def stop(self, fileStore):
                  # Cleanup the database here
                  pass

          j = Job()
          s = DemoService()
          loginCredentialsPromise = j.addService(s)

          def dbFn(loginCredentials):
              # Use the login credentials returned from the service's start method to connect to the service
              pass

          j.addChildFn(dbFn, loginCredentialsPromise)

          if __name__=="__main__":
              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  toil.start(j)

       In this example the DemoService starts a database in the start method, returning an object from the start
       method indicating how a client job would access the database. The service's stop  method  cleans  up  the
       database, while the service's check method is polled periodically to check the service is alive.

       A DemoService instance is added as a service of the root job j, with resource requirements specified. The
       return value from toil.job.Job.addService() is a promise to the  return  value  of  the  service's  start
       method. When the promise is fulfilled it will represent how to connect to the database. The promise is
       passed to a child job of j, which uses it to make a database  connection.  The  services  of  a  job  are
       started  before  any of its successors have been run and stopped after all the successors of the job have
       completed successfully.

       Multiple services can be created per  job,  all  run  in  parallel.  Additionally,  services  can  define
       sub-services  using  toil.job.Job.Service.addChild().   This  allows  complex  networks of services to be
       created, e.g. Apache Spark clusters, within a workflow.

   Checkpoints
       Services complicate resuming a workflow after failure,  because  they  can  create  complex  dependencies
       between  jobs. For example, consider a service that provides a database that multiple jobs update. If the
       database service fails and loses state, it is not clear that just restarting the service will  allow  the
       workflow  to  be  resumed,  because jobs that created that state may have already finished. To get around
       this problem Toil supports checkpoint jobs, specified as the boolean keyword argument checkpoint to a job
       or wrapped function, e.g.:

          j = Job(checkpoint=True)

       A  checkpoint job is rerun if one or more of its successors fails its retry attempts, until it itself has
       exhausted its retry attempts. Upon restarting a checkpoint job all  its  existing  successors  are  first
       deleted,  and  then  the  job  is  rerun  to define new successors. By checkpointing a job that defines a
       service, upon failure of the service the database and the jobs that access the service can  be  redefined
       and rerun.

       To  make  the  implementation  of  checkpoint  jobs  simple, a job can only be a checkpoint if when first
       defined it has no successors, i.e. it can only define successors within its run method.

   Encapsulation
       Let A be a root job potentially with children and follow-ons. Without an encapsulated  job  the  simplest
       way  to specify a job B which runs after A and all its successors is to create a parent of A, call it Ap,
       and then make B a follow-on of Ap. e.g.:

          from toil.common import Toil
          from toil.job import Job

          if __name__=="__main__":
              # A is a job with children and follow-ons, for example:
              A = Job()
              A.addChild(Job())
              A.addFollowOn(Job())

              # B is a job which needs to run after A and its successors
              B = Job()

              # The way to do this without encapsulation is to make a parent of A, Ap, and make B a follow-on of Ap.
              Ap = Job()
              Ap.addChild(A)
              Ap.addFollowOn(B)

              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  print(toil.start(Ap))

       An encapsulated job E(A) of A removes the need to create Ap. Instead we can write:

          from toil.common import Toil
          from toil.job import Job

          if __name__=="__main__":
              # A
              A = Job()
              A.addChild(Job())
              A.addFollowOn(Job())

              # Encapsulate A
              A = A.encapsulate()

              # B is a job which needs to run after A and its successors
              B = Job()

              # With encapsulation A and its successor subgraph appear to be a single job, hence:
              A.addChild(B)

              options = Job.Runner.getDefaultOptions("./toilWorkflowRun")
              options.logLevel = "INFO"
              options.clean = "always"

              with Toil(options) as toil:
                  print(toil.start(A))

       Note the call to toil.job.Job.encapsulate() creates the toil.job.Job.EncapsulatedJob.

   Depending on Toil
       If you are packaging your workflow(s) as a pip-installable distribution on PyPI, you might be tempted to
       declare Toil as a dependency in your setup.py, via the install_requires keyword argument to setup().
       Unfortunately, this does not work, for two reasons. First, Toil uses Setuptools' extras mechanism to
       manage its own optional dependencies. If you explicitly declared a dependency on Toil, you would have to
       hard-code a particular combination of extras (or no extras at all), robbing the user of the choice of
       which Toil extras to install. Second, and more importantly, declaring a dependency on Toil would only
       lead to Toil being installed on the leader node of a cluster, but not the worker nodes. Auto-deployment
       does not work here because Toil cannot auto-deploy itself: the classic "Which came first, chicken or
       egg?" problem.

       In other words, you shouldn't explicitly depend on Toil. Document the dependency  instead  (as  in  "This
       workflow  needs Toil version X.Y.Z to be installed") and optionally add a version check to your setup.py.
       Refer to the check_version() function in the toil-lib project's setup.py for an  example.  Alternatively,
       you can also just depend on toil-lib and you'll get that check for free.
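
       A version check of the kind suggested above might look like the following sketch. It is not the
       check_version() function from toil-lib; the name check_toil_version is hypothetical, and the sketch
       assumes Python 3.8 or later (for importlib.metadata) and purely numeric dotted versions.

```python
def parse_version(text):
    """Turn a dotted version such as '3.24.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_toil_version(required, installed=None):
    """Raise RuntimeError unless Toil is installed at exactly `required`.

    Hypothetical helper for illustration only.
    """
    if installed is None:
        # Look up the installed distribution without importing Toil itself.
        from importlib import metadata
        try:
            installed = metadata.version("toil")
        except metadata.PackageNotFoundError:
            raise RuntimeError(
                "This workflow needs Toil version %s to be installed" % required)
    if parse_version(installed) != parse_version(required):
        raise RuntimeError(
            "This workflow needs Toil version %s, but %s is installed"
            % (required, installed))
    return installed


# Passing `installed` explicitly keeps the demonstration independent of the environment.
print(check_toil_version("3.24.0", installed="3.24.0"))
```

       Calling check_toil_version("X.Y.Z") near the top of setup.py would then fail early with a readable
       message instead of declaring Toil in install_requires.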

       If your workflow depends on a dependency of Toil, consider not making that dependency explicit either. If
       you do, you risk a version conflict between your project and Toil. The pip utility may silently ignore
       that conflict, breaking either Toil or your workflow. It is safest to simply assume that Toil installs
       that dependency for you. The only downside is that you are locked into the exact version of that
       dependency that Toil declares. But such is life with Python, which, unlike Java, has no means of
       isolating dependencies belonging to different software components within the same process, and whose
       favored software distribution utility is incapable of properly resolving overlapping dependencies and
       detecting conflicts.

   Best Practices for Dockerizing Toil Workflows
       Computational Genomics Lab's Dockstore-based production system provides workflow authors a way to run
       Dockerized versions of their pipeline in an automated, scalable fashion. To be compatible with this
       system, a workflow should meet the following requirements. In addition to the Docker container, a
       Common Workflow Language (CWL) descriptor file is needed. For inputs:

       • Only  command  line  arguments should be used for configuring the workflow. If the workflow relies on a
         configuration file, like Toil-RNAseq or ProTECT, a wrapper script inside the Docker  container  can  be
         used to parse the CLI and generate the necessary configuration file.

       • All inputs to the pipeline should be explicitly enumerated rather than implicit. For example, don't
         rely on one FASTQ read's path to discover the location of its pair. This is necessary since all inputs
         are mapped to their own isolated directories when the Docker container is called via Dockstore.

       • All  inputs  must  be  documented in the CWL descriptor file. Examples of this file can be seen in both
         Toil-RNAseq and ProTECT.
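
       The wrapper-script idea from the first bullet can be sketched as follows. All option names and the
       "key: value" configuration format are hypothetical; a real wrapper would mirror the inputs declared in
       the workflow's CWL descriptor and emit whatever format the pipeline expects.

```python
import argparse


def build_parser():
    # Option names are hypothetical and chosen only for illustration.
    parser = argparse.ArgumentParser(description="Hypothetical pipeline wrapper")
    parser.add_argument("--sample-id", required=True)
    # Both reads of a pair are passed explicitly; neither is inferred from the other.
    parser.add_argument("--fastq-r1", required=True)
    parser.add_argument("--fastq-r2", required=True)
    parser.add_argument("--output-dir", default=".")
    return parser


def render_config(args):
    """Render the parsed arguments as simple 'key: value' lines."""
    lines = [
        "sample-id: %s" % args.sample_id,
        "fastq-r1: %s" % args.fastq_r1,
        "fastq-r2: %s" % args.fastq_r2,
        "output-dir: %s" % args.output_dir,
    ]
    return "\n".join(lines) + "\n"


# An explicit argument list stands in for sys.argv[1:] in this demonstration.
demo_args = build_parser().parse_args(
    ["--sample-id", "s1", "--fastq-r1", "s1_R1.fq", "--fastq-r2", "s1_R2.fq"])
print(render_config(demo_args))
```

       Inside the container, the rendered text would be written to the configuration file that the pipeline
       reads, so that the command line remains the only interface exposed to Dockstore.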

       For outputs:

       • All outputs should be written to a local path rather than S3.

       • Take care to package outputs in a local and user-friendly way. For example, don't tar up all output if
         there are specific files that users will care to see individually.

       • All  output  file names should be deterministic and predictable. For example, don't prepend the name of
         an output file with PASS/FAIL depending on the outcome of the pipeline.

       • All outputs must be documented in the CWL descriptor file. Examples of this file can be  seen  in  both
         Toil-RNAseq and ProTECT.

TOIL CLASS API

       The Toil class configures and starts a Toil run.

       class toil.common.Toil(options)
              A  context  manager that represents a Toil workflow, specifically the batch system, job store, and
              its configuration.

              __init__(options)
                     Initialize a Toil object from the given options. Note that this is  very  light-weight  and
                     that the bulk of the work is done when the context is entered.

                     Parameters
                            options (argparse.Namespace) -- command line options specified by the user

              config = None

                     Type   toil.common.Config

              start(rootJob)
                     Invoke  a Toil workflow with the given job as the root for an initial run. This method must
                     be called in the body of a with Toil(...) as toil: statement. This  method  should  not  be
                     called more than once for a workflow that has not finished.

                     Parameters
                            rootJob (toil.job.Job) -- The root job of the workflow

                     Returns
                            The root job's return value

              restart()
                     Restarts a workflow that has been interrupted.

                     Returns
                            The root job's return value

              classmethod getJobStore(locator)
                     Create an instance of the concrete job store implementation that matches the given locator.

                     Parameters
                            locator (str) -- The location of the job store to be represented by the instance

                     Returns
                            an instance of a concrete subclass of AbstractJobStore

                     Return type
                            toil.jobStores.abstractJobStore.AbstractJobStore

              static createBatchSystem(config)
                     Creates an instance of the batch system specified in the given config.

                     Parameters
                            config (toil.common.Config) -- the current configuration

                     Return type
                            batchSystems.abstractBatchSystem.AbstractBatchSystem

                     Returns
                            an instance of a concrete subclass of AbstractBatchSystem

              importFile(srcUrl, sharedFileName=None)
                     Imports the file at the given URL into job store.

                     See toil.jobStores.abstractJobStore.AbstractJobStore.importFile() for a full description

              exportFile(jobStoreFileID, dstUrl)
                     Exports file to destination pointed at by the destination URL.

                     See toil.jobStores.abstractJobStore.AbstractJobStore.exportFile() for a full description

              static getWorkflowDir(workflowID, configWorkDir=None)
                     Returns  a path to the directory where worker directories and the cache will be located for
                     this workflow.

                     Parameters
                            • workflowID (str) -- Unique identifier for the workflow

                            • configWorkDir (str) -- Value passed to the program using the --workDir flag

                     Returns
                            Path to the workflow directory

                     Return type
                            str

              writePIDFile()
                     Write the pid of this process to a file in the jobstore.

                     Overwriting the current contents of pid.log is a feature, not a bug of this method.   Other
                     methods  will  rely  on  always  having the most current pid available.  So far there is no
                     reason to store any old pids.

JOB STORE API

       The job store interface is an abstraction layer that hides the specific details of file storage, for
       example standard file systems, S3, etc. The AbstractJobStore API is implemented to support a given file
       store, e.g. S3. Implement this API to support a new file store.

       class toil.jobStores.abstractJobStore.AbstractJobStore
              Represents the physical storage for the jobs and files in a Toil workflow.

              __init__()
                     Create an instance of the job store. The instance will not be fully functional until either
                     initialize()  or  resume() is invoked. Note that the destroy() method may be invoked on the
                     object with or without prior invocation of either of these two methods.

              initialize(config)
                     Create the physical storage for this job store, allocate a  workflow  ID  and  persist  the
                     given Toil configuration to the store.

                     Parameters
                            config  (toil.common.Config)  -- the Toil configuration to initialize this job store
                            with. The given configuration will be updated with the newly allocated workflow ID.

                     Raises JobStoreExistsException -- if the physical storage for this job store already exists

              writeConfig()
                     Persists the value of the AbstractJobStore.config attribute to the job store,  so  that  it
                     can be retrieved later by other instances of this class.

              resume()
                     Connect this instance to the physical storage it represents and load the Toil configuration
                     into the AbstractJobStore.config attribute.

                     Raises NoSuchJobStoreException -- if the physical storage for this job store doesn't exist

              config The Toil configuration associated with this job store.

                     Return type
                            toil.common.Config

              setRootJob(rootJobStoreID)
                     Set the root job of the workflow backed by this job store

                     Parameters
                            rootJobStoreID (str) -- The ID of the job to set as root

              loadRootJob()
                     Loads the root job in the current job store.

                     Raises toil.job.JobException -- If no root job is set or if the root job doesn't  exist  in
                            this job store

                     Returns
                            The root job.

                     Return type
                            toil.jobGraph.JobGraph

              createRootJob(*args, **kwargs)
                     Create a new job and set it as the root job in this job store

                     Return type
                            toil.jobGraph.JobGraph

              getRootJobReturnValue()
                     Parse the return value from the root job.

                     Raises an exception if the root job hasn't fulfilled its promise yet.

              importFile(srcUrl, sharedFileName=None, hardlink=False)
                     Imports  the  file  at  the  given URL into job store. The ID of the newly imported file is
                     returned. If the name of a shared file name is provided, the file will be imported as  such
                     and None is returned.

                     Currently supported schemes are:

                        • 's3' for objects in Amazon S3
                                 e.g. s3://bucket/key

                        • 'file' for local files
                                 e.g. file:///local/file/path

                        • 'http'
                                 e.g. http://someurl.com/path

                        • 'gs'
                                 e.g. gs://bucket/file

                     Parameters
                            • srcUrl (str) -- URL that points to a file or object in the storage mechanism of a
                              supported URL scheme e.g. a blob in an AWS s3 bucket.

                            • sharedFileName (str) -- Optional name to assign to the imported  file  within  the
                              job store

                     Returns
                            The jobStoreFileId of the imported file or None if sharedFileName was given

                     Return type
                            toil.fileStores.FileID or None
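
                     Before calling importFile(), a caller may want to confirm that a URL uses one of the
                     schemes listed above. The sketch below does this with the standard library only; the
                     helper name check_url_scheme is hypothetical, and Toil's job stores perform their own
                     dispatch internally.

```python
from urllib.parse import urlparse

# Schemes listed in the importFile() description.
SUPPORTED_SCHEMES = ("s3", "file", "http", "gs")


def check_url_scheme(url):
    """Return the scheme of url, or raise ValueError if it is not in the list.

    Hypothetical helper for illustration only.
    """
    scheme = urlparse(url).scheme
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError("Unsupported URL scheme %r in %r" % (scheme, url))
    return scheme


for example in ("s3://bucket/key", "file:///local/file/path", "gs://bucket/file"):
    print(example, "->", check_url_scheme(example))
```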

              exportFile(jobStoreFileID, dstUrl)
                     Exports file to destination pointed at by the destination URL.

                     Refer to AbstractJobStore.importFile() documentation for currently supported URL schemes.

                     Note  that  the  helper  method  _exportFile  is  used to read from the source and write to
                     destination. To implement any optimizations that circumvent this,  the  _exportFile  method
                     should be overridden by subclasses of AbstractJobStore.

                     Parameters
                            • jobStoreFileID (str) -- The id of the file in the job store that should be
                              exported.

                            • dstUrl (str) -- URL that points to a file or object in the storage mechanism of  a
                              supported URL scheme e.g. a blob in an AWS s3 bucket.

              classmethod getSize(url)
                     Returns the size in bytes of the file at the given URL.

                     Parameters
                            url  (urlparse.ParseResult)  --  URL  that points to a file or object in the storage
                            mechanism of a supported URL scheme e.g. a blob in an AWS s3 bucket.

              destroy()
                     The inverse of initialize(), this method deletes the physical storage represented  by  this
                     instance.  While  not  being  atomic,  this  method  is  at least idempotent, as a means to
                     counteract potential issues with eventual consistency exhibited by the  underlying  storage
                     mechanisms.  This  means that if the method fails (raises an exception), it may (and should
                     be) invoked again. If the underlying storage mechanism is  eventually  consistent,  even  a
                     successful  invocation  is  not  an  ironclad  guarantee that the physical storage vanished
                     completely and immediately. A successful invocation only guarantees that the deletion  will
                     eventually  happen. It is therefore recommended to not immediately reuse the same job store
                     location for a new Toil workflow.
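
                     Because destroy() is idempotent and may be invoked again after a failure, a caller can
                     wrap it in a simple retry loop. The sketch below assumes only that the object has a
                     destroy() method; destroy_with_retries and FlakyStore are hypothetical names, and
                     FlakyStore merely simulates one transient failure.

```python
import time


def destroy_with_retries(job_store, attempts=3, delay=0.0):
    """Call job_store.destroy() until it succeeds or the attempts run out.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        try:
            job_store.destroy()
            return attempt
        except Exception:
            # Re-raise once the last permitted attempt has failed.
            if attempt == attempts:
                raise
            time.sleep(delay)


class FlakyStore(object):
    """Hypothetical stand-in for a job store whose first destroy() call fails."""

    def __init__(self):
        self.calls = 0
        self.destroyed = False

    def destroy(self):
        self.calls += 1
        if self.calls == 1:
            raise IOError("simulated transient storage error")
        self.destroyed = True


store = FlakyStore()
print("succeeded on attempt", destroy_with_retries(store))
```

                     Even after a successful call, the advice above still applies: with eventually consistent
                     storage, avoid immediately reusing the same job store location.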

              getEnv()
                     Returns a dictionary of environment variables that this job store requires  to  be  set  in
                     order to function properly on a worker.

                     Return type
                            dict[str,str]

              clean(jobCache=None)
                     Function to clean up the state of a job store after a restart. Fixes jobs that might have
                     been partially updated. Resets the try counts and removes jobs that are not  successors  of
                     the current root job.

                     Parameters
                            jobCache (dict[str,toil.jobGraph.JobGraph]) -- if a value it must be a dict from job
                            ID keys to JobGraph object values. Jobs will be loaded from the cache (which can  be
                            downloaded from the job store in a batch) instead of piecemeal when recursed into.

              batch()
                     All  calls  to create() with this context manager active will be performed in a batch after
                     the context manager is released.

                     Return type
                            None

              create(jobNode)
                     Creates a job graph from the given job node & writes it to the job store.

                     Return type
                            toil.jobGraph.JobGraph

              exists(jobStoreID)
                     Indicates whether the job with the specified jobStoreID exists in the job store

                     Return type
                            bool

              getPublicUrl(fileName)
                     Returns a publicly accessible URL to the given file in the job store. The returned URL may
                     expire as early as 1h after it has been returned. Throws an exception if the file does not
                     exist.

                     Parameters
                            fileName (str) -- the jobStoreFileID of the file to generate a URL for

                     Raises NoSuchFileException -- if the specified file does not exist in this job store

                     Return type
                            str

              getSharedPublicUrl(sharedFileName)
                     Differs from getPublicUrl() in that this method is for generating  URLs  for  shared  files
                     written by writeSharedFileStream().

                     Returns a publicly accessible URL to the given file in the job store. The returned URL
                     starts with 'http:', 'https:' or 'file:'. The returned URL may expire as early as 1h after
                     it has been returned. Throws an exception if the file does not exist.

                     Parameters
                            sharedFileName (str) -- The name of the shared file to generate a publicly
                            accessible url for.

                     Raises NoSuchFileException -- raised if the specified file does not exist in the store

                     Return type
                            str

              load(jobStoreID)
                     Loads the job referenced by the given ID and returns it.

                     Parameters
                            jobStoreID (str) -- the ID of the job to load

                     Raises NoSuchJobException -- if there is no job with the given ID

                     Return type
                            toil.jobGraph.JobGraph

              update(job)
                     Persists the job in this store atomically.

                     Parameters
                            job (toil.jobGraph.JobGraph) -- the job to write to this job store

              delete(jobStoreID)
                     Removes the job from the store atomically. After this, load(), write(), update(), etc.
                     can no longer be called with the job.

                     This operation is idempotent, i.e. deleting a job twice or deleting a non-existent job will
                     succeed silently.

                     Parameters
                            jobStoreID (str) -- the ID of the job to delete from this job store

              jobs() Best effort attempt to return iterator on all jobs in  the  store.  The  iterator  may  not
                     return  all jobs and may also contain orphaned jobs that have already finished successfully
                     and should not be rerun. To guarantee you get any and all jobs  that  can  be  run  instead
                     construct a more expensive ToilState object

                     Returns
                            Returns  iterator on jobs in the store. The iterator may or may not contain all jobs
                            and may contain invalid jobs

                     Return type
                            Iterator[toil.jobGraph.JobGraph]

              writeFile(localFilePath, jobStoreID=None, cleanup=False)
                     Takes a file (as a path) and places it in this job store. Returns an ID that can be used to
                     retrieve the file at a later time. The file is written in an atomic manner. It will not
                     appear in the jobStore until the write has successfully completed.

                     Parameters
                            • localFilePath (str) -- the path to the local file that will be uploaded to the job
                              store.

                            • jobStoreID (str) -- the id of a job, or None. If specified, the file may be
                              associated with that job in a job-store-specific way. This may influence the
                              returned ID.

                            • cleanup (bool) -- Whether to attempt to delete the file when the job whose
                              jobStoreID was given as jobStoreID is deleted with jobStore.delete(job). If
                              jobStoreID was not given, does nothing.

                     Raises
                            • ConcurrentFileModificationException -- if the file was modified concurrently
                              during an invocation of this method

                            • NoSuchJobException -- if the job specified via jobStoreID does not exist

                     FIXME: some implementations may not raise this

                     Returns
                            an ID that references the newly created file and can be used to read the file in the
                            future.

                     Return type
                            str

              writeFileStream(jobStoreID=None, cleanup=False)
                     Similar to writeFile, but returns a context manager yielding a tuple of 1) a file handle
                     which can be written to and 2) the ID of the resulting file in the job store. The yielded
                     file handle does not need to and should not be closed explicitly. The file is written in an
                     atomic manner. It will not appear in the jobStore until the write has successfully
                     completed.

                     Parameters
                            • jobStoreID (str) -- the id of a job, or None. If specified, the file may be
                              associated with that job in a job-store-specific way. This may influence the
                              returned ID.

                            • cleanup (bool) -- Whether to attempt to delete the file when the job whose
                              jobStoreID was given as jobStoreID is deleted with jobStore.delete(job). If
                              jobStoreID was not given, does nothing.

                     Raises
                            • ConcurrentFileModificationException -- if the file was modified concurrently
                              during an invocation of this method

                            • NoSuchJobException -- if the job specified via jobStoreID does not exist

                     FIXME: some implementations may not raise this

                     Returns
                            an ID that references the newly created file and can be used to read the file in the
                            future.

                     Return type
                            str
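
                     One way a directory-backed store could provide these semantics is sketched below: the
                     data goes to a temporary file that is renamed into place only when the with block exits
                     without error. This is an illustration of the contract, not Toil's implementation; the
                     function write_file_stream and its store_dir argument are hypothetical.

```python
import contextlib
import os
import tempfile
import uuid


@contextlib.contextmanager
def write_file_stream(store_dir):
    """Yield (file handle, file ID); the file appears in store_dir only on success.

    Hypothetical directory-backed sketch for illustration only.
    """
    file_id = uuid.uuid4().hex
    # Write to a temporary file in the same directory so the final rename
    # stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=store_dir, suffix=".tmp")
    handle = os.fdopen(fd, "wb")
    try:
        yield handle, file_id
    except BaseException:
        # The write failed: discard the partial data so nothing becomes visible.
        handle.close()
        os.unlink(tmp_path)
        raise
    else:
        # The context manager closes the handle itself; callers should not.
        handle.close()
        os.rename(tmp_path, os.path.join(store_dir, file_id))


store_dir = tempfile.mkdtemp()
with write_file_stream(store_dir) as (handle, file_id):
    handle.write(b"hello")
print(os.path.exists(os.path.join(store_dir, file_id)))
```

                     The caller only writes to the yielded handle and keeps the ID; closing the handle and
                     publishing the file are left to the context manager, as the description above requires.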

              getEmptyFileStoreID(jobStoreID=None, cleanup=False)
                     Creates an empty file in the job store and returns its ID. A call to
                     fileExists(getEmptyFileStoreID(jobStoreID)) will return True.

                     Parameters
                            • jobStoreID (str) -- the id of a job, or None. If specified, the file may be
                              associated with that job in a job-store-specific way. This may influence the
                              returned ID.

                            • cleanup (bool) -- Whether to attempt to delete the file when the job whose
                              jobStoreID was given as jobStoreID is deleted with jobStore.delete(job). If
                              jobStoreID was not given, does nothing.

                     Returns
                            a jobStoreFileID that references the newly created file and can be used to reference
                            the file in the future.

                     Return type
                            str

              readFile(jobStoreFileID, localFilePath, symlink=False)
                     Copies or hard links the file referenced by jobStoreFileID to the given  local  file  path.
                     The  version will be consistent with the last copy of the file written/updated. If the file
                     in  the  job  store  is  later  modified  via  updateFile  or   updateFileStream,   it   is
                     implementation-defined  whether those writes will be visible at localFilePath.  The file is
                     copied in an atomic manner.  It will not appear in the local file system until the copy has
                     completed.

                     The file at the given local path must not be modified after this method returns!

                     Parameters
                            • jobStoreFileID (str) -- ID of the file to be copied

                            • localFilePath  (str)  --  the local path indicating where to place the contents of
                              the given file in the job store

                            • symlink (bool) -- whether the reader can tolerate a symlink. If set to  true,  the
                              job store may create a symlink instead of a full copy of the file or a hard link.
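
                     The atomic-copy behaviour described for readFile can be pictured with a small
                     standard-library sketch. This is not Toil's implementation: the helper name atomic_copy
                     is hypothetical, and the sketch assumes the temporary file and the destination share a
                     file system so that the final rename is atomic.

```python
import os
import shutil
import tempfile


def atomic_copy(src_path, dst_path):
    """Copy src_path to dst_path so dst_path never appears half-written.

    Hypothetical helper for illustration; not part of Toil.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst_path))
    # Write into a temporary file in the destination directory first.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir)
    try:
        with os.fdopen(fd, "wb") as tmp, open(src_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
        # The rename makes the complete file visible in a single step.
        os.replace(tmp_path, dst_path)
    except BaseException:
        # Remove the partial temporary file if the copy did not finish.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

                     A reader that opens the destination path therefore sees either no file or the complete
                     file, never a partial one.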

              readFileStream(jobStoreFileID)
                     Similar to readFile, but returns a context manager yielding a file handle which can be read
                     from. The yielded file handle does not need to and should not be closed explicitly.

                     Parameters
                            jobStoreFileID (str) -- ID of the file to get a readable file handle for
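
                     The rule that the yielded handle should not be closed explicitly follows from how context
                     managers own their resources. The sketch below uses a hypothetical in-memory stand-in
                     (the names _FILES and read_file_stream are assumptions, not Toil code) to show the
                     pattern.

```python
import io
from contextlib import contextmanager

# Hypothetical in-memory stand-in for a job store; not Toil's implementation.
_FILES = {"file-1": b"promised contents"}


@contextmanager
def read_file_stream(job_store_file_id):
    """Yield a readable handle and close it when the with block exits."""
    handle = io.BytesIO(_FILES[job_store_file_id])
    try:
        yield handle
    finally:
        # The context manager owns the handle, so callers never close it.
        handle.close()


with read_file_stream("file-1") as readable:
    data = readable.read()
```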

              deleteFile(jobStoreFileID)
                     Deletes the file with the given ID from this job store. This operation is idempotent,  i.e.
                     deleting a file twice or deleting a non-existent file will succeed silently.

                     Parameters
                            jobStoreFileID (str) -- ID of the file to delete

              fileExists(jobStoreFileID)
                     Determine whether a file exists in this job store.

                     Parameters
                            jobStoreFileID (str) -- an ID referencing the file to be checked

                     Return type
                            bool

              getFileSize(jobStoreFileID)
                     Get the size of the given file in bytes, or 0 if it does not exist when queried.

                     Note  that  job  stores which encrypt files might return overestimates of file sizes, since
                     the encrypted  file  may  have  been  padded  to  the  nearest  block,  augmented  with  an
                     initialization vector, etc.

                     Parameters
                            jobStoreFileID (str) -- an ID referencing the file to be checked

                     Return type
                            int
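
                     The semantics of deleteFile, fileExists and getFileSize described above can be modelled
                     with a toy in-memory store. The class ToyFileStore and its write method are hypothetical
                     and exist only for illustration; real job stores persist files elsewhere and may report
                     larger sizes when encryption is enabled.

```python
class ToyFileStore:
    """Hypothetical in-memory model of the file semantics; not Toil code."""

    def __init__(self):
        self._files = {}

    def write(self, file_id, data):
        self._files[file_id] = data

    def deleteFile(self, file_id):
        # pop() with a default makes repeated deletes succeed silently.
        self._files.pop(file_id, None)

    def fileExists(self, file_id):
        return file_id in self._files

    def getFileSize(self, file_id):
        # A missing file reports size 0 rather than raising.
        return len(self._files.get(file_id, b""))


store = ToyFileStore()
store.write("f1", b"abc")
size_before = store.getFileSize("f1")
store.deleteFile("f1")
store.deleteFile("f1")  # the second delete is a silent no-op
size_after = store.getFileSize("f1")
```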

              updateFile(jobStoreFileID, localFilePath)
                     Replaces  the  existing version of a file in the job store. Throws an exception if the file
                     does not exist.

                     Parameters
                            • jobStoreFileID (str) -- the ID of the file in the job store to be updated

                            • localFilePath (str) -- the local path to a file that will  overwrite  the  current
                              version in the job store

                     Raises
                            • ConcurrentFileModificationException -- if the file was modified concurrently
                              during an invocation of this method

                            • NoSuchFileException -- if the specified file does not exist

              updateFileStream(jobStoreFileID)
                     Replaces the existing version of a file in the job store. Similar to writeFile, but returns
                     a  context  manager yielding a file handle which can be written to. The yielded file handle
                     does not need to and should not be closed explicitly.

                     Parameters
                            jobStoreFileID (str) -- the ID of the file in the job store to be updated

                     Raises
                            • ConcurrentFileModificationException -- if the file was modified concurrently
                              during an invocation of this method

                            • NoSuchFileException -- if the specified file does not exist

              writeSharedFileStream(sharedFileName, isProtected=None)
                     Returns  a context manager yielding a writable file handle to the global file referenced by
                     the given name.  File will be created in an atomic manner.

                     Parameters
                            • sharedFileName (str) -- A file name matching AbstractJobStore.fileNameRegex,
                              unique within this job store

                            • isProtected  (bool)  --  True  if  the  file  must be encrypted, None if it may be
                              encrypted or False if it must be stored in the clear.

                     Raises ConcurrentFileModificationException -- if the file was modified concurrently  during
                            an invocation of this method

              readSharedFileStream(sharedFileName)
                     Returns  a context manager yielding a readable file handle to the global file referenced by
                     the given name.

                     Parameters
                            sharedFileName (str) -- A file name matching AbstractJobStore.fileNameRegex,  unique
                            within this job store

              writeStatsAndLogging(statsAndLoggingString)
                     Adds the given statistics/logging string to the store of statistics info.

                     Parameters
                            statsAndLoggingString (str) -- the string to be written to the stats file

                     Raises ConcurrentFileModificationException  -- if the file was modified concurrently during
                            an invocation of this method

              readStatsAndLogging(callback, readAll=False)
                     Reads stats/logging strings accumulated by  the  writeStatsAndLogging()  method.  For  each
                     stats/logging  string  this method calls the given callback function with an open, readable
                     file handle from which the stats string can be read. Returns the  number  of  stats/logging
                     strings  processed.  Each  stats/logging  string  is only processed once unless the readAll
                     parameter is set, in which case the  given  callback  will  be  invoked  for  all  existing
                     stats/logging strings, including the ones from a previous invocation of this method.

                     Parameters
                            • callback (Callable) -- a function to be applied to each of the stats file handles
                              found

                            • readAll (bool) -- a boolean indicating whether to read the already processed stats
                              files in addition to the unread stats files

                     Raises ConcurrentFileModificationException  -- if the file was modified concurrently during
                            an invocation of this method

                     Returns
                            the number of stats files processed

                     Return type
                            int
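
                     The process-once behaviour and the effect of readAll can be illustrated with a toy
                     in-memory model. ToyStatsStore is a hypothetical class written for this example; it only
                     mirrors the description above and says nothing about how a real job store keeps its
                     stats files.

```python
import io


class ToyStatsStore:
    """Hypothetical in-memory model; not Toil's implementation."""

    def __init__(self):
        self._unread = []
        self._read = []

    def writeStatsAndLogging(self, text):
        self._unread.append(text)

    def readStatsAndLogging(self, callback, readAll=False):
        # With readAll, already processed strings are handed out again.
        pending = (self._read if readAll else []) + self._unread
        for text in pending:
            # The callback receives an open, readable file-like handle.
            callback(io.StringIO(text))
        # Everything that was unread is now marked as processed.
        self._read.extend(self._unread)
        self._unread = []
        return len(pending)


seen = []
collect = lambda handle: seen.append(handle.read())
store = ToyStatsStore()
store.writeStatsAndLogging("job A took 3s")
store.writeStatsAndLogging("job B took 5s")
first = store.readStatsAndLogging(collect)
second = store.readStatsAndLogging(collect)
third = store.readStatsAndLogging(collect, readAll=True)
```

                     Here the first call processes both strings, the second finds nothing new, and the third
                     revisits the two strings that were already processed.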

TOIL JOB API

       Functions to wrap jobs and return values (promises).

   FunctionWrappingJob
       The subclass of Job for wrapping user functions.

       class toil.job.FunctionWrappingJob(userFunction, *args, **kwargs)
              Job used to wrap a function. In its run method the wrapped function is called.

              __init__(userFunction, *args, **kwargs)

                     Parameters
                            userFunction (callable) -- The function to wrap. It will be called  with  *args  and
                            **kwargs as arguments.

                     The keywords memory, cores, disk, preemptable and checkpoint are reserved keyword arguments
                     that, if specified, are used to determine the resources required for the job, as in
                     toil.job.Job.__init__(). If they are keyword arguments of the wrapped function, their
                     default values are extracted from the function definition, but may be overridden by the
                     caller.
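
                     One way to picture how reserved keywords might be taken from a function definition and
                     then overridden is the sketch below. It is an assumption-laden illustration using
                     inspect.signature; the helper split_requirements is hypothetical and is not how Toil
                     itself is implemented.

```python
import inspect

RESERVED = ("memory", "cores", "disk", "preemptable", "checkpoint")


def split_requirements(user_function, **kwargs):
    """Separate reserved resource keywords from ordinary keyword arguments.

    Hypothetical helper illustrating the rule; not Toil's implementation.
    """
    requirements = {}
    # Start from defaults declared in the function definition.
    for name, param in inspect.signature(user_function).parameters.items():
        if name in RESERVED and param.default is not inspect.Parameter.empty:
            requirements[name] = param.default
    # Values supplied by the caller override the declared defaults.
    for name in RESERVED:
        if name in kwargs:
            requirements[name] = kwargs.pop(name)
    return requirements, kwargs


def helloWorld(message, memory="1G", cores=1, disk="1G"):
    return "Hello, world!, here's a message: %s" % message


requirements, remaining = split_requirements(helloWorld, message="hi", memory="2G")
```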

              run(fileStore)
                     Override this function to perform work and dynamically create successor jobs.

                     Parameters
                            fileStore  (toil.fileStores.abstractFileStore.AbstractFileStore)  --  Used to create
                            local and globally sharable temporary files and to send log messages to  the  leader
                            process.

                     Returns
                            The  return  value  of  the  function  can  be  passed  to  other  jobs  by means of
                            toil.job.Job.rv().

   JobFunctionWrappingJob
       The subclass of FunctionWrappingJob for wrapping user job functions.

       class toil.job.JobFunctionWrappingJob(userFunction, *args, **kwargs)
              A job function is a function whose first argument is a Job instance that is the wrapping  job  for
              the  function.  This  can  be  used  to  add  successor  jobs for the function and perform all the
              functions the Job class provides.

              To enable the job function to get access to the
              toil.fileStores.abstractFileStore.AbstractFileStore instance (see toil.job.Job.run()), it is made
              a variable of the wrapping job called fileStore.

              To specify a job's resource requirements the following default keyword arguments can be specified:

                 • memory

                 • disk

                 • cores

              For example, to wrap a function into a job we would call:

                 Job.wrapJobFn(myJob, memory='100k', disk='1M', cores=0.1)

              run(fileStore)
                     Override this function to perform work and dynamically create successor jobs.

                     Parameters
                            fileStore (toil.fileStores.abstractFileStore.AbstractFileStore) --  Used  to  create
                            local  and  globally sharable temporary files and to send log messages to the leader
                            process.

                     Returns
                            The return value  of  the  function  can  be  passed  to  other  jobs  by  means  of
                            toil.job.Job.rv().

   EncapsulatedJob
       The subclass of Job for encapsulating a job, allowing a subgraph of jobs to be treated as a single job.

       class toil.job.EncapsulatedJob(job)
              A convenience Job class used to make a job subgraph appear to be a single job.

              Let A be the root job of a job subgraph and B be another job we'd like to run after A and all its
              successors have completed. For this, use encapsulate:

                 # Job A and its subgraph, and job B
                 A, B = A(), B()
                 encapsulatedA = A.encapsulate()
                 encapsulatedA.addChild(B)
                 # B will run after A and all its successors have completed; A and its subgraph
                 # of successors in effect appear to be just one job.

              If the job being encapsulated has predecessors (e.g. is not the root job), then  the  encapsulated
              job will inherit these predecessors. If predecessors are added to the job being encapsulated after
              the encapsulated job is created then the encapsulating job will  NOT  inherit  these  predecessors
              automatically.  Care  should  be  exercised  to  ensure the encapsulated job has the proper set of
              predecessors.

              The return value of an encapsulated job (as accessed by the toil.job.Job.rv() function) is the
              return value of the root job, e.g. A().encapsulate().rv() and A().rv() will resolve to the same
              value after A or A.encapsulate() has been run.

              __init__(job)

                     Parameters
                            job (toil.job.Job) -- the job to encapsulate.

              addChild(childJob)
                     Adds childJob to be run as child of this job. Child jobs will be run directly after
                     this job's toil.job.Job.run() method has completed.

                     Parameters
                            childJob (toil.job.Job) --

                     Returns
                            childJob

                     Return type
                            toil.job.Job

              addService(service, parentService=None)
                     Add a service.

                     The  toil.job.Job.Service.start() method of the service will be called after the run method
                     has completed but before any successors are run.  The service's toil.job.Job.Service.stop()
                     method will be called once the successors of the job have been run.

                     Services  allow  things  like databases and servers to be started and accessed by jobs in a
                     workflow.

                     Raises toil.job.JobException -- If service has already been made the  child  of  a  job  or
                            another service.

                     Parameters
                            • service (toil.job.Job.Service) -- Service to add.

                            • parentService  (toil.job.Job.Service)  --  Service  that  will  be  started before
                              'service' is started. Allows trees of services to  be  established.  parentService
                              must be a service of this job.

                     Returns
                            a    promise    that    will    be    replaced    with   the   return   value   from
                            toil.job.Job.Service.start() of service in any successor of the job.

                     Return type
                            toil.job.Promise

              addFollowOn(followOnJob)
                     Adds a follow-on job. Follow-on jobs will be run after the child jobs and their
                     successors have been run.

                     Parameters
                            followOnJob (toil.job.Job) --

                     Returns
                            followOnJob

                     Return type
                            toil.job.Job

              rv(*path)
                     Creates  a  promise (toil.job.Promise) representing a return value of the job's run method,
                     or, in case of a function-wrapping job, the wrapped function's return value.

                     Parameters
                            path ((Any)) -- Optional path for selecting  a  component  of  the  promised  return
                            value.   If  absent  or  empty, the entire return value will be used. Otherwise, the
                            first element of the path is used to select an individual item of the return  value.
                            For  that  to work, the return value must be a list, dictionary or of any other type
                            implementing the __getitem__() magic method. If the selected  item  is  yet  another
                            composite  value,  the second element of the path can be used to select an item from
                            it, and so on. For example, if the return value is [6,{'a':42}], .rv(0) would select
                            6, .rv(1) would select {'a':42} while .rv(1,'a') would select 42. To select a slice
                            from a return value that is sliceable, e.g. tuple or list, the path element should be
                            a  slice  object.  For  example, assuming that the return value is [6, 7, 8, 9] then
                            .rv(slice(1, 3)) would select [7, 8]. Note that slicing really only makes  sense  at
                            the end of path.

                     Returns
                            A promise representing the return value of this job's toil.job.Job.run() method.

                     Return type
                            toil.job.Promise
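
                     The path selection described for rv amounts to repeated item access. The sketch below
                     applies a path to an already computed return value; resolve_path is a hypothetical
                     helper written for this illustration and is not part of Toil, where the selection
                     happens only once the promised value exists.

```python
def resolve_path(return_value, path):
    """Follow path through nested containers using item access.

    Hypothetical helper mirroring the description of rv(*path); not Toil code.
    """
    selected = return_value
    for element in path:
        # Works for any type implementing __getitem__, including slices.
        selected = selected[element]
    return selected


value = [6, {"a": 42}]
whole = resolve_path(value, ())                       # empty path: entire value
first = resolve_path(value, (0,))                     # like .rv(0)
nested = resolve_path(value, (1, "a"))                # like .rv(1, 'a')
sliced = resolve_path([6, 7, 8, 9], (slice(1, 3),))   # like .rv(slice(1, 3))
```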

              prepareForPromiseRegistration(jobStore)
                     Ensure  that  a  promise  by  this job (the promissor) can register with the promissor when
                     another job referring to the promise (the promissee) is  being  serialized.  The  promissee
                     holds the reference to the promise (usually as part of the job arguments) and when it
                     is being pickled, so will the promises it refers to. Pickling a promise triggers it  to  be
                     registered with the promissor.

                     Returns

   Promise
       The class used to reference return values of jobs/services not yet run/started.

       class toil.job.Promise(job, path)
              References  a  return  value from a toil.job.Job.run() or toil.job.Job.Service.start() method as a
              promise before the method itself is run.

              Let T be a job. Instances of Promise (termed a promise) are returned by T.rv(), which is  used  to
              reference  the return value of T's run function. When the promise is passed to the constructor (or
              as an argument to a wrapped function) of a different, successor job the promise will  be  replaced
              by the actual referenced return value. This mechanism allows a return value from one job's run
              method to be an input argument to another job before the former job's run function has been
              executed.

              filesToDelete = {}
                     A set of IDs of files containing promised values when we know we won't need them anymore

              __init__(job, path)

                     Parameters
                            • job (Job) -- the job whose return value this promise references

                            • path -- see Job.rv()

       class toil.job.PromisedRequirement(valueOrCallable, *args)

              __init__(valueOrCallable, *args)
                     Class  for  dynamically   allocating   job   function   resource   requirements   involving
                     toil.job.Promise instances.

                     Use  when  resource  requirements  depend  on  the  return  value  of  a  parent  function.
                     PromisedRequirements can be modified by passing a function that takes the Promise as input.

                     For example, let f, g, and h be functions. Then a Toil workflow can be defined as follows:

                        A = Job.wrapFn(f)
                        B = A.addChildFn(g, cores=PromisedRequirement(A.rv()))
                        C = B.addChildFn(h, cores=PromisedRequirement(lambda x: 2*x, B.rv()))

                     Parameters
                            • valueOrCallable -- A single Promise instance or a function that takes *args as
                              input parameters.

                            • *args (int or Promise) -- variable length argument list

              getValue()
                     Returns PromisedRequirement value

              static convertPromises(kwargs)
                     Returns  True  if  reserved  resource keyword is a Promise or PromisedRequirement instance.
                     Converts Promise instance to PromisedRequirement.

                     Parameters
                            kwargs -- function keyword arguments

                     Returns
                            bool
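
              The idea of a requirement whose value is only known after a parent job has run can be modelled
              without Toil. The classes ToyPromise and ToyRequirement below are hypothetical stand-ins
              written for this sketch; they imitate the valueOrCallable and getValue behaviour described
              above but are not the library's classes.

```python
class ToyPromise:
    """Hypothetical stand-in for a promise whose value is filled in later."""

    def __init__(self):
        self.value = None


class ToyRequirement:
    """Hypothetical model of a requirement computed from promised values."""

    def __init__(self, value_or_callable, *args):
        if callable(value_or_callable):
            self._func, self._args = value_or_callable, args
        else:
            # A bare value is treated as the identity function applied to it.
            self._func, self._args = (lambda x: x), (value_or_callable,)

    def getValue(self):
        # Replace promises with their resolved values just before use.
        resolved = [a.value if isinstance(a, ToyPromise) else a for a in self._args]
        return self._func(*resolved)


promise = ToyPromise()
plain = ToyRequirement(promise)
doubled = ToyRequirement(lambda x: 2 * x, promise)
promise.value = 4  # the parent job has now "run" and produced its value
```

              Because getValue is evaluated late, both requirements pick up the value that the parent
              produced after they were constructed.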

JOB METHODS API

       Jobs are the units of work in Toil which are composed into workflows.

       class toil.job.Job(memory=None, cores=None, disk=None, preemptable=None, unitName=None, checkpoint=False,
       displayName=None)
              Class representing a unit of work in Toil.

              __init__(memory=None,  cores=None,  disk=None,  preemptable=None, unitName=None, checkpoint=False,
              displayName=None)
                     This method must be called by any overriding constructor.

                     Parameters
                            • memory (int or string convertible by toil.lib.humanize.human2bytes to an int) --
                              the maximum number of bytes of memory the job will require to run.

                            • cores  (int  or  string convertible by toil.lib.humanize.human2bytes to an int) --
                              the number of CPU cores required.

                            • disk (int or string convertible by toil.lib.humanize.human2bytes to an int) -- the
                              amount of local disk space required by the job, expressed in bytes.

                            • preemptable (bool) -- if the job can be run on a preemptable node.

                            • checkpoint -- if any of this job's successor jobs completely fails, exhausting all
                              their retries, remove any successor  jobs  and  rerun  this  job  to  restart  the
                              subtree.  Job  must  be a leaf vertex in the job graph when initially defined, see
                              toil.job.Job.checkNewCheckpointsAreCutVertices().

              run(fileStore)
                     Override this function to perform work and dynamically create successor jobs.

                     Parameters
                            fileStore (toil.fileStores.abstractFileStore.AbstractFileStore) --  Used  to  create
                            local  and  globally sharable temporary files and to send log messages to the leader
                            process.

                     Returns
                            The return value  of  the  function  can  be  passed  to  other  jobs  by  means  of
                            toil.job.Job.rv().

              addChild(childJob)
                     Adds childJob to be run as child of this job. Child jobs will be run directly after
                     this job's toil.job.Job.run() method has completed.

                     Parameters
                            childJob (toil.job.Job) --

                     Returns
                            childJob

                     Return type
                            toil.job.Job

              hasChild(childJob)
                     Check if childJob is already a child of this job.

                     Parameters
                            childJob (toil.job.Job) --

                     Returns
                            True if childJob is a child of the job, else False.

                     Return type
                            bool

              addFollowOn(followOnJob)
                     Adds a follow-on job. Follow-on jobs will be run after the child jobs and their
                     successors have been run.

                     Parameters
                            followOnJob (toil.job.Job) --

                     Returns
                            followOnJob

                     Return type
                            toil.job.Job
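
                     The ordering rules for children and follow-ons can be sketched with a toy job class.
                     ToyJob and run_order are hypothetical names, and the sketch runs everything
                     sequentially and depth-first, which is a simplification: it only shows that a job's
                     follow-ons come after its children and their successors.

```python
class ToyJob:
    """Hypothetical stand-in for a job; not toil.job.Job."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.follow_ons = []

    def addChild(self, job):
        self.children.append(job)
        return job

    def addFollowOn(self, job):
        self.follow_ons.append(job)
        return job


def run_order(job, order=None):
    """Return job names in one valid sequential execution order."""
    order = [] if order is None else order
    order.append(job.name)          # the job itself runs first
    for child in job.children:      # then children and their successors
        run_order(child, order)
    for follow_on in job.follow_ons:  # follow-ons run last
        run_order(follow_on, order)
    return order


root = ToyJob("A")
b = root.addChild(ToyJob("B"))
b.addChild(ToyJob("C"))
root.addFollowOn(ToyJob("D"))
order = run_order(root)
```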

              hasFollowOn(followOnJob)
                     Check if given job is already a follow-on of this job.

                     Parameters
                            followOnJob (toil.job.Job) --

                     Returns
                            True if the followOnJob is a follow-on of this job, else False.

                     Return type
                            bool

              addService(service, parentService=None)
                     Add a service.

                     The  toil.job.Job.Service.start() method of the service will be called after the run method
                     has completed but before any successors are run.  The service's toil.job.Job.Service.stop()
                     method will be called once the successors of the job have been run.

                     Services  allow  things  like databases and servers to be started and accessed by jobs in a
                     workflow.

                     Raises toil.job.JobException -- If service has already been made the  child  of  a  job  or
                            another service.

                     Parameters
                            • service (toil.job.Job.Service) -- Service to add.

                            • parentService  (toil.job.Job.Service)  --  Service  that  will  be  started before
                              'service' is started. Allows trees of services to  be  established.  parentService
                              must be a service of this job.

                     Returns
                            a    promise    that    will    be    replaced    with   the   return   value   from
                            toil.job.Job.Service.start() of service in any successor of the job.

                     Return type
                            toil.job.Promise

              addChildFn(fn, *args, **kwargs)
                     Adds a function as a child job.

                     Parameters
                            fn -- Function to be run as a child job with *args and **kwargs as arguments
                            to this function. See toil.job.FunctionWrappingJob for reserved keyword
                            arguments used to specify resource requirements.

                     Returns
                            The new child job that wraps fn.

                     Return type
                            toil.job.FunctionWrappingJob

              addFollowOnFn(fn, *args, **kwargs)
                     Adds a function as a follow-on job.

                     Parameters
                            fn -- Function to be run as a follow-on job with *args and **kwargs as
                            arguments to this function. See toil.job.FunctionWrappingJob for reserved
                            keyword arguments used to specify resource requirements.

                     Returns
                            The new follow-on job that wraps fn.

                     Return type
                            toil.job.FunctionWrappingJob

              addChildJobFn(fn, *args, **kwargs)
                     Adds a job function as a child job. See toil.job.JobFunctionWrappingJob for a definition of
                     a job function.

                     Parameters
                            fn -- Job function to be run as a child job with *args and **kwargs as
                            arguments to this function. See toil.job.JobFunctionWrappingJob for reserved
                            keyword arguments used to specify resource requirements.

                     Returns
                            The new child job that wraps fn.

                     Return type
                            toil.job.JobFunctionWrappingJob

              addFollowOnJobFn(fn, *args, **kwargs)
                     Add a follow-on job function. See toil.job.JobFunctionWrappingJob for a definition of a job
                     function.

                     Parameters
                            fn -- Job function to be run as a follow-on job with *args and **kwargs as
                            arguments to this function. See toil.job.JobFunctionWrappingJob for reserved
                            keyword arguments used to specify resource requirements.

                     Returns
                            The new follow-on job that wraps fn.

                     Return type
                            toil.job.JobFunctionWrappingJob

              tempDir
                     Shortcut to calling job.fileStore.getLocalTempDir(). The temporary directory is created on
                     the first call, and the same path is returned by that call and all later calls. See
                     job.fileStore.getLocalTempDir.

                     Returns
                            Path to tempDir.

                     Return type
                            str

              log(text, level=20)
                     Convenience wrapper for fileStore.logToMaster().

              static wrapFn(fn, *args, **kwargs)
                     Makes a Job out of a function. Convenience function for constructor of
                     toil.job.FunctionWrappingJob.

                     Parameters
                            fn -- Function to be run with *args and **kwargs as arguments. See
                            toil.job.JobFunctionWrappingJob for reserved keyword arguments used to
                            specify resource requirements.

                     Returns
                            The new function that wraps fn.

                     Return type
                            toil.job.FunctionWrappingJob

              static wrapJobFn(fn, *args, **kwargs)
                     Makes a Job out of a job function. Convenience function for constructor of
                     toil.job.JobFunctionWrappingJob.

                     Parameters
                            fn -- Job function to be run with *args and **kwargs as arguments. See
                            toil.job.JobFunctionWrappingJob for reserved keyword arguments used to
                            specify resource requirements.

                     Returns
                            The new job function that wraps fn.

                     Return type
                            toil.job.JobFunctionWrappingJob

              encapsulate()
                     Encapsulates the job, see toil.job.EncapsulatedJob.  Convenience function  for  constructor
                     of toil.job.EncapsulatedJob.

                     Returns
                            an encapsulated version of this job.

                     Return type
                            toil.job.EncapsulatedJob

              rv(*path)
                     Creates  a  promise (toil.job.Promise) representing a return value of the job's run method,
                     or, in case of a function-wrapping job, the wrapped function's return value.

                     Parameters
                            path ((Any)) -- Optional path for selecting  a  component  of  the  promised  return
                            value.   If  absent  or  empty, the entire return value will be used. Otherwise, the
                            first element of the path is used to select an individual item of the return  value.
                            For  that  to work, the return value must be a list, dictionary or of any other type
                            implementing the __getitem__() magic method. If the selected  item  is  yet  another
                            composite  value,  the second element of the path can be used to select an item from
                            it, and so on. For example, if the return value is [6,{'a':42}], .rv(0) would select
                            6, .rv(1) would select {'a':42} while .rv(1,'a') would select 42. To select a slice
                            from a return value that is sliceable, e.g. tuple or list, the path element should be
                            a  slice  object.  For  example, assuming that the return value is [6, 7, 8, 9] then
                            .rv(slice(1, 3)) would select [7, 8]. Note that slicing really only makes  sense  at
                            the end of path.

                     Returns
                            A promise representing the return value of this job's toil.job.Job.run() method.

                     Return type
                            toil.job.Promise

              prepareForPromiseRegistration(jobStore)
                     Ensure  that  a  promise  by  this job (the promissor) can register with the promissor when
                     another job referring to the promise (the promissee) is  being  serialized.  The  promissee
                     holds the reference to the promise (usually as part of the job arguments) and when it
                     is being pickled, so will the promises it refers to. Pickling a promise triggers it  to  be
                     registered with the promissor.

              checkJobGraphForDeadlocks()
                     See    toil.job.Job.checkJobGraphConnected(),    toil.job.Job.checkJobGraphAcyclic()    and
                     toil.job.Job.checkNewCheckpointsAreLeafVertices() for more info.

                     Raises toil.job.JobGraphDeadlockException -- if the job graph is cyclic, contains  multiple
                            roots  or  contains  checkpoint  jobs  that  are not leaf vertices when defined (see
                            toil.job.Job.checkNewCheckpointsAreLeafVertices()).

              getRootJobs()

                     Returns
                            The roots of the connected component of jobs that contains this job. A root is a
                            job with no predecessors.

                     Return type
                            set of toil.job.Job instances

              checkJobGraphConnected()

                     Raises toil.job.JobGraphDeadlockException -- if toil.job.Job.getRootJobs() does not
                            contain exactly one root job.

                     As execution always starts from one root job, having multiple root jobs will cause
                     a deadlock to occur.
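
                     The single-root rule can be sketched on a plain dictionary that maps each job name to
                     its successors. The functions root_jobs and check_single_root are hypothetical names
                     used for illustration only; the sketch assumes every job appears as a key and raises
                     ValueError where Toil would raise JobGraphDeadlockException.

```python
def root_jobs(successors):
    """Return the jobs that never appear as another job's successor."""
    has_predecessor = set()
    for targets in successors.values():
        has_predecessor.update(targets)
    return set(successors) - has_predecessor


def check_single_root(successors):
    """Return the unique root job, or raise if there is not exactly one."""
    roots = root_jobs(successors)
    if len(roots) != 1:
        raise ValueError("expected exactly one root job, found %d" % len(roots))
    return roots.pop()


graph = {'A': ['B', 'C'], 'B': [], 'C': []}
print(check_single_root(graph))
```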

              checkJobGraphAcylic()

                     Raises toil.job.JobGraphDeadlockException -- if the connected component of jobs
                            containing this job contains any cycles of child/followOn dependencies in
                            the augmented job graph (see below). Such cycles are not allowed in valid
                            job graphs.

                     A follow-on edge (A, B) between two jobs A and B is equivalent to adding a child
                     edge to B from (1) A, (2) each child of A, and (3) the successors of each
                     child of A. We call each such edge an "implied" edge. The augmented job
                     graph is a job graph including all the implied edges.

                     For a job graph G = (V, E) the algorithm is O(|V|^2). It is O(|V| + |E|) for a
                     graph with no follow-ons. The follow-on case could be improved.
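
                     The augmented graph can be sketched with two dictionaries, one for child edges and one
                     for follow-on edges. All names below are hypothetical, and the sketch assumes that
                     "successors of each child" means every job reachable from that child through child or
                     follow-on edges. It illustrates the idea only and makes no attempt to match Toil's
                     implementation or its complexity bounds.

```python
def reachable_from(start, successors):
    """Return every job reachable from start by following successor edges."""
    seen, stack = set(), list(successors.get(start, ()))
    while stack:
        job = stack.pop()
        if job not in seen:
            seen.add(job)
            stack.extend(successors.get(job, ()))
    return seen


def augmented_graph(children, follow_ons):
    """Combine child and follow-on edges and add the implied edges."""
    jobs = set(children) | set(follow_ons)
    for targets in list(children.values()) + list(follow_ons.values()):
        jobs.update(targets)
    successors = {j: set(children.get(j, ())) | set(follow_ons.get(j, ()))
                  for j in jobs}
    augmented = {j: set(successors[j]) for j in jobs}
    for a in jobs:
        for b in follow_ons.get(a, ()):
            for child in children.get(a, ()):
                augmented[child].add(b)            # case (2): each child of A
                for job in reachable_from(child, successors):
                    augmented[job].add(b)          # case (3): successors of the child
    return augmented


def has_cycle(graph):
    """Depth-first search with three colours; a grey target means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {job: WHITE for job in graph}

    def visit(job):
        colour[job] = GREY
        for target in graph[job]:
            if colour[target] == GREY:
                return True
            if colour[target] == WHITE and visit(target):
                return True
        colour[job] = BLACK
        return False

    return any(colour[job] == WHITE and visit(job) for job in graph)


# C is a child of both A and B, and B is a follow-on of A. B must wait for C
# (implied edge C -> B), yet C must also wait for B, so the graph is invalid.
print(has_cycle(augmented_graph({'A': ['C'], 'B': ['C']}, {'A': ['B']})))
```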

              checkNewCheckpointsAreLeafVertices()
                     A checkpoint job is a job that is restarted if either it fails, or if any of its
                     successors fails completely, exhausting its retries.

                     A job is a leaf if it has no successors.

                     A checkpoint job must be a leaf when initially added to the job graph. When its run
                     method is invoked it can then create direct successors. This restriction is made to
                     simplify implementation.

                     Raises toil.job.JobGraphDeadlockException -- if there exists a job being added to the graph
                            for which checkpoint=True and which is not a leaf.

              defer(function, *args, **kwargs)
                     Register a deferred function, i.e. a callable  that  will  be  invoked  after  the  current
                     attempt  at  running  this  job  concludes.  A job attempt is said to conclude when the job
                     function (or the  toil.job.Job.run()  method  for  class-based  jobs)  returns,  raises  an
                     exception  or  after the process running it terminates abnormally. A deferred function will
                     be called on the node that attempted to run the job, even if a subsequent attempt  is  made
                     on another node. A deferred function should be idempotent because it may be called multiple
                     times on the same node or even in the same process. More than one deferred function may  be
                     registered  per  job attempt by calling this method repeatedly with different arguments. If
                     the same function is registered twice with the same or  different  arguments,  it  will  be
                     called twice per job attempt.

                     Examples for deferred functions are ones that handle cleanup of resources external to Toil,
                     like Docker containers, files outside the work directory, etc.

                     Parameters
                            • function (callable) -- The function to be called after this job concludes.

                            • args (list) -- The arguments to the function

                            • kwargs (dict) -- The keyword arguments to the function
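
                     A minimal sketch of registering an idempotent cleanup function follows. The function
                     names remove_if_present and make_scratch_file are hypothetical; the sketch assumes
                     make_scratch_file would be run as a job function (for example via Job.wrapJobFn), so
                     that its first argument is the job and exposes defer().

```python
import errno
import os
import tempfile


def remove_if_present(path):
    """Idempotent cleanup: deleting an already-deleted file is not an error."""
    try:
        os.remove(path)
    except OSError as error:
        if error.errno != errno.ENOENT:
            raise


def make_scratch_file(job, prefix):
    """Job function that creates a file outside Toil's work directory."""
    handle, path = tempfile.mkstemp(prefix=prefix)
    os.close(handle)
    # Ask Toil to remove the file once this attempt at the job concludes,
    # whether it returns, raises, or the process dies.
    job.defer(remove_if_present, path)
    return path
```

                     Because the deferred function may run more than once, the cleanup ignores a file that
                     is already gone rather than failing.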

              getTopologicalOrderingOfJobs()

                     Returns
                            a list of jobs such that for all pairs of indices i, j for which i < j, the
                            job at index i can be run before the job at index j.

                     Return type
                            list
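
                     The ordering property can be sketched with Kahn's algorithm on a dictionary mapping
                     each job name to its successors. The function topological_order is a hypothetical
                     illustration and is not claimed to be the algorithm Toil uses.

```python
from collections import deque


def topological_order(successors):
    """Return the jobs so that each one precedes all of its successors."""
    indegree = {job: 0 for job in successors}
    for targets in successors.values():
        for target in targets:
            indegree[target] = indegree.get(target, 0) + 1
    # Jobs with no predecessors can run first.
    ready = deque(sorted(job for job, degree in indegree.items() if degree == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for target in successors.get(job, ()):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle")
    return order


print(topological_order({'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}))
```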

JOB.RUNNER API

       The Runner contains the methods needed to configure and start a Toil run.

       class Job.Runner
              Used to set up and run a Toil workflow.

              static getDefaultArgumentParser()
                     Get argument parser with added toil workflow options.

                     Returns
                            The argument parser used by a toil workflow with added Toil options.

                     Return type
                            argparse.ArgumentParser

              static getDefaultOptions(jobStore)
                     Get default options for a toil workflow.

                     Parameters
                            jobStore (string) -- A string describing the jobStore for the workflow.

                     Returns
                            The options used by a toil workflow.

                     Return type
                            argparse.ArgumentParser values object

              static addToilOptions(parser)
                     Adds the default toil options to an optparse or argparse parser object.

                     Parameters
                            parser  (optparse.OptionParser  or argparse.ArgumentParser) -- Options object to add
                            toil options to.

              static startToil(job, options)
                     Deprecated by toil.common.Toil.start. Runs the toil workflow using the given options (see
                     Job.Runner.getDefaultOptions and Job.Runner.addToilOptions) starting with this job.

                     Parameters
                            job (toil.job.Job) -- root job of the workflow

                     Raises toil.leader.FailedJobsException -- if at the end of the function there remain
                            failed jobs.

                     Returns
                            The return value of the root job's run function.

                     Return type
                            Any

JOB.FILESTORE API

       The AbstractFileStore is an abstraction of a Toil run's shared storage.

       class     toil.fileStores.abstractFileStore.AbstractFileStore(jobStore,      jobGraph,      localTempDir,
       waitForPreviousCommit)
              Interface used to allow user code run by Toil to read and write files.

              Also provides the interface to other Toil facilities used by user code, including:

                 • normal (non-real-time) logging

                 • finding the correct temporary directory for scratch work

                 • importing and exporting files into and out of the workflow

              Stores user files in the jobStore, but keeps them separate from actual jobs.

              May implement caching.

              Passed as argument to the toil.job.Job.run() method.

              Access    to    files    is    only   permitted   inside   the   context   manager   provided   by
              toil.fileStores.abstractFileStore.AbstractFileStore.open().

              Also responsible for committing completed jobs back to the job store with an update operation, and
              allowing that commit operation to be waited for.

              __init__(jobStore, jobGraph, localTempDir, waitForPreviousCommit)
                     Create a new file store object.

                     Parameters
                            • jobStore (toil.jobStores.abstractJobStore.AbstractJobStore) -- the job store in
                              use for the current Toil run.

                            • jobGraph (toil.jobGraph.JobGraph) --  the  job  graph  object  for  the  currently
                              running job.

                            • localTempDir  (str)  --  the  per-worker  local  temporary  directory, under which
                              per-job directories will be created.

                            • waitForPreviousCommit -- the waitForCommit  method  of  the  previous  job's  file
                              store,  when jobs are running in sequence on the same worker. Used to prevent this
                              file store's startCommit and the previous job's startCommit methods  from  running
                              at  the same time and racing. If they did race, it might be possible for the later
                              job to be fully marked as completed in the job store before the earlier job was.

              static shutdownFileStore(workflowDir, workflowID)
                     Carry out any necessary filestore-specific cleanup.

                     This is a destructive operation and it is important to  ensure  that  there  are  no  other
                     running  processes  on  the  system  that  are  modifying  or using the file store for this
                     workflow.

                     This is intended to be the last call to the file store in a Toil run, called by the
                     batch system cleanup function upon batch system shutdown.

                     Parameters
                            • workflowDir (str) -- The path to the cache directory

                            • workflowID (str) -- The workflow ID for this invocation of the workflow

              open(job)
                     The context manager used to conduct tasks before and after a job has been run. File
                     operations are only permitted inside the context manager.

                     Parameters
                            job (toil.job.Job) -- The job instance of the toil job to run.

              getLocalTempDir()
                     Get a new local temporary directory in which to write files that persist for  the  duration
                     of the job.

                     Returns
                            The  absolute path to a new local temporary directory. This directory will exist for
                            the duration of the job  only,  and  is  guaranteed  to  be  deleted  once  the  job
                            terminates, removing all files it contains recursively.

                     Return type
                            str

              getLocalTempFile()
                     Get a new local temporary file that will persist for the duration of the job.

                     Returns
                            The  absolute  path to a local temporary file. This file will exist for the duration
                            of the job only, and is guaranteed to be deleted once the job terminates.

                     Return type
                            str

              getLocalTempFileName()
                     Get a valid name for a new local file. Don't actually create a file at the path.

                     Returns
                            Path to valid file

                     Return type
                            str

              writeGlobalFile(localFileName, cleanup=False)
                     Takes a file (as a path) and uploads it to the job store.

                     Parameters
                            • localFileName (string) -- The path to the local file to upload.

                            • cleanup (bool) -- if True then the copy of the global file will  be  deleted  once
                              the  job  and  all  its successors have completed running.  If not the global file
                              must be deleted manually.

                     Returns
                            an ID that can be used to retrieve the file.

                     Return type
                            toil.fileStores.FileID

              writeGlobalFileStream(cleanup=False)
                     Similar to writeGlobalFile, but allows the writing of a  stream  to  the  job  store.   The
                     yielded file handle does not need to and should not be closed explicitly.

                     Parameters
                            cleanup             (bool)             --             is            as            in
                            toil.fileStores.abstractFileStore.AbstractFileStore.writeGlobalFile().

                     Returns
                            A context manager yielding a tuple of 1) a file handle which can be written  to  and
                            2) the toil.fileStores.FileID of the resulting file in the job store.

              readGlobalFile(fileStoreID, userPath=None, cache=True, mutable=False, symlink=False)
                     Makes  the  file  associated with fileStoreID available locally. If mutable is True, then a
                     copy of the file will be created locally so that the original is not modified and does  not
                     change  the  file  for  other  jobs. If mutable is False, then a link can be created to the
                     file, saving disk resources.

                     If a user path is specified, it is used as the destination. If a user path isn't specified,
                     the file is stored in the local temp directory with an encoded name.

                     The  destination  file  must  not  be  deleted  by the user; it can only be deleted through
                     deleteLocalFile.

                     Parameters
                            • fileStoreID (toil.fileStores.FileID or str) -- job store id for the file

                            • userPath (string) -- a path to the name of file to which the global file will be
                              copied or hard-linked (see below).

                            • cache (bool) -- Described in toil.fileStores.CachingFileStore.readGlobalFile()

                            • mutable (bool) -- Described in toil.fileStores.CachingFileStore.readGlobalFile()

                     Returns
                            An absolute path to a local, temporary copy of the file keyed by fileStoreID.

                     Return type
                            str
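
                     A minimal sketch of a write and read round trip through the file store follows. The
                     function names store_greeting and load_greeting are hypothetical; both are assumed to
                     run as job functions, so that job.fileStore is the file store passed in by Toil.

```python
def store_greeting(job, text):
    """Job function: write text to a local temp file and upload it."""
    local_path = job.fileStore.getLocalTempFile()
    with open(local_path, "w") as handle:
        handle.write(text)
    # The returned ID is what later jobs use to retrieve the file.
    return job.fileStore.writeGlobalFile(local_path)


def load_greeting(job, file_id):
    """Job function: make the file available locally and return its text."""
    # mutable defaults to False, so the local path is only read, never modified.
    local_path = job.fileStore.readGlobalFile(file_id)
    with open(local_path) as handle:
        return handle.read()
```

                     In a workflow these would typically be chained so that the second job receives the
                     first job's promised return value, for example first = Job.wrapJobFn(store_greeting,
                     "hello") followed by first.addFollowOnJobFn(load_greeting, first.rv()).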

              readGlobalFileStream(fileStoreID)
                     Similar  to  readGlobalFile, but allows a stream to be read from the job store. The yielded
                     file handle does not need to and should not be closed explicitly.

                     Returns
                            a context manager yielding a file handle which can be read from.
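
                     The two streaming calls can be combined as sketched below. The function name
                     stream_roundtrip is hypothetical, and the sketch assumes the yielded handles accept
                     and return bytes; check this against the file store in use. Neither handle is closed
                     explicitly, as the descriptions above require.

```python
def stream_roundtrip(job, payload):
    """Job function: stream bytes into the job store and read them back."""
    # The context manager yields the handle together with the new file's ID.
    with job.fileStore.writeGlobalFileStream() as (handle, file_id):
        handle.write(payload)
    with job.fileStore.readGlobalFileStream(file_id) as handle:
        return handle.read()
```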

              getGlobalFileSize(fileStoreID)
                     Get the size of the file pointed to by the given ID, in bytes.

                     If given a FileID or another object with a non-None size field, returns that value.

                     Otherwise, asks the job store to poll the file's size.

                     Note that the job store may overestimate the file's size, for example if  it  is  encrypted
                     and had to be augmented with an IV or other encryption framing.

                     Parameters
                            fileStoreID (toil.fileStores.FileID or str) -- File ID for the file

                     Returns
                            File's size in bytes, as stored in the job store

                     Return type
                            int

              deleteLocalFile(fileStoreID)
                     Deletes local copies of files associated with the provided job store ID.

                     The files deleted are all those previously read from this file ID via readGlobalFile by the
                     current job into the job's file-store-provided temp  directory,  plus  the  file  that  was
                     written  to  create  the given file ID, if it was written by the current job from the job's
                     file-store-provided temp directory.

                     Parameters
                            fileStoreID (toil.fileStores.FileID or str) -- File Store ID of the file to be
                            deleted.

              deleteGlobalFile(fileStoreID)
                     Deletes  local  files with the provided job store ID and then permanently deletes them from
                     the job store. To ensure that the job can be restarted if necessary, the  delete  will  not
                     happen until after the job's run method has completed.

                     Parameters
                            fileStoreID (toil.fileStores.FileID or str) -- the File Store ID of the file to be
                            deleted.

              logToMaster(text, level=20)
                     Send a logging message to the leader. The message will also be logged by the worker
                     at the same level.

                     Parameters
                            • text -- The string to log.

                            • level (int) -- The logging level.

              startCommit(jobState=False)
                     Update the status of the job on the disk.

                     May start an asynchronous process. Call waitForCommit() to wait on that process.

                     Parameters
                            jobState  (bool)  --  If  True,  commit  the  state of the FileStore's job, and file
                            deletes. Otherwise, commit only file creates/updates.

              waitForCommit()
                     Blocks while startCommit is running. This function is called by  this  job's  successor  to
                     ensure  that  it  does  not begin modifying the job store until after this job has finished
                     doing so.

                     Might be called when startCommit is never called on a particular instance, in which case it
                     does not block.

                     Returns
                            Always returns True

                     Return type
                            bool

              classmethod shutdown(dir_)
                     Shutdown the filestore on this node.

                     This is intended to be called on batch system shutdown.

                     Parameters
                            dir_ -- The implementation-specific directory containing the required information for
                            shutting down the file store and removing all its  state  and  all  job  local  temp
                            directories from the node.

       class toil.fileStores.FileID(fileStoreID, size)
              A  small  wrapper around Python's builtin string class. It is used to represent a file's ID in the
              file store, and has a size attribute that is the file's size in bytes. This object is returned  by
              importFile and writeGlobalFile.

              Calls  into  the  file  store  can  use  bare  strings; size will be queried from the job store if
              unavailable in the ID.

              __init__(fileStoreID, size)
                     Initialize self.  See help(type(self)) for accurate signature.

              pack() Pack the FileID into a string so it can be passed through external code.

              classmethod unpack(packedFileStoreID)
                     Unpack the result of pack() into a FileID object.
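
                     The idea of a string that also carries a size can be sketched as follows. SketchFileID
                     is a hypothetical stand-in, and the "<size>:<id>" packed format is an assumption made
                     for illustration; the format used by the real FileID is an implementation detail and
                     may differ.

```python
class SketchFileID(str):
    """A str subclass that carries the file's size in bytes."""

    def __new__(cls, file_store_id, size):
        return super(SketchFileID, cls).__new__(cls, file_store_id)

    def __init__(self, file_store_id, size):
        self.size = size

    def pack(self):
        # Assumed encoding: size first, so the ID itself may contain colons.
        return "%d:%s" % (self.size, self)

    @classmethod
    def unpack(cls, packed):
        size, file_store_id = packed.split(":", 1)
        return cls(file_store_id, int(size))


original = SketchFileID("files/abc:def", 1024)
restored = SketchFileID.unpack(original.pack())
print(restored == original, restored.size)
```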

BATCH SYSTEM API

       The batch system interface is used by Toil to abstract over different ways of running  batches  of  jobs,
       for     example     Slurm,     GridEngine,     Mesos,     Parasol     and     a    single    node.    The
       toil.batchSystems.abstractBatchSystem.AbstractBatchSystem API is implemented to run jobs  using  a  given
       job management system, e.g. Mesos.

   Batch System Environmental Variables
       Environmental variables allow passing of scheduler specific parameters.

       For SLURM:

          export TOIL_SLURM_ARGS="-t 1:00:00 -q fatq"

       For TORQUE there are two environment variables: one for everything but the resource requirements, and
       another for the resource requirements (without the -l prefix):

          export TOIL_TORQUE_ARGS="-q fatq"
          export TOIL_TORQUE_REQS="walltime=1:00:00"

       For GridEngine (SGE, UGE),  there  is  an  additional  environmental  variable  to  define  the  parallel
       environment for running multicore jobs:

          export TOIL_GRIDENGINE_PE='smp'
          export TOIL_GRIDENGINE_ARGS='-q batch.q'

       For HTCondor, additional parameters can be included in the submit file passed to condor_submit:

          export TOIL_HTCONDOR_PARAMS='requirements = TARGET.has_sse4_2 == true; accounting_group = test'

       The environment variable is parsed as a semicolon-separated string of parameter = value pairs.
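
       The parsing rule can be sketched in a few lines. The function parse_htcondor_params is a hypothetical
       illustration of splitting on semicolons and on the first equals sign of each pair; it is not Toil's
       own parser.

```python
def parse_htcondor_params(raw):
    """Split 'name = value; name = value' into a dict of stripped strings."""
    params = {}
    for pair in raw.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only, so values may contain '==' comparisons.
        name, _, value = pair.partition("=")
        params[name.strip()] = value.strip()
    return params


raw = "requirements = TARGET.has_sse4_2 == true; accounting_group = test"
print(parse_htcondor_params(raw))
```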

   Batch System API
       class toil.batchSystems.abstractBatchSystem.AbstractBatchSystem
              An  abstract  (as  far as Python currently allows) base class to represent the interface the batch
              system must provide to Toil.

              classmethod supportsAutoDeployment()
                     Whether this batch system supports auto-deployment of the user script itself. If  it  does,
                     the setUserScript() can be invoked to set the resource object representing the user script.

                     Note to implementors: If your implementation returns True here, it should also override

                     Return type
                            bool

              classmethod supportsWorkerCleanup()
                     Indicates  whether  this  batch  system  invokes  workerCleanup()  after the last job for a
                     particular workflow invocation finishes. Note that the term  worker  refers  to  an  entire
                     node,  not  just a worker process. A worker process may run more than one job sequentially,
                     and more than one concurrent worker process may exist  on  a  worker  node,  for  the  same
                     workflow. The batch system is said to shut down after the last worker process terminates.

                     Return type
                            bool

              setUserScript(userScript)
                     Set  the  user script for this workflow. This method must be called before the first job is
                     issued to this batch system, and only if supportsAutoDeployment() returns  True,  otherwise
                     it will raise an exception.

                     Parameters
                            userScript  (toil.resource.Resource)  --  the  resource object representing the user
                            script or module and the modules it depends on.

              issueBatchJob(jobNode)
                     Issues a job with the specified command to the batch system and returns a unique jobID.

                     Parameters
                            jobNode -- a toil.job.JobNode

                     Returns
                            a unique jobID that can be used to reference the newly issued job

                     Return type
                            int

              killBatchJobs(jobIDs)
                     Kills the given job IDs. After returning, the killed jobs will not appear in the results of
                     getRunningBatchJobIDs.

                     Parameters
                            jobIDs (list[int]) -- list of IDs of jobs to kill

              getIssuedBatchJobIDs()
                     Gets all currently issued jobs

                     Returns
                            A list of jobs (as jobIDs) currently issued (may be running, or may be waiting to be
                            run). Despite the result being a list, the ordering should not be depended upon.

                     Return type
                            list[str]

              getRunningBatchJobIDs()
                     Gets a map of jobs as jobIDs that are currently running (not just  waiting)  and  how  long
                     they have been running, in seconds.

                     Returns
                            dictionary  with  currently  running  jobID keys and how many seconds they have been
                            running as the value

                     Return type
                            dict[int,float]

              getUpdatedBatchJob(maxWait)
                     Returns information about a job that has updated its status (i.e. ceased running, either
                     successfully or with an error). Each such job will be returned exactly once.

                     Parameters
                            maxWait (float) -- the number of seconds to block, waiting for a result

                     Return type
                            tuple(str, int, float) or None

                     Returns
                            If  a  result is available, returns a tuple (jobID, exitValue, wallTime).  Otherwise
                            it returns None. wallTime is the number of seconds (a strictly  positive  float)  in
                            wall-clock  time  the  job  ran  for,  or None if this batch system does not support
                            tracking wall time. Returns None for jobs that were killed.
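
                     The contract between issuing jobs and collecting their updates can be sketched with an
                     in-memory toy. ToyBatchSystem and its finish() hook are hypothetical; the class does
                     not subclass AbstractBatchSystem and runs nothing, it only shows that each finished
                     job is reported exactly once and that None is returned when maxWait expires.

```python
import queue


class ToyBatchSystem(object):
    """In-memory illustration of issueBatchJob() and getUpdatedBatchJob()."""

    def __init__(self):
        self._next_id = 0
        self._issued = {}
        self._updates = queue.Queue()

    def issueBatchJob(self, jobNode):
        job_id = self._next_id
        self._next_id += 1
        self._issued[job_id] = jobNode
        return job_id

    def finish(self, job_id, exit_value, wall_time):
        """Hypothetical hook standing in for a job that stopped running."""
        del self._issued[job_id]
        self._updates.put((job_id, exit_value, wall_time))

    def getIssuedBatchJobIDs(self):
        return list(self._issued)

    def getUpdatedBatchJob(self, maxWait):
        try:
            # Taking the item off the queue ensures it is returned only once.
            return self._updates.get(timeout=maxWait)
        except queue.Empty:
            return None


batch = ToyBatchSystem()
job_id = batch.issueBatchJob("placeholder job node")
batch.finish(job_id, 0, 1.5)
print(batch.getUpdatedBatchJob(0.01))
print(batch.getUpdatedBatchJob(0.01))
```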

              shutdown()
                     Called at the completion of  a  toil  invocation.   Should  cleanly  terminate  all  worker
                     threads.

              setEnv(name, value=None)
                     Set  an  environment  variable  for  the  worker  process before it is launched. The worker
                     process will typically inherit the environment of the machine it is  running  on  but  this
                     method  makes  it  possible  to  override  specific variables in that inherited environment
                     before the worker is launched. Note that this mechanism is different to the one used by the
                     worker  internally  to  set  up the environment of a job. A call to this method affects all
                     jobs issued after this method returns. Note to implementors:  This  means  that  you  would
                     typically need to copy the variables before enqueuing a job.

                     If no value is provided it will be looked up from the current environment.

              classmethod setOptions(setOption)
                     Process command line or configuration options relevant to this batch system.  The

                     Parameters
                            setOption   --   A   function   with  signature  setOption(varName,  parsingFn=None,
                            checkFn=None, default=None) used to update run configuration

JOB.SERVICE API

       The Service class allows databases and servers to be spawned within a Toil workflow.

       class Job.Service(memory=None, cores=None, disk=None, preemptable=None, unitName=None)
              Abstract class used to define the interface to a service.

              __init__(memory=None, cores=None, disk=None, preemptable=None, unitName=None)
                     Memory, core and disk requirements are specified identically to
                     toil.job.Job.__init__().

              start(job)
                     Start the service.

                     Parameters
                            job  (toil.job.Job) -- The underlying job that is being run. Can be used to register
                            deferred functions, or to access the fileStore for creating temporary files.

                     Returns
                            An object describing how to access the service. The object must  be  pickleable  and
                            will be used by jobs to access the service (see toil.job.Job.addService()).

              stop(job)
                     Stops the service. Function can block until complete.

                     Parameters
                            job  (toil.job.Job) -- The underlying job that is being run. Can be used to register
                            deferred functions, or to access the fileStore for creating temporary files.

              check()
                     Checks the service is still running.

                     Raises exceptions.RuntimeError -- If the service failed, this will cause the service job to
                            be labeled failed.

                     Returns
                            True if the service is still running, else False. If False then the service job will
                            be terminated, and considered a success. Important point: if the service  job  exits
                            due to a failure, it should raise a RuntimeError, not return False!
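
                     The start/check/stop protocol can be sketched with a child process standing in for a
                     real server. SleepService is a hypothetical name and is written as a plain class so
                     the sketch runs without Toil; in a workflow it would subclass Job.Service and be
                     attached with toil.job.Job.addService().

```python
import subprocess
import sys


class SleepService(object):
    """Illustrative service whose 'server' is a sleeping child process."""

    def start(self, job):
        self.process = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"])
        # The returned object must be pickleable; a small dict is enough here.
        return {"pid": self.process.pid}

    def check(self):
        code = self.process.poll()
        if code is None:
            return True   # still running
        if code != 0:
            # A failure is signalled by raising, not by returning False.
            raise RuntimeError("service exited with status %d" % code)
        return False      # exited cleanly; treated as success

    def stop(self, job):
        self.process.terminate()
        self.process.wait()
```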

EXCEPTIONS API

       Toil specific exceptions.

       exception toil.job.JobException(message)
              General job exception.

              __init__(message)
                     Initialize self.  See help(type(self)) for accurate signature.

       exception toil.job.JobGraphDeadlockException(string)
              An exception raised in the event that a workflow contains an unresolvable dependency, such as
              a cycle. See toil.job.Job.checkJobGraphForDeadlocks().

              __init__(string)
                     Initialize self.  See help(type(self)) for accurate signature.

       exception toil.jobStores.abstractJobStore.ConcurrentFileModificationException(jobStoreFileID)
              Indicates that the file was attempted to be modified by multiple processes at once.

              __init__(jobStoreFileID)

                     Parameters
                            jobStoreFileID (str) -- the ID of the file that was modified by multiple workers  or
                            processes concurrently

       exception toil.jobStores.abstractJobStore.JobStoreExistsException(locator)
              Indicates that the specified job store already exists.

              __init__(locator)
                     Initialize self.  See help(type(self)) for accurate signature.

       exception toil.jobStores.abstractJobStore.NoSuchFileException(jobStoreFileID, customName=None, *extra)
              Indicates that the specified file does not exist.

              __init__(jobStoreFileID, customName=None, *extra)

                     ParametersjobStoreFileID (str) -- the ID of the file that was mistakenly assumed to exist

                            • customName (str) -- optionally, an alternate name for the nonexistent file

                            • extra (list) -- optional extra information to add to the error message

       exception toil.jobStores.abstractJobStore.NoSuchJobException(jobStoreID)
              Indicates that the specified job does not exist.

              __init__(jobStoreID)

                     Parameters
                            jobStoreID (str) -- the jobStoreID that was mistakenly assumed to exist

       exception toil.jobStores.abstractJobStore.NoSuchJobStoreException(locator)
              Indicates that the specified job store does not exist.

              __init__(locator)
                     Initialize self.  See help(type(self)) for accurate signature.

RUNNING TESTS

       Test  make  targets,  invoked  as  $  make  <target>, subject to which environment variables are set (see
       Running Integration Tests).

                           ┌───────────────────────┬───────────────────────────────────────┐
                           │TARGET                 │ DESCRIPTION                           │
                           ├───────────────────────┼───────────────────────────────────────┤
                           │test                   │ Invokes all tests.                    │
                           ├───────────────────────┼───────────────────────────────────────┤
                           │integration_test       │ Invokes only the integration tests.   │
                           ├───────────────────────┼───────────────────────────────────────┤
                           │test_offline           │ Skips building the  Docker  appliance │
                           │                       │ and  only  invokes tests that have no │
                           │                       │ docker dependencies.                  │
                           ├───────────────────────┼───────────────────────────────────────┤
                           │integration_test_local │ Makes integration tests easier to     │
                           │                       │ debug locally by running them         │
                           │                       │ serially and without redirecting      │
                           │                       │ output, so that it appears on the     │
                           │                       │ terminal as expected.                 │
                           └───────────────────────┴───────────────────────────────────────┘

       Before  running  tests  for  the  first  time, initialize your virtual environment following the steps in
       buildFromSource.

       Run all tests (including slow tests):

          $ make test

       Run only quick tests (as of Jul 25, 2018, this was ~ 20 minutes):

          $ export TOIL_TEST_QUICK=True; make test

       Run an individual test with:

          $ make test tests=src/toil/test/sort/sortTest.py::SortTest::testSort

       The default value for tests is "src" which includes all tests in the src/  subdirectory  of  the  project
       root.  Tests that require a feature that is not installed are skipped implicitly. To explicitly skip
       tests that depend on a currently installed feature, use

          $ make test tests="-m 'not aws' src"

       This will run only the tests that don't depend on  the  aws  extra,  even  if  that  extra  is  currently
       installed.  Note  the distinction between the terms feature and extra. Every extra is a feature but there
       are features that are not extras, such as the gridengine and parasol features.  To skip  tests  involving
       both the parasol feature and the aws extra, use the following:

          $ make test tests="-m 'not aws and not parasol' src"
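
       The marker expressions above follow a simple pattern, which can be sketched in Python. The helper names
       marker_expression and make_test_command are hypothetical and are not part of Toil; the sketch only
       assembles the command line shown above and does not run it.

```python
def marker_expression(skip):
    """Build a pytest -m expression that deselects every feature in `skip`."""
    return " and ".join("not %s" % feature for feature in skip)


def make_test_command(skip, tests="src"):
    """Return the `make test` invocation that skips the given features."""
    if not skip:
        return "make test tests=%s" % tests
    return "make test tests=\"-m '%s' %s\"" % (marker_expression(skip), tests)


# Prints: make test tests="-m 'not aws and not parasol' src"
print(make_test_command(["aws", "parasol"]))
```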

   Running Tests with pytest
       Often  it  is simpler to use pytest directly, instead of calling the make wrapper.  This usually works as
       expected, but some tests need some manual preparation. To run  a  specific  test  with  pytest,  use  the
       following:

          python -m pytest src/toil/test/sort/sortTest.py::SortTest::testSort

       For more information, see the pytest documentation.

   Running Integration Tests
       These tests are generally only run in our CI workflow due to their resource requirements and cost.
       However, they can be made available for local testing:

          • Running tests that make use of Docker (e.g. autoscaling tests and Docker tests) requires an appliance
            image to be hosted. First, make sure you have gone through the setup found in Using Docker with
            Quay. Then, to build and host the appliance image, run the make target push_docker.

                $ make push_docker

          • Running integration tests requires activation via an environment variable as well as exporting
            information relevant to the desired tests. Enable the integration tests:

                $ export TOIL_TEST_INTEGRATIVE=True

          • Finally, set the environment variables for keyname and desired zone:

                $ export TOIL_X_KEYNAME=[Your Keyname]
                $ export TOIL_X_ZONE=[Desired Zone]

            Where X is one of our currently supported cloud providers (GCE, AWS).

          • See the above sections for guidance on running tests.
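
       The exports above can also be collected programmatically, for example when driving make from a script.
       This is a minimal sketch: the function integration_test_env is hypothetical, and the key name and zone
       used in the example call are placeholders rather than real values.

```python
import os

SUPPORTED_PROVIDERS = ("AWS", "GCE")  # the providers named in the text above


def integration_test_env(provider, keyname, zone, base=None):
    """Return a copy of the environment with integration tests enabled."""
    provider = provider.upper()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError("unsupported provider: %s" % provider)
    env = dict(os.environ if base is None else base)
    env["TOIL_TEST_INTEGRATIVE"] = "True"
    env["TOIL_%s_KEYNAME" % provider] = keyname
    env["TOIL_%s_ZONE" % provider] = zone
    return env


# Placeholder values; substitute your own key name and zone.
env = integration_test_env("aws", "my-keyname", "my-zone", base={})
print(sorted(env))
```

       The resulting dictionary can be passed as the env argument of subprocess.call(["make",
       "integration_test"], env=env), which leaves the calling shell's environment untouched.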

   Test Environment Variables
                            ┌──────────────────────┬───────────────────────────────────────┐
                            │TOIL_TEST_TEMP        │ An absolute path to a directory where │
                            │                      │ Toil tests will write their temporary │
                            │                      │ files.   Defaults   to  the  system's │
                            │                      │ standard temporary directory.         │
                            ├──────────────────────┼───────────────────────────────────────┤
                            │TOIL_TEST_INTEGRATIVE │ If True, this allows the  integration │
                            │                      │ tests to run. Only valid when running │
                            │                      │ the tests from the  source  directory │
                            │                      │ via make test or make test_parallel.  │
                            ├──────────────────────┼───────────────────────────────────────┤
                            │TOIL_AWS_KEYNAME      │ An   AWS  keyname  (see  prepareAWS), │
                            │                      │ which is  required  to  run  the  AWS │
                            │                      │ tests.                                │
                            ├──────────────────────┼───────────────────────────────────────┤
                            │TOIL_GOOGLE_PROJECTID │ A Google Cloud account projectID (see │
                            │                      │ runningGCE), which is required to run │
                            │                      │ the Google Cloud tests.               │
                            ├──────────────────────┼───────────────────────────────────────┤
                            │TOIL_TEST_QUICK       │ If   True,  long  running  tests  are │
                            │                      │ skipped.                              │
                            └──────────────────────┴───────────────────────────────────────┘

          Partial install and failing tests

                 Some tests may fail with an ImportError if the required extras are not installed.  Install Toil
                 with all of the extras to prevent such errors.

   Using Docker with Quay
       Docker  is needed for some of the tests. Follow the appropriate installation instructions for your system
       on their website to get started.

       When running make test you might still get the following error:

          $ make test
          Please set TOIL_DOCKER_REGISTRY, e.g. to quay.io/USER.

       To solve this, create an account with Quay and specify it like so:

          $ TOIL_DOCKER_REGISTRY=quay.io/USER make test

       where USER is your Quay username.

       For convenience you may want to add this variable to your bashrc by running

          $ echo 'export TOIL_DOCKER_REGISTRY=quay.io/USER' >> $HOME/.bashrc

   Running Mesos Tests
       If you're running Toil's Mesos tests, be sure to create the  virtualenv  with  --system-site-packages  to
       include  the  Mesos Python bindings. Verify this by activating the virtualenv and running pip list | grep
       mesos. On macOS, this may come up empty. To fix it, run the following:

          for i in /usr/local/lib/python2.7/site-packages/*mesos*; do ln -snf $i venv/lib/python2.7/site-packages/; done

DEVELOPING WITH DOCKER

       To develop on features reliant on the Toil Appliance (the docker image toil uses  for  AWS  autoscaling),
       you  should  consider  setting  up  a personal registry on Quay or Docker Hub. Because the Toil Appliance
       images are tagged with the Git commit they are based on and because only commits  on  our  master  branch
       trigger  an  appliance  build  on Quay, as soon as a developer makes a commit or dirties the working copy
       they will no longer be able to rely on Toil to automatically detect  the  proper  Toil  Appliance  image.
       Instead,  developers wishing to test any appliance changes in autoscaling should build and push their own
       appliance image to a personal Docker registry.  This is described in the next section.

   Making Your Own Toil Docker Image
       Note!  Toil checks if the docker image specified by TOIL_APPLIANCE_SELF  exists  prior  to  launching  by
       using the docker v2 schema.  This should be valid for any major docker repository, but there is an option
       to override this if desired using the option: --forceDockerAppliance.

       Here is a general workflow (similar instructions apply when using Docker Hub):

       1. Make some changes to the provisioner of your local version of Toil

       2. Go to the location where you installed the Toil source code and run

             $ make docker

          to automatically build a docker image that can now be uploaded to your personal Quay account.  If  you
          have not installed Toil source code yet see buildFromSource.

       3. If you have not already done so, install Docker and log into Quay. Also make sure that your Quay
          account is public.

       4. Set the environment variable TOIL_DOCKER_REGISTRY to your Quay account. If  you  find  yourself  doing
          this often you may want to add

             export TOIL_DOCKER_REGISTRY=quay.io/<MY_QUAY_USERNAME>

          to your .bashrc or equivalent.

       5. Now you can run

             $ make push_docker

          which  will  upload  the  docker image to your Quay account. Take note of the image's tag for the next
          step.

       6. Finally you will need to tell Toil from where to pull the Appliance image you've created (it uses  the
          Toil   release   you   have   installed   by  default).  To  do  this  set  the  environment  variable
          TOIL_APPLIANCE_SELF to the url of your image. For more info see envars.

       7. Now you can launch your cluster! For more information see Autoscaling.
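
       Steps 4 and 6 amount to setting two environment variables. The sketch below composes them under the
       assumption that the pushed image is named toil inside your registry, mirroring the official
       quay.io/ucsc_cgl/toil:3.6.0 reference used elsewhere in this document; check the tag printed by make
       push_docker before relying on it. The helper names are hypothetical.

```python
def appliance_image(registry, tag, name="toil"):
    """Compose an image reference such as quay.io/USER/toil:TAG.

    The default image name "toil" is an assumption based on the official image.
    """
    if not registry or not tag:
        raise ValueError("both a registry and a tag are required")
    return "%s/%s:%s" % (registry.rstrip("/"), name, tag)


def appliance_env(registry, tag):
    """Environment variables from steps 4 and 6, as a dictionary."""
    return {
        "TOIL_DOCKER_REGISTRY": registry,
        "TOIL_APPLIANCE_SELF": appliance_image(registry, tag),
    }


# USER and TAG are placeholders for your Quay username and the pushed tag.
print(appliance_env("quay.io/USER", "TAG"))
```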

   Running a Cluster Locally
       The Toil Appliance container can also be useful as a test  environment  since  it  can  simulate  a  Toil
       cluster  locally. An important caveat for this is autoscaling, since autoscaling will only work on an EC2
       instance and cannot (at this time) be run on a local machine.

       To spin up a local cluster, start by using the following Docker run  command  to  launch  a  Toil  leader
       container:

          docker run --entrypoint=mesos-master --net=host -d --name=leader --volume=/home/jobStoreParentDir:/jobStoreParentDir quay.io/ucsc_cgl/toil:3.6.0 --registry=in_memory --ip=127.0.0.1 --port=5050 --allocation_interval=500ms

       A couple of notes on this command: the -d flag tells Docker to run in detached mode, so the container
       will run in the background. To verify that the container is running, run docker ps to list the running
       containers. If you want to run your own container rather than the official UCSC container, simply replace
       the quay.io/ucsc_cgl/toil:3.6.0 parameter with your own container name.

       Also note that we are not mounting the job store directory itself, but rather the location where the job
       store will be written. Due to complications with running Docker on macOS, we recommend only mounting
       directories within your home directory. The next command will launch the Toil worker container with
       similar parameters:

          docker run --entrypoint=mesos-slave --net=host -d --name=worker --volume=/home/jobStoreParentDir:/jobStoreParentDir quay.io/ucsc_cgl/toil:3.6.0 --work_dir=/var/lib/mesos --master=127.0.0.1:5050 --ip=127.0.0.1 --attributes=preemptable:False --resources=cpus:2

       Note here that we are specifying 2 CPUs and a non-preemptable worker. Either setting can be changed: to
       change the number of cores, replace the 2 in --resources=cpus:2 with the desired number, and to make the
       worker preemptable, change preemptable:False to preemptable:True. Also note that the same volume is
       mounted into the worker. This is needed since both the leader and worker read from and write to the job
       store. Now that your cluster is running, you can run

          docker exec -it leader bash

       to  get a shell in your leader 'node'. You can also replace the leader parameter with worker to get shell
       access in your worker.
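
       The two docker run commands above differ only in a few arguments, so they can be generated from one
       helper. The following sketch builds the argument lists without executing them; the function names are
       hypothetical, while the flags and values are copied from the commands above.

```python
IMAGE = "quay.io/ucsc_cgl/toil:3.6.0"
VOLUME = "/home/jobStoreParentDir:/jobStoreParentDir"


def docker_run(entrypoint, name, daemon_args, image=IMAGE, volume=VOLUME):
    """Argument list for a detached, host-networked appliance container."""
    return ["docker", "run",
            "--entrypoint=" + entrypoint, "--net=host", "-d",
            "--name=" + name, "--volume=" + volume, image] + list(daemon_args)


def leader_command(**kwargs):
    """Command that launches the Mesos master acting as the Toil leader."""
    return docker_run("mesos-master", "leader",
                      ["--registry=in_memory", "--ip=127.0.0.1", "--port=5050",
                       "--allocation_interval=500ms"], **kwargs)


def worker_command(cores=2, preemptable=False, **kwargs):
    """Command that launches a worker with the given cores and preemptability."""
    return docker_run("mesos-slave", "worker",
                      ["--work_dir=/var/lib/mesos", "--master=127.0.0.1:5050",
                       "--ip=127.0.0.1",
                       "--attributes=preemptable:%s" % preemptable,
                       "--resources=cpus:%d" % cores], **kwargs)


print(" ".join(worker_command(cores=4, preemptable=True)))
```

       Each list can be handed to subprocess.check_call to start the container. Passing image= or volume=
       substitutes your own appliance image or mount point.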

          Docker-in-Docker issues

                 If you want to run Docker inside this Docker cluster (Dockerized tools,  perhaps),  you  should
                 also  mount  in  the Docker socket via -v /var/run/docker.sock:/var/run/docker.sock.  This will
                 give the Docker client inside the Toil Appliance access to  the  Docker  engine  on  the  host.
                 Client/engine  version mismatches have been known to cause issues, so we recommend using Docker
                 version 1.12.3 on the host to be compatible with the Docker client installed in the  Appliance.
                 Finally, be careful where you write files inside the Toil Appliance - 'child' Docker containers
                 launched in the Appliance will actually be siblings to the Appliance since the Docker engine is
                 located  on  the  host.  This means that the 'child' container can only mount in files from the
                 Appliance if the files are located  in  a  directory  that  was  originally  mounted  into  the
                 Appliance  from the host - that way the files are accessible to the sibling container. Note: if
                 Docker can't find the file/directory on the host it will silently fail and mount  in  an  empty
                 directory.

MAINTAINER'S GUIDELINES

       In general, as developers and maintainers of the code, we adhere to the following guidelines:

       • We strive to never break the build on master. All development should be done on branches, in either the
         main Toil repository or in developers' forks.

       • Pull requests should be used for any and all changes (except truly trivial ones).

       • Pull requests should be in response to issues. If you find yourself making a pull  request  without  an
         issue, you should create the issue first.

   Naming Conventions
       • Commit messages should be great. Most importantly, they must:

         • Have  a short subject line. If in need of more space, drop down two lines and write a body to explain
           what is changing and why it has to change.

         • Write the subject line as a command: Destroy all humans, not All humans destroyed.

         • Reference the issue being fixed in a GitHub-parseable format, such as (resolves #1234) at the end of
           the  subject  line,  or  This  will fix #1234.  somewhere in the body. If no single commit on its own
           fixes the issue, the cross-reference must appear in the pull request title or body instead.

       • Branches in the main Toil repository must start with issues/, followed by the issue number (or numbers,
         separated  by a dash), followed by a short, lowercase, hyphenated description of the change. (There can
         be many open pull requests with their  associated  branches  at  any  given  point  in  time  and  this
         convention ensures that we can easily identify branches.)

         Say   there  is  an  issue  numbered  #123  titled  Foo  does  not  work.  The  branch  name  would  be
         issues/123-fix-foo and the title of the commit would be Fix foo in case of bar (resolves #123).
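
       The branch and commit conventions above are regular enough to check mechanically. The following is a
       minimal sketch; the helper names and the regular expression are illustrative assumptions and are not
       part of Toil's tooling.

```python
import re

# issues/<number>[-<number>...]-<lowercase-hyphenated-description>
BRANCH_PATTERN = re.compile(r"^issues/\d+(-\d+)*-[a-z0-9]+(-[a-z0-9]+)*$")


def branch_name(issues, description):
    """Build a branch name such as issues/123-fix-foo."""
    numbers = "-".join(str(number) for number in issues)
    words = re.findall(r"[a-z0-9]+", description.lower())
    if not numbers or not words:
        raise ValueError("need at least one issue number and a description")
    return "issues/%s-%s" % (numbers, "-".join(words))


def commit_subject(summary, issue):
    """Append the GitHub cross-reference to an imperative subject line."""
    return "%s (resolves #%d)" % (summary, issue)


print(branch_name([123], "Fix foo"))
print(commit_subject("Fix foo in case of bar", 123))
```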

   Pull Requests
       • All pull requests must be reviewed by a person other than the request's author.

       • Modified pull requests must be re-reviewed before merging. Note that GitHub does not enforce this!

       • Pull requests will not be merged unless Travis and Gitlab CI tests pass.  Gitlab tests are only run  on
         code  in the main Toil repository on some branch, so it is the responsibility of the approving reviewer
         to make sure that pull  requests  from  outside  repositories  are  copied  to  branches  in  the  main
         repository. This can be accomplished with (from a Toil clone):

            ./contrib/admin/test-pr theirusername their-branch issues/123-fix-description-here

         This  must  be  repeated  every  time the PR submitter updates their PR, after checking to see that the
         update is not malicious.

         If there is no issue corresponding to the PR, after which the branch can be named, the reviewer of  the
         PR should first create the issue.

         Developers  who have push access to the main Toil repository are encouraged to make their pull requests
         from within the repository, to avoid this step.

       • Prefer using "Squash and merge" when merging pull requests to master, especially when the PR contains a
         "single  unit"  of  work  (i.e. if one were to rewrite the PR from scratch with all the fixes included,
         they would have one commit for the entire PR). This makes the commit history on  master  more  readable
         and easier to debug in case of a breakage.

         When squashing a PR from multiple authors, please add Co-authored-by to give credit to all contributing
         authors.

         See Issue #2816 for more details.

TOIL ARCHITECTURE

       The following diagram lays out the software architecture of Toil.
         [image: Toil's architecture is composed of the leader, the job store, the worker processes, the batch
         system, the node provisioner, and the stats and logging monitor.] Figure 1: The basic components of
         Toil's architecture.

       These components are described below:

              •

                the leader:
                       The leader is responsible for deciding which jobs should be run. To do this it  traverses
                       the  job graph. Currently this is a single threaded process, but we take aggressive steps
                       to prevent it becoming a bottleneck (see Read-only Leader described below).

              •

                the job-store:
                       Handles all files shared between the components. Files in the job-store are the means  by
                       which  the  state  of the workflow is maintained. Each job is backed by a file in the job
                       store, and atomic updates to this state are used to ensure the  workflow  can  always  be
                       resumed  upon  failure.  The job-store can also store all user files, allowing them to be
                       shared between jobs. The job-store is defined by the  AbstractJobStore  class.   Multiple
                       implementations of this class allow Toil to support different back-end file stores, e.g.:
                       S3, network file systems, Google file store, etc.

              •

                workers:
                       The workers are temporary processes responsible for running  jobs,  one  at  a  time  per
                       worker.  Each  worker  process  is invoked with a job argument that it is responsible for
                       running. The worker monitors this job and reports back success or failure to  the  leader
                       by  editing  the  job's  state  in the file-store.  If the job defines successor jobs the
                       worker may choose to immediately run them (see Job Chaining below).

              •

                the batch-system:
                       Responsible for scheduling the jobs given to it by the leader, creating a worker  command
                       for  each  job.  The batch-system is defined by the AbstractBatchSystem class.  Toil uses
                       multiple existing batch systems to schedule jobs, including Apache Mesos, GridEngine  and
                       a multi-process single node implementation that allows workflows to be run without any of
                       these frameworks. Toil can therefore fairly easily be made to run  a  workflow  using  an
                       existing cluster.

              •

                the node provisioner:
                       Creates  worker  nodes in which the batch system schedules workers.  It is defined by the
                       AbstractProvisioner class.

              •

                the statistics and logging monitor:
                       Monitors logging and statistics produced by  the  workers  and  reports  them.  Uses  the
                       job-store to gather this information.

   Optimizations
       Toil  implements  lots  of  optimizations  designed  for  scalability.   Here  we  detail some of the key
       optimizations.

   Read-only leader
       The leader process is currently implemented as a single thread. Most of the leader's tasks revolve around
       processing  the  state of jobs, each stored as a file within the job-store.  To minimise the load on this
       thread, each worker does as much work as possible to manage the state of the job  it  is  running.  As  a
       result, with a couple of minor exceptions, the leader process never needs to write or update the state of
       a job within the job-store.  For example, when a job is  complete  and  has  no  further  successors  the
       responsible  worker  deletes the job from the job-store, marking it complete. The leader then only has to
       check for the existence of the file when it receives a signal from the batch-system to know that the  job
       is complete. This off-loading of state management is orthogonal to future parallelization of the leader.
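
       The division of labour described above can be illustrated with a toy model. This sketch uses a
       temporary directory as a stand-in for a file job store and hypothetical function names; it is not
       Toil's implementation, only the pattern in which the worker writes and the leader merely reads.

```python
import os
import tempfile


def worker_finish(job_path):
    """Worker side: a job with no successors is marked complete by deletion."""
    os.remove(job_path)


def leader_job_is_complete(job_path):
    """Leader side: a read-only existence check, with no write to the store."""
    return not os.path.exists(job_path)


job_store = tempfile.mkdtemp()                  # stand-in for a file job store
job_path = os.path.join(job_store, "job-0001")  # hypothetical job file name
with open(job_path, "w") as handle:
    handle.write("state of job-0001")

before = leader_job_is_complete(job_path)
worker_finish(job_path)
after = leader_job_is_complete(job_path)
print(before, after)  # prints: False True
```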

   Job chaining
       The  scheduling  of  successor jobs is partially managed by the worker, reducing the number of individual
       jobs the leader needs to process. Currently this is very simple: if there is a single next  successor
       job  to run and its resources fit within the resources of the current job and closely match the resources
       of the current job then the job is run immediately on the worker without returning to the leader. Further
       extensions  of  this  strategy  are  possible,  but  for  many  workflows which define a series of serial
       successors (e.g. map sequencing reads, post-process mapped reads, etc.) this pattern is very effective at
       reducing leader workload.
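
       The chaining rule can be sketched as a small predicate. The text does not say how closely the resources
       must match, so the tolerance parameter below is an assumption introduced for illustration, as are the
       Resources tuple and the function name can_chain.

```python
from collections import namedtuple

Resources = namedtuple("Resources", ["memory", "cores", "disk"])


def can_chain(current, successors, tolerance=0.5):
    """Decide whether a worker may run the next job itself.

    `tolerance` is an assumed stand-in for "closely match": each requirement
    of the successor must be at least this fraction of the current job's.
    """
    if len(successors) != 1:
        return False                    # only a single next successor chains
    successor = successors[0]
    for have, need in zip(current, successor):
        if need > have:
            return False                # does not fit within the current job
        if need < tolerance * have:
            return False                # fits, but is not a close match
    return True


current = Resources(memory=4e9, cores=2, disk=10e9)
print(can_chain(current, [Resources(memory=3e9, cores=2, disk=8e9)]))
```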

   Preemptable node support
       Critical to running at large scale is dealing with intermittent node failures. Toil is therefore designed
       to always be resumable, provided that the job-store does not become corrupt. This robustness allows Toil
       to run on preemptable nodes, which are only available when others are not willing to pay more to use
       them. Designing workflows that divide into many short individual jobs that can use preemptable nodes
       allows for workflows to be efficiently scheduled and executed.

   Caching
       Running bioinformatic pipelines often requires the passing of large datasets between jobs. Toil caches the
       results from jobs such that child jobs running on the same node can directly use the same  file  objects,
       thereby  eliminating  the  need  for  an intermediary transfer to the job store. Caching also reduces the
       burden on the local disks, because multiple jobs can share a single file.   The  resulting  drop  in  I/O
       allows  pipelines  to run faster, and, by the sharing of files, allows users to run more jobs in parallel
       by reducing overall disk requirements.

       To demonstrate the efficiency of caching, we ran an experimental internal pipeline on 3 samples from the
       TCGA Lung Squamous Carcinoma (LUSC) dataset. The pipeline takes the tumor and normal exome fastqs and
       the tumor RNA fastq as input, and predicts MHC presented neoepitopes in the patient that are potential
       targets for T-cell based immunotherapies. The pipeline was run individually on the samples on c3.8xlarge
       machines on AWS (60 GB RAM, 600 GB SSD storage, 32 cores). The pipeline aligns the data to hg19-based
       references, predicts MHC haplotypes using PHLAT, calls mutations using 2 callers (MuTect and RADIA) and
       annotates them using SnpEff, then predicts MHC:peptide binding using the IEDB suite of tools before
       running an in-house rank boosting algorithm on the final calls.

       To optimize the time taken, the pipeline is written such that mutations are called on a per-chromosome
       basis from the whole-exome bams and are merged into a complete vcf. Running mutect in parallel on
       whole-exome bams requires each mutect job to download the complete tumor and normal bams to its working
       directory, an operation that quickly fills the disk and limits the parallelizability of jobs. The script
       was run in Toil, with and without caching, and Figure 2 shows that the workflow finishes faster in the
       cached case while using less disk on average than the uncached run. We believe that the benefits of
       caching arising from file transfers will be much higher on magnetic disk-based storage systems than on
       the SSD systems we tested this on.
         [image: Graph outlining the efficiency gain from caching.] Figure 2: Efficiency gain from caching.
         The lower half of each plot describes the disk used by the pipeline recorded every 10 minutes over the
         duration of the pipeline, and the upper half shows the corresponding stage of the pipeline that is
         being processed. Since jobs requesting the same file shared the same inode, the effective load on the
         disk is considerably lower than in the uncached case where every job downloads a personal copy of
         every file it needs. We see that in all cases, the uncached run uses almost 300-400GB more than the
         cached run in the resource heavy mutation calling step. We also see a benefit in terms of wall time for
         each stage since we eliminate the time taken for file transfers.
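
       The inode sharing mentioned in the caption can be reproduced with hard links. The sketch below assumes
       that hard links are the sharing mechanism, which is consistent with the shared-inode remark but is not
       a description of Toil's cache implementation. The file and directory names are placeholders.

```python
import os
import tempfile

cache_dir = tempfile.mkdtemp()
cached = os.path.join(cache_dir, "tumor.bam")   # placeholder file name
with open(cached, "w") as handle:
    handle.write("placeholder for a large file")

# Job directories are created inside cache_dir so that they share a file
# system with the cached file, which hard links require.
job_dirs = [tempfile.mkdtemp(dir=cache_dir) for _ in range(3)]
for job_dir in job_dirs:
    os.link(cached, os.path.join(job_dir, "tumor.bam"))

inodes = {os.stat(os.path.join(job_dir, "tumor.bam")).st_ino for job_dir in job_dirs}
inodes.add(os.stat(cached).st_ino)
link_count = os.stat(cached).st_nlink

print(len(inodes), link_count)  # one inode, four names pointing at it
```

       Three jobs see the file in their own directories, yet the data occupies disk space only once.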

   Toil support for Common Workflow Language
       The  CWL  document  and  input  document  are loaded using the 'cwltool.load_tool' module.  This performs
       normalization and URI expansion (for example, relative file references  are  turned  into  absolute  file
       URIs),  validates  the document against the CWL schema, initializes Python objects corresponding to major
       document elements (command line tools, workflows, workflow steps), and performs static type checking that
       sources and sinks have compatible types.

       Input files referenced by the CWL document and input document are imported into the Toil file store.  CWL
       documents may use any URI scheme supported by Toil file store, including local files and object storage.

       The 'location' field of each File reference is updated to reflect the import token returned by the Toil
       file store.

       For directory inputs, the directory listing is stored in a Directory object. Each individual file is
       imported into the Toil file store.

       An initial workflow Job is created from the toplevel CWL document.  Then,  control  passes  to  the  Toil
       engine which schedules the initial workflow job to run.

       When  the toplevel workflow job runs, it traverses the CWL workflow and creates a toil job for each step.
       The dependency graph is expressed by making downstream jobs children of upstream jobs,  and  initializing
       the child jobs with an input object containing the promises of output from upstream jobs.

       Because Toil jobs have a single output, but CWL permits steps to have multiple output parameters that may
       feed into multiple other steps, the input to a CWLJob is expressed with an "indirect  dictionary".   This
       is  a  dictionary  of input parameters, where each entry value is a tuple of a promise and a promise key.
       When the job runs, the indirect dictionary is turned into a  concrete  input  object  by  resolving  each
       promise  into  its  actual value (which is always a dict), and then looking up the promise key to get the
       actual value for the input parameter.
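
       The resolution of an indirect dictionary can be sketched as follows. Promises are modelled here as
       zero-argument callables returning a dict, which is an assumption made for illustration; real Toil
       promises are resolved by the engine, and the step and parameter names are hypothetical.

```python
def resolve_indirect(indirect):
    """Turn {parameter: (promise, key)} into {parameter: value}."""
    concrete = {}
    for parameter, (promise, key) in indirect.items():
        outputs = promise()             # a resolved promise is always a dict
        concrete[parameter] = outputs[key]
    return concrete


# Two hypothetical upstream steps, each with its own output dict.
def align_outputs():
    return {"bam": "aligned.bam", "log": "align.log"}


def call_outputs():
    return {"vcf": "calls.vcf"}


indirect = {"reads": (align_outputs, "bam"), "variants": (call_outputs, "vcf")}
resolved = resolve_indirect(indirect)
print(resolved)
```

       Each entry selects exactly one output of an upstream step, which is how a single-output Toil job can
       feed several differently named CWL parameters.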

       If a workflow step specifies a scatter, then a scatter job is created and  connected  into  the  workflow
       graph  as  described above.  When the scatter step runs, it creates child jobs for each parameterization
       of the scatter.  A gather job is added as a follow-on to gather the outputs into arrays.
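
       A scatter over a single parameter, followed by a gather, can be sketched as below. This is a
       simplification under stated assumptions: only one scattered parameter is handled, the per-job work is a
       hypothetical function, and no Toil jobs are created.

```python
def scatter(inputs, scatter_param):
    """One input object per element of the scattered parameter."""
    return [dict(inputs, **{scatter_param: value})
            for value in inputs[scatter_param]]


def gather(outputs):
    """Combine per-job output dicts into one dict of arrays."""
    gathered = {}
    for output in outputs:
        for key, value in output.items():
            gathered.setdefault(key, []).append(value)
    return gathered


def count_lines(job_input):
    """Hypothetical step standing in for a scattered child job."""
    return {"report": "%s.%s.txt" % (job_input["sample"], job_input["mode"])}


inputs = {"sample": ["a", "b", "c"], "mode": "fast"}
children = scatter(inputs, "sample")
result = gather(count_lines(child) for child in children)
print(result)
```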

       When running a command line tool, the job first creates output and temporary directories under the Toil
       local temp dir. It runs the command line tool using the single_job_executor from CWLTool, providing a
       Toil-specific constructor for filesystem access, and overriding the default PathMapper to use
       ToilPathMapper.

       The ToilPathMapper keeps track of a file's symbolic identifier (the Toil FileID), its local path on the
       host (the value returned by readGlobalFile) and the location of the file inside the Docker container.

       After executing single_job_executor from CWLTool, the job gets back the output object and status. If
       the underlying job failed, an exception is raised. Files from the output object are added to the file
       store using writeGlobalFile and the 'location' field of each File reference is updated to reflect the
       token returned by the Toil file store.

       When the workflow completes, it returns an indirect dictionary linking to the outputs of the job steps
       that contribute to the final output. This is the value returned by toil.start() or toil.restart(). This
       is resolved to get the final output object. The files in this object are exported from the file store to
       'outdir' on the host file system, and the 'location' field of each File reference is updated to reflect
       the final exported location of the output files.

MINIMUM AWS IAM PERMISSIONS

       Toil requires at least the following permissions in an IAM role to operate on a cluster.  These are added
       by default when launching a cluster. However, ensure that they are present if creating a custom IAM  role
       when launching a cluster with the --awsEc2ProfileArn parameter.

          {
              "Version": "2012-10-17",
              "Statement": [
                  {
                      "Effect": "Allow",
                      "Action": [
                          "ec2:*",
                          "s3:*",
                          "sdb:*",
                          "iam:PassRole"
                      ],
                      "Resource": "*"
                  }
              ]
          }
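
       When supplying a custom role, the policy can be checked against the list above before launching. The
       following sketch compares action strings literally; it does not interpret wildcards, conditions or the
       Resource field, so a narrower policy that is nonetheless sufficient would be reported as incomplete.
       The function names are hypothetical.

```python
REQUIRED_ACTIONS = ("ec2:*", "s3:*", "sdb:*", "iam:PassRole")


def allowed_actions(policy):
    """Collect the action strings of every Allow statement in a policy."""
    actions = set()
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):    # a single statement may be unwrapped
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        action = statement.get("Action", [])
        actions.update([action] if isinstance(action, str) else action)
    return actions


def missing_actions(policy):
    """Required actions that the policy does not list verbatim."""
    allowed = allowed_actions(policy)
    return [action for action in REQUIRED_ACTIONS if action not in allowed]


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["ec2:*", "s3:*", "iam:PassRole"], "Resource": "*"}
    ],
}
print(missing_actions(policy))  # prints: ['sdb:*']
```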

AUTO-DEPLOYMENT

       If you want to run your workflow in a distributed environment, on multiple worker machines, either in the
       cloud or on a bare-metal cluster, your script needs to be made available to those other machines. If your
       script  imports  other  modules,  those  modules  also need to be made available on the workers. Toil can
       automatically do that for you, with a little help on your part. We call this feature auto-deployment of a
       workflow.

       Let's first examine various scenarios of auto-deploying a workflow. Toil itself, as we'll see shortly,
       cannot be auto-deployed. Lastly, we'll deal with the issue of declaring Toil as a dependency of a
       workflow that is packaged as a setuptools distribution.

       Toil can be easily deployed to a remote host. First, assuming you've followed our prepareAWS section to
       install Toil and used it to create a remote leader node on (in this example) AWS, you can now log into
       this node using sshCluster. Once on the remote host, create and activate a virtualenv (making sure to
       use the --system-site-packages option!):

          $ virtualenv --system-site-packages venv
          $ . venv/bin/activate

       Note the --system-site-packages option, which ensures that  globally-installed  packages  are  accessible
       inside  the  virtualenv.   Do  not  (re)install  Toil  after this!  The --system-site-packages option has
       already transferred Toil and the dependencies from your local installation of Toil for you.

       From here, you can install a project and its dependencies:

          $ tree
          .
          ├── util
          │   ├── __init__.py
          │   └── sort
          │       ├── __init__.py
          │       └── quick.py
          └── workflow
              ├── __init__.py
              └── main.py

          3 directories, 5 files
          $ pip install matplotlib
          $ cp -R workflow util venv/lib/python2.7/site-packages

       Ideally, your project would have a setup.py file (see  setuptools)  which  streamlines  the  installation
       process:

          $ tree
          .
          ├── util
          │   ├── __init__.py
          │   └── sort
          │       ├── __init__.py
          │       └── quick.py
          ├── workflow
          │   ├── __init__.py
          │   └── main.py
          └── setup.py

          3 directories, 6 files
          $ pip install .

       Or, if your project has been published to PyPI:

          $ pip install my-project

       In  each case, we have created a virtualenv with the --system-site-packages flag in the venv subdirectory
       then installed the matplotlib distribution from PyPI  along  with  the  two  packages  that  our  project
       consists of. (Again, both Python and Toil are assumed to be present on the leader and all worker nodes.)

       We can now run our workflow:

          $ python main.py --batchSystem=mesos …

       IMPORTANT:
          If the workflow's external dependencies contain native code (i.e. are not pure Python) then they
          must be manually installed on each worker.

       WARNING:
          Neither python setup.py develop nor pip install -e . can be  used  in  this  process  as,  instead  of
          copying  the  source files, they create .egg-link files that Toil can't auto-deploy. Similarly, python
          setup.py install doesn't work either as it installs the project as a Python .egg  which  is  also  not
          currently supported by Toil (though it could be in the future).

          Also  note  that  using  the  --single-version-externally-managed  flag with setup.py will prevent the
          installation of your package as an .egg. It will also  disable  the  automatic  installation  of  your
          project's dependencies.

   Auto Deployment with Sibling Modules
       This scenario applies if the user script imports modules that are its siblings:

          $ cd my_project
          $ ls
          userScript.py utilities.py
          $ ./userScript.py --batchSystem=mesos …

       Here userScript.py imports additional functionality from utilities.py.  Toil detects that userScript.py
       has sibling modules and copies them to the workers, alongside the user script. Note that sibling
       modules will be auto-deployed regardless of whether they are actually imported by the user script: all
       .py files residing in the same directory as the user script will be auto-deployed.

       Sibling modules are a suitable method of organizing the source code of reasonably complicated workflows.
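
       To see which files this rule would cover for a given script, the sketch below lists the .py files
       that sit next to it. It only illustrates the rule stated above and is not Toil's own implementation;
       sibling_modules is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def sibling_modules(user_script):
    """List the .py files in the same directory as `user_script`, excluding it.

    Hypothetical helper; it mirrors the rule that every .py file next to the
    user script is deployed, whether or not the script imports it.
    """
    script = Path(user_script).resolve()
    return sorted(
        path.name
        for path in script.parent.glob("*.py")
        if path.is_file() and path != script
    )


# Example: recreate the my_project directory from above in a temporary location.
with tempfile.TemporaryDirectory() as project:
    for name in ("userScript.py", "utilities.py", "notes.txt"):
        Path(project, name).touch()
    print(sibling_modules(Path(project, "userScript.py")))  # ['utilities.py']
```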

   Auto-Deploying a Package Hierarchy
       Recall that in Python, a package is a directory containing one or more .py files (one of which must be
       called __init__.py) and optionally other packages. For more involved workflows that contain a
       significant amount of code, this is the recommended way of organizing the source code. Because we use
       a package hierarchy, we can't really refer to the user script as such; we call it the user module
       instead. It is merely one of the modules in the package hierarchy. We need to inform Toil that we want
       to use a package hierarchy by invoking Python's -m option. That enables Toil to identify the entire
       set of modules belonging to the workflow and copy all of them to each worker. Note that while using
       the -m option is optional in the scenarios above, it is mandatory in this one.

       The following shell session illustrates this:

          $ cd my_project
          $ tree
          .
          ├── util
          │   ├── __init__.py
          │   └── sort
          │       ├── __init__.py
          │       └── quick.py
          └── workflow
              ├── __init__.py
              └── main.py

          3 directories, 5 files
          $ python -m workflow.main --batchSystem=mesos …

       Here the user module main.py does not reside in the current directory, but is part of a package called
       workflow, in a subdirectory of the current directory. Additional functionality is in a separate module
       called util.sort.quick, which corresponds to util/sort/quick.py. Because we invoke the user module via
       python -m workflow.main, Toil can determine the root directory of the hierarchy (my_project in this
       case) and copy all Python modules underneath it to each worker. The -m option is described in the
       Python command-line documentation.

       When -m is passed, Python adds the current working directory to sys.path, the list of root directories to
       be considered when resolving a module name like workflow.main. Without that added convenience  we'd  have
       to  run  the  workflow as PYTHONPATH="$PWD" python -m workflow.main. This also means that Toil can detect
       the root directory of the user module's package hierarchy even if it isn't the current working directory.
       In other words we could do this:

          $ cd my_project
          $ export PYTHONPATH="$PWD"
          $ cd /some/other/dir
          $ python -m workflow.main --batchSystem=mesos …

       Also note that the root directory itself must not be a package, i.e. must not contain an __init__.py.
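
       The relationship between a module name, its file, and the root directory can be sketched as follows.
       This illustrates the layout rules described above and is not Toil's actual code; hierarchy_root is a
       hypothetical helper, and it only handles plain modules such as workflow.main, not a package's
       __init__.py.

```python
import tempfile
from pathlib import Path


def hierarchy_root(module_file, module_name):
    """Return the root directory for `module_name` defined in `module_file`.

    Hypothetical helper: it strips one directory level per component of the
    dotted module name, then checks the layout rules described above.
    """
    path = Path(module_file).resolve()
    components = module_name.split(".")
    if path.stem != components[-1]:
        raise ValueError("%s does not define module %s" % (path, module_name))
    # workflow.main in my_project/workflow/main.py -> my_project
    root = path.parents[len(components) - 1]
    # Every directory between the root and the module must be a package.
    package_dir = path.parent
    while package_dir != root:
        if not (package_dir / "__init__.py").is_file():
            raise ValueError("%s is missing __init__.py" % package_dir)
        package_dir = package_dir.parent
    # The root directory itself must not be a package.
    if (root / "__init__.py").exists():
        raise ValueError("root directory %s must not contain __init__.py" % root)
    return root


# Example: recreate the my_project tree from above in a temporary location.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp, "my_project")
    for package in ("workflow", "util", "util/sort"):
        Path(project, package).mkdir(parents=True)
        Path(project, package, "__init__.py").touch()
    main = Path(project, "workflow", "main.py")
    main.touch()
    print(hierarchy_root(main, "workflow.main").name)  # my_project
```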

   Relying on Shared Filesystems
       Bare-metal  clusters  typically mount a shared file system like NFS on each node.  If every node has that
       file system mounted at the same path, you can place your project on that shared filesystem and  run  your
       user script from there.  Additionally, you can clone the Toil source tree into a directory on that shared
       file system and you won't even need to install Toil on every worker. Be sure to  add  both  your  project
       directory and the Toil clone to PYTHONPATH. Toil replicates PYTHONPATH from the leader to every worker.

          Using a shared filesystem

                 Toil currently only supports a tempdir set to a local, non-shared directory.

   Toil Appliance
       The  term  Toil Appliance refers to the Mesos Docker image that Toil uses to simulate the machines in the
       virtual Mesos cluster.  It's easily deployed, only needs Docker, and allows for workflows to  be  run  in
       single-machine  mode  and  for  clusters of VMs to be provisioned.  To specify a different image, see the
       Toil envars section.  For more information on the Toil Appliance, see the runningAWS section.

ENVIRONMENT VARIABLES

       There are several environment variables that affect the way Toil runs.

                       ┌────────────────────────────────┬───────────────────────────────────────┐
                       │TOIL_CHECK_ENV                  │ A flag that determines  whether  Toil │
                       │                                │ will  try  to  refer back to a Python │
                       │                                │ virtual environment in  which  it  is │
                       │                                │ installed   when  composing  commands │
                       │                                │ that may be run on  other  hosts.  If │
                       │                                │ set  to True, if Toil is installed in │
                       │                                │ the current virtual  environment,  it │
                       │                                │ will  use  absolute  paths to its own │
                       │                                │ executables    (and    the    virtual │
                       │                                │ environment must thus be available at │
                       │                                │ the   same   path   on   all  nodes). │
                       │                                │ Otherwise,   Toil  internal  commands │
                       │                                │ such as _toil_worker will be resolved │
                       │                                │ according  to  the  PATH  on the node │
                       │                                │ where they are executed. This setting │
                       │                                │ can   be   useful  in  a  shared  HPC │
                       │                                │ environment,  where  users  may  have │
                       │                                │ their   own   Toil  installations  in │
                       │                                │ virtual environments.                 │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_WORKDIR                    │ An absolute path to a directory where │
                       │                                │ Toil  will write its temporary files. │
                       │                                │ This directory  must  exist  on  each │
                       │                                │ worker  node  and  may  be  set  to a │
                       │                                │ different value on each  worker.  The │
                       │                                │ --workDir    command    line   option │
                       │                                │ overrides  this.  On   Mesos   nodes, │
                       │                                │ TOIL_WORKDIR  generally  defaults  to │
                       │                                │ the   Mesos   sandbox,   except    on │
                       │                                │ CGCloud-provisioned  nodes  where  it │
                       │                                │ defaults to  /var/lib/mesos.  In  all │
                       │                                │ other  cases,  the  system's standard │
                       │                                │ temporary directory is used.          │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_KUBERNETES_OWNER           │ A name prefix for easy identification │
                       │                                │ of  Kubernetes jobs. If not set, Toil │
                       │                                │ will use the current user name.       │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_APPLIANCE_SELF             │ The fully qualified reference for the │
                       │                                │ Toil  Appliance  you  wish to use, in │
                       │                                │ the       form        REPO/IMAGE:TAG. │
                       │                                │ quay.io/ucsc_cgl/toil:3.6.0       and │
                       │                                │ cket/toil:3.5.0 are both examples  of │
                       │                                │ valid options. Note that since Docker │
                       │                                │ defaults  to  Dockerhub  repos,  only │
                       │                                │ quay.io  repos  need to specify their │
                       │                                │ registry.                             │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_DOCKER_REGISTRY            │ The URL of the registry of  the  Toil │
                       │                                │ Appliance  image  you  wish  to  use. │
                       │                                │ Docker will use Dockerhub by default, │
                       │                                │ but the quay.io registry is also very │
                       │                                │ popular  and  easily  specifiable  by │
                       │                                │ setting this option to quay.io.       │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_DOCKER_NAME                │ The  name of the Toil Appliance image │
                       │                                │ you wish to use.  Generally  this  is │
                       │                                │ simply   toil   but  this  option  is │
                       │                                │ provided to override this, since  the │
                       │                                │ image  can  be  built  with arbitrary │
                       │                                │ names.                                │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_AWS_SECRET_NAME            │ For the Kubernetes batch system,  the │
                       │                                │ name  of  a  Kubernetes  secret which │
                       │                                │ contains a credentials file  granting │
                       │                                │ access  to  AWS  resources.  Will  be │
                       │                                │ mounted     as     ~/.aws      inside │
                       │                                │ Kubernetes-managed  Toil  containers. │
                       │                                │ Enables the AWSJobStore  to  be  used │
                       │                                │ with  the Kubernetes batch system, if │
                       │                                │ the credentials allow  access  to  S3 │
                       │                                │ and SimpleDB.                         │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_AWS_ZONE                   │ The EC2 zone to provision nodes in if │
                       │                                │ using Toil's provisioner.             │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_AWS_AMI                    │ ID  of  the  AMI  to  use   in   node │
                       │                                │ provisioning.  If in doubt, don't set │
                       │                                │ this variable.                        │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_AWS_NODE_DEBUG             │ Determines whether to preserve  nodes │
                       │                                │ that  have  failed  health checks. If │
                       │                                │ set to  True,  nodes  that  fail  EC2 │
                       │                                │ health  checks  won't  immediately be │
                       │                                │ terminated so they  can  be  examined │
                       │                                │ and  the cause of failure determined. │
                       │                                │ If any EC2 nodes are left  behind  in │
                       │                                │ this  manner, the security group will │
                       │                                │ also be left behind by  necessity  as │
                       │                                │ it   cannot   be  deleted  until  all │
                       │                                │ associated    nodes     have     been │
                       │                                │ terminated.                           │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_SLURM_ARGS                 │ Arguments  for  sbatch  for the slurm │
                       │                                │ batch system.  Do  not  pass  CPU  or │
                       │                                │ memory specifications here.  Instead, │
                       │                                │ define resource requirements for  the │
                       │                                │ job.   There  is no default value for │
                       │                                │ this variable.                        │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_GRIDENGINE_ARGS            │ Arguments for qsub for the gridengine │
                       │                                │ batch  system.  Do  not  pass  CPU or │
                       │                                │ memory specifications here.  Instead, │
                       │                                │ define  resource requirements for the │
                       │                                │ job. There is no  default  value  for │
                       │                                │ this variable.                        │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_GRIDENGINE_PE              │ Parallel  environment  arguments  for │
                       │                                │ qsub and  for  the  gridengine  batch │
                       │                                │ system. There is no default value for │
                       │                                │ this variable.                        │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_TORQUE_ARGS                │ Arguments for  qsub  for  the  Torque │
                       │                                │ batch  system.   Do  not  pass CPU or │
                       │                                │ memory specifications here.  Instead, │
                       │                                │ define  extra  parameters for the job │
                       │                                │ such as queue. Example: -q medium Use │
                       │                                │ TOIL_TORQUE_REQS to pass extra values │
                       │                                │ for  the  -l  resource   requirements │
                       │                                │ parameter.  There is no default value │
                       │                                │ for this variable.                    │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_TORQUE_REQS                │ Arguments    for     the     resource │
                       │                                │ requirements for Torque batch system. │
                       │                                │ Do   not   pass   CPU    or    memory │
                       │                                │ specifications  here. Instead, define │
                       │                                │ extra  resource  requirements  as   a │
                       │                                │ string   that   goes   after  the  -l │
                       │                                │ argument    to     qsub.     Example: │
                       │                                │ walltime=2:00:00,file=50gb  There  is │
                       │                                │ no default value for this variable.   │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_LSF_ARGS                   │ Additional arguments for  LSF's  bsub │
                       │                                │ command.  Do  not  pass CPU or memory │
                       │                                │ specifications here. Instead,  define │
                       │                                │ extra parameters for the job such  as │
                       │                                │ queue.  Example:  -q medium. There is │
                       │                                │ no default value for this variable.   │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_HTCONDOR_PARAMS            │ Additional parameters to  include  in │
                       │                                │ the  HTCondor  submit  file passed to │
                       │                                │ condor_submit. Do  not  pass  CPU  or │
                       │                                │ memory  specifications  here. Instead │
                       │                                │ define extra parameters which may  be │
                       │                                │ required  by HTCondor.  This variable │
                       │                                │ is parsed  as  a  semicolon-separated │
                       │                                │ string  of  parameter  = value pairs. │
                       │                                │ Example:        requirements        = │
                       │                                │ TARGET.has_sse4_2       ==      true; │
                       │                                │ accounting_group = test.  There is no │
                       │                                │ default value for this variable.      │
                       ├────────────────────────────────┼───────────────────────────────────────┤
                       │TOIL_CUSTOM_DOCKER_INIT_COMMAND │ Any custom bash command to run in the │
                       │                                │ Toil  docker   container   prior   to │
                       │                                │ running  the  Toil  services.  Can be │
                       │                                │ used for any custom initialization in │
                       │                                │ the  worker and/or primary nodes such │
                       │                                │ as   private  docker  authentication. │
                       │                                │ Example for AWS ECR:                  │
                       │                                │ pip install awscli && eval $(aws  ecr │
                       │                                │ get-login --no-include-email --region │
                       │                                │ us-east-1).                           │
                       └────────────────────────────────┴───────────────────────────────────────┘
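
       As an illustration of two of the formats described in the table, the sketch below splits a
       TOIL_HTCONDOR_PARAMS value into parameter and value pairs and picks a work directory in the stated
       order of precedence. It is not Toil's own parsing code; both helper names are hypothetical, and the
       Mesos-specific defaults for TOIL_WORKDIR are deliberately left out.

```python
import os
import tempfile


def parse_htcondor_params(value):
    """Split 'a = b; c = d' into {'a': 'b', 'c': 'd'} (hypothetical helper)."""
    params = {}
    for item in value.split(";"):
        if not item.strip():
            continue  # tolerate empty segments such as a trailing semicolon
        # Split on the first '=' only, so values may themselves contain '=='.
        name, separator, setting = item.partition("=")
        if not separator:
            raise ValueError("expected 'parameter = value', got %r" % item)
        params[name.strip()] = setting.strip()
    return params


def resolve_work_dir(cli_work_dir=None, environ=os.environ):
    """Pick the work directory: --workDir first, then TOIL_WORKDIR, then the
    system's standard temporary directory (Mesos defaults are not modelled)."""
    return cli_work_dir or environ.get("TOIL_WORKDIR") or tempfile.gettempdir()


# The example value from the TOIL_HTCONDOR_PARAMS row of the table.
example = "requirements = TARGET.has_sse4_2 == true; accounting_group = test"
print(parse_htcondor_params(example))
# {'requirements': 'TARGET.has_sse4_2 == true', 'accounting_group': 'test'}
```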

AUTHOR

       UCSC Computational Genomics Lab

COPYRIGHT

       2020 – 2020 UCSC Computational Genomics Lab