Provided by: gridengine-common_8.1.9+dfsg-7build1_all

NAME

       accounting - Grid Engine accounting file format

DESCRIPTION

       An accounting record is written to the Grid Engine accounting file $SGE_ROOT/$SGE_CELL/common/accounting
       for each finished job if accounting=true is specified in the sge_conf(5) reporting_params.   This  occurs
       at  intervals of the accounting_flush_time specified in the same place.  The accounting file is processed
       by qacct(1) to derive accounting statistics.

       If output to the reporting(5) file is enabled, accounting records containing  similar  data  are  written
       there.   They  include  "intermediate"  records  written at midnight for long-running jobs, not just ones
       written at the end of the jobs, and so may be more appropriate to process  for  some  purposes  than  the
       accounting file.

FORMAT

       Each job is represented by a line in the accounting file. Empty lines and lines containing one
       character or less are ignored by qacct. Accounting record entries are separated by colon (':')
       characters. In order of appearance, the entries denote:

   1. qname
       Name of the cluster queue in which the job has run.

   2. hostname
       Name of the execution host.

   3. group
       The effective group id of the job owner when executing the job.

   4. owner
       Owner of the Grid Engine job.

   5. job_name
       Job name.

   6. job_number
       Job identifier (job number).

   7. account
       An account string as specified by the qsub(1) or qalter(1) -A option.

   8. priority
       Priority  value  assigned  to the job, corresponding to the priority parameter in the queue configuration
       (see queue_conf(5)).

   9. submission_time
       Submission time in seconds since the Unix epoch (1970-01-01 00:00:00 UTC).

   10. start_time
       Start time in seconds since the epoch.

   11. end_time
       End time in seconds since the epoch.

   12. failed
       Indicates the problem which occurred in case a job failed (at the system level, as  opposed  to  the  job
       script  or  binary having non-zero exit status, see below).  Possibly the job could not be started on the
       execution host (e.g. because the owner of the job did not have a  valid  account  on  that  machine),  or
       didn't finish successfully (e.g. because an execution host crashed).  If Grid Engine tries to start a job
       multiple times, there may be multiple entries in the accounting file corresponding to the same job ID.
       See sge_status(5) for a list.

   13. exit_status
       Exit  status of the job script (or Grid Engine-specific status in case of certain error conditions).  The
       exit status is determined by following the normal shell conventions.  If the command terminates  normally
       the  value  of the command is its exit status.  However, in the case that the command exits abnormally, a
       value of 0200 (octal), i.e. 128 (decimal), is added to the value of the command to make up the exit status.

       For example: If a job dies through signal 9 (SIGKILL) - probably issued by Grid Engine  through  qdel(1),
       or  because  the  job  exceeded  time or memory hard limits - then the exit status is 128 + 9 = 137.  The
       reason Grid Engine killed a job is recorded in the execd messages file at "W" or "I" level, depending  on
       why it was killed.
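
       The arithmetic above can be reversed to recover a signal number from an exit status. The following is
       a minimal Python sketch, assuming the value follows the shell convention described here.
       describe_exit_status is a hypothetical helper, not part of Grid Engine. A script that itself exits
       with a status above 128, or a Grid Engine-specific status, would be misread as a signal.

```python
import signal

def describe_exit_status(exit_status):
    """Return (kind, value, signal_name) for an exit_status entry."""
    if exit_status > 128:
        # Abnormal termination: 128 was added to the signal number.
        signum = exit_status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = "unknown"
        return ("signal", signum, name)
    # Normal termination: the value is the command's own exit status.
    return ("exit", exit_status, None)

killed = describe_exit_status(137)   # the 128 + 9 example from the text
normal = describe_exit_status(1)
```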

   14. ru_wallclock
       Difference between end_time and start_time (see above), except that if the job fails, it is zero.
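
       The three time entries can be combined to derive queue wait time and run time. The following is a
       Python sketch under the assumption that the entries have already been converted to integers.
       job_timing and its key names are hypothetical, and the sample values are made up. Because
       ru_wallclock is zero for failed jobs, run_seconds computed this way may differ from it.

```python
from datetime import datetime, timezone

def job_timing(submission_time, start_time, end_time):
    """Derive readable timestamps and durations from epoch-second entries."""
    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    return {
        "submitted": to_utc(submission_time),
        "started": to_utc(start_time),
        "ended": to_utc(end_time),
        # Time spent pending before the job was started.
        "wait_seconds": start_time - submission_time,
        # Same difference that ru_wallclock reports for successful jobs.
        "run_seconds": end_time - start_time,
    }

# Made-up values: submitted, started 60 s later, ran for one hour.
timing = job_timing(1500000000, 1500000060, 1500003660)
```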

   15. ru_utime
   16. ru_stime
   17. ru_maxrss
   18. ru_ixrss
   19. ru_ismrss
   20. ru_idrss
   21. ru_isrss
   22. ru_minflt
   23. ru_majflt
   24. ru_nswap
   25. ru_inblock
   26. ru_oublock
   27. ru_msgsnd
   28. ru_msgrcv
   29. ru_nsignals
   30. ru_nvcsw
   31. ru_nivcsw
       These  entries  follow  the  contents of the standard Unix rusage structure as described in getrusage(2).
       Depending on the operating system where the job was executed, some of the fields may be 0.

   32. project
       The project which was assigned to the job.

   33. department
       The department which was assigned to the job.

   34. granted_pe
       The parallel environment which was selected for the job.

   35. slots
       The number of slots which were dispatched to the job by the scheduler.

   36. task_number
       Array job task index number.

   37. cpu
       The CPU time usage in seconds.  The value may be affected by the ACCT_RESERVED_USAGE execd parameter (see
       sge_conf(5)).

   38. mem
       The integral memory usage in Gbyte-seconds (gigabytes multiplied by seconds). The value may be affected
       by the ACCT_RESERVED_USAGE execd parameter (see sge_conf(5)).

   39. io
       The amount of data transferred in input/output operations in GB (if available, otherwise 0).   On  Linux,
       this  is summed over calls to read(2), pread(2), write(2), and pwrite(2); thus it includes i/o via cache,
       and may not reflect data actually written to the file system.

   40. category
       A string specifying the job category.  This contains a space-separated pseudo options list for  the  job,
       with components as follows:

       -U user_list
              An owner/group ACL list composed from host_conf(5), sge_pe(5), and queue_conf(5)
              user_lists/xuser_lists entries.  Entries from sge_conf(5) are not considered since they  can  only
              cause  a  job  to be accepted/rejected at submit time.  Omitted if there are no such configuration
              entries.

       -P project_list
              Like -U, but for project/xproject entries.

       -u owner
              The owner's user name, if it was referenced in any RQS (see  sge_resource_quota(5)).   Omitted  if
              there was no such reference.

       -q queue_list
              The hard queue list (only if one was specified).

       -masterq queue_list
              The master queue list (only if one was specified).

       -l resource_list
              The hard resource list (only if hard resources were specified).

       -soft -l resource_list
              The soft resource list (only if soft resources were specified).

       -pe pe_name pe_range
              The parallel environment specified for the job (only for parallel jobs).

       -ckpt ckpt_name
              The job's checkpointing environment (only if one was specified).

       -I y   Present only for interactive jobs.

       -ar ar_id
              The advance reservation into which the job was submitted (only if one was specified).
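
       The components listed above can be split into a dictionary. The following is a Python sketch that
       assumes the category string contains only these components, that no argument contains a space, and
       that -soft applies to the -l immediately after it. parse_category, the soft_l key, and the sample
       string are hypothetical.

```python
def parse_category(category):
    """Parse the space-separated pseudo options list into a dict."""
    tokens = category.split()
    options = {}
    soft = False
    i = 0
    while i < len(tokens):
        opt = tokens[i]
        if opt == "-soft":
            # Modifier: marks the following -l as the soft resource list.
            soft = True
            i += 1
            continue
        # -pe carries two arguments (pe_name, pe_range); the rest carry one.
        nargs = 2 if opt == "-pe" else 1
        args = tokens[i + 1 : i + 1 + nargs]
        if not opt.startswith("-") or len(args) != nargs:
            raise ValueError(f"unexpected category component: {opt!r}")
        key = opt[1:]
        if key == "l" and soft:
            key = "soft_l"
        soft = False
        options[key] = args[0] if nargs == 1 else tuple(args)
        i += 1 + nargs
    return options

# Made-up category string for illustration.
example = parse_category(
    "-u alice -q all.q -l h_rt=3600 -soft -l arch=lx-amd64 -pe mpi 4-8"
)
```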

   41. iow
       The input/output wait time in seconds (if available, otherwise 0).

   42. pe_taskid
       If  this  identifier  is  not  equal to NONE, the task was part of a parallel job, and was passed to Grid
       Engine via the qrsh -inherit interface.  Such records are not produced  if  the  PE's  accounting_summary
       parameter is true (see sge_pe(5)).

   43. maxvmem
       The  maximum  vmem  size  in bytes.  The value may be affected by the ACCT_RESERVED_USAGE execd parameter
       (see sge_conf(5)).

   44. arid
       Advance reservation identifier. If the job used the resources of an advance reservation, then this  field
       contains a positive integer identifier; otherwise the value is "0".

   45. ar_sub_time
       Advance  reservation  submission  time if the job uses the resources of an advance reservation; otherwise
       "0".

FILES

       $SGE_ROOT/$SGE_CELL/common/accounting
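
       Records in this file can be read by splitting each line on colons and pairing the entries with the
       field names from FORMAT. The following is a Python sketch, assuming that no entry contains an
       embedded colon and that any entries beyond the 45 documented here can be dropped. FIELDS and
       iter_records are hypothetical names, and the sample record is made up.

```python
import io

# Field names in the order given in the FORMAT section.
FIELDS = """
qname hostname group owner job_name job_number account priority
submission_time start_time end_time failed exit_status ru_wallclock
ru_utime ru_stime ru_maxrss ru_ixrss ru_ismrss ru_idrss ru_isrss
ru_minflt ru_majflt ru_nswap ru_inblock ru_oublock ru_msgsnd ru_msgrcv
ru_nsignals ru_nvcsw ru_nivcsw project department granted_pe slots
task_number cpu mem io category iow pe_taskid maxvmem arid ar_sub_time
""".split()

def iter_records(lines):
    """Yield one dict per accounting record, keyed by field name."""
    for line in lines:
        line = line.rstrip("\n")
        # Lines of one character or less are ignored, as qacct does.
        if len(line) <= 1:
            continue
        values = line.split(":")
        if len(values) < len(FIELDS):
            raise ValueError(
                f"expected {len(FIELDS)} entries, got {len(values)}"
            )
        # Assumption: no embedded colons; zip drops any extra entries.
        yield dict(zip(FIELDS, values))

# Made-up record whose entries are simply their positions 1..45.
sample = io.StringIO("\n" + ":".join(str(n) for n in range(1, 46)) + "\n")
records = list(iter_records(sample))
```

       To read a real file, pass an open file object for the path above in place of the sample; all values
       are returned as strings and need converting before arithmetic.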

SEE ALSO

       sge_intro(1),  qacct(1),  qalter(1),  qsub(1),  getrusage(2),  queue_conf(5),   sge_conf(5),   sge_pe(5),
       sge_status(5), reporting(5).

       See sge_intro(1) for a full statement of rights and permissions.