Provided by: erlang-manpages_18.3-dfsg-1ubuntu3.1_all

NAME

       heart - Heartbeat Monitoring of an Erlang Runtime System

DESCRIPTION

       This module contains the interface to the heart process. heart sends periodic heartbeats
       to an external port program, which is also named heart. The  purpose  of  the  heart  port
       program  is to check that the Erlang runtime system it is supervising is still running. If
       the port program  has  not  received  any  heartbeats  within  HEART_BEAT_TIMEOUT  seconds
       (default  is 60 seconds), the system can be rebooted. Also, if the system is equipped with
       a hardware watchdog timer and is running Solaris, the watchdog can be  used  to  supervise
       the entire system.

       An Erlang runtime system that is to be monitored by a heart program should be started with
       the command line flag -heart (see also erl(1)). The heart process is then started
       automatically:

       % erl -heart ...

       If the system is to be rebooted because of missing heartbeats or a terminated Erlang
       runtime system, the environment variable HEART_COMMAND must be set before the system is
       started. If this variable is not set, a warning text is printed but the system does not
       reboot. However, if the hardware watchdog is used, it still triggers a reboot
       HEART_BEAT_BOOT_DELAY seconds later (default is 60).

       To reboot on the Windows platform, HEART_COMMAND can be set to heart -shutdown (included in
       the Erlang delivery) or to any other suitable program that can activate a reboot.

       The  hardware  watchdog  will  not  be  started  under Solaris if the environment variable
       HW_WD_DISABLE is set.

       The HEART_BEAT_TIMEOUT and HEART_BEAT_BOOT_DELAY environment variables configure the heart
       timeouts. They can be set in the operating system shell before Erlang is started or be
       specified at the command line:

       % erl -heart -env HEART_BEAT_TIMEOUT 30 ...

       The value (in seconds) must be in the range 10 < X <= 65535.
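
       The range above can be checked before a node is started or from inside a running node.
       The following is a minimal sketch only: the module name heart_timeout_example, its
       function names, and its return values are hypothetical and not part of heart.

```erlang
-module(heart_timeout_example).
-export([valid_timeout/1, configured_timeout/0]).

%% True if X lies in the documented range 10 < X =< 65535 (seconds).
valid_timeout(X) ->
    is_integer(X) andalso X > 10 andalso X =< 65535.

%% Inspects HEART_BEAT_TIMEOUT as seen by the running emulator.
%% The return shapes (not_set, {ok, N}, {error, _}) are illustrative only.
configured_timeout() ->
    case os:getenv("HEART_BEAT_TIMEOUT") of
        false ->
            not_set;
        Str ->
            case string:to_integer(Str) of
                {N, []} when is_integer(N) ->
                    case valid_timeout(N) of
                        true  -> {ok, N};
                        false -> {error, {out_of_range, N}}
                    end;
                _ ->
                    {error, {not_an_integer, Str}}
            end
    end.
```

       For example, heart_timeout_example:valid_timeout(30) is true, while a value of 10 is
       rejected because the lower bound is exclusive.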

       Note that if the system clock is adjusted by more than HEART_BEAT_TIMEOUT seconds, heart
       times out and tries to reboot the system. This can happen, for example, if the system
       clock is adjusted automatically by use of NTP (Network Time Protocol).

       If a crash occurs, an erl_crash.dump is not written unless the environment variable
       ERL_CRASH_DUMP_SECONDS is set:

       % erl -heart -env ERL_CRASH_DUMP_SECONDS 10 ...

       If a regular core dump is wanted, let heart know by setting the kill signal to abort using
       the environment variable HEART_KILL_SIGNAL=SIGABRT. If the variable is unset or not set to
       SIGABRT, the default behaviour is a kill signal using SIGKILL.

       % erl -heart -env HEART_KILL_SIGNAL SIGABRT ...

       Furthermore, ERL_CRASH_DUMP_SECONDS affects the behaviour of heart as follows:

         ERL_CRASH_DUMP_SECONDS=0:
           Suppresses the writing of a crash dump file entirely, thus rebooting the runtime
           system immediately. This is the same as not setting the environment variable.

         ERL_CRASH_DUMP_SECONDS=-1:
           Setting the environment variable to a negative value delays the reboot of the runtime
           system until the crash dump file has been completely written.

         ERL_CRASH_DUMP_SECONDS=S:
           heart waits for S seconds to let the crash dump file be written. After S seconds,
           heart reboots the runtime system regardless of whether the crash dump file has been
           written or not.

       In the following descriptions, all functions fail with reason badarg if heart is not
       started.

DATA TYPES

       heart_option() = check_schedulers

EXPORTS

       set_cmd(Cmd) -> ok | {error, {bad_cmd, Cmd}}

              Types:

                 Cmd = string()

              Sets a temporary reboot command. This command is used if a HEART_COMMAND other than
              the one specified with the environment variable is to be used to reboot the
              system. The new Erlang runtime system uses (if it misbehaves) the environment
              variable HEART_COMMAND to reboot.

              Limitations: The Cmd command string is sent to the heart program as an ISO-latin-1
              or UTF-8 encoded binary, depending on the file name encoding mode of the emulator
              (see file:native_name_encoding/0). The size of the encoded binary must be less
              than 2047 bytes.
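
              The size limitation can be checked before set_cmd/1 is called. The following is
              a minimal sketch only: the module heart_cmd_example and its functions are
              hypothetical helpers, and the encoding step is an approximation of what the text
              above describes, not the code that heart itself runs.

```erlang
-module(heart_cmd_example).
-export([encoded_size/1, fits/1, set_checked/1]).

%% Size in bytes of Cmd once encoded as ISO-latin-1 or UTF-8, chosen by
%% file:native_name_encoding/0. Returns error if Cmd cannot be encoded.
encoded_size(Cmd) ->
    Enc = file:native_name_encoding(),   % latin1 | utf8
    case unicode:characters_to_binary(Cmd, unicode, Enc) of
        Bin when is_binary(Bin) -> byte_size(Bin);
        _NotEncodable -> error
    end.

%% True if the encoded command is below the documented 2047-byte limit.
fits(Cmd) ->
    case encoded_size(Cmd) of
        N when is_integer(N) -> N < 2047;
        error -> false
    end.

%% Calls heart:set_cmd/1 only for commands that pass the local check.
%% heart must be running (node started with -heart), otherwise the call
%% to heart:set_cmd/1 fails with badarg.
set_checked(Cmd) ->
    case fits(Cmd) of
        true  -> heart:set_cmd(Cmd);
        false -> {error, {bad_cmd, Cmd}}
    end.
```

              On a node started with -heart, heart_cmd_example:set_checked("heart -shutdown")
              would pass the command on to heart:set_cmd/1. A command that fails the local
              check is rejected without contacting the heart process.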

       clear_cmd() -> ok

              Clears the temporary reboot command. If the system terminates, the normal
              HEART_COMMAND is used to reboot.

       get_cmd() -> {ok, Cmd}

              Types:

                 Cmd = string()

              Gets the temporary reboot command. If the command is cleared, the empty string is
              returned.

       set_callback(Module, Function) ->
                       ok | {error, {bad_callback, {Module, Function}}}

              Types:

                 Module = Function = atom()

              This validation callback is executed before any heartbeat is sent to the port
              program. For the validation to succeed, the callback must return the value ok.

              An exception within the callback will be treated as a validation failure.

              The callback will be removed if the system reboots.
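
              A validation callback can be sketched as follows. This is an illustration only:
              the module heart_check_example, the process-count check, and the MAX_PROCS limit
              are hypothetical, and the sketch assumes that heart calls the callback with no
              arguments.

```erlang
-module(heart_check_example).
-export([install/0, validate/0]).

%% Hypothetical health limit, chosen only for illustration.
-define(MAX_PROCS, 100000).

%% Validation callback: returns ok when the node looks healthy.
%% Any other return value, or an exception, counts as a failed validation.
validate() ->
    case erlang:system_info(process_count) < ?MAX_PROCS of
        true  -> ok;
        false -> {error, too_many_processes}
    end.

%% Registers validate/0 with heart. Requires a node started with -heart,
%% otherwise heart:set_callback/2 fails with badarg.
install() ->
    heart:set_callback(?MODULE, validate).
```

              The callback should be fast and free of side effects, because it runs before
              every heartbeat.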

       clear_callback() -> ok

              Removes the validation callback that is called before heartbeats.

       get_callback() -> {ok, {Module, Function}} | none

              Types:

                 Module = Function = atom()

              Gets the validation callback. If the callback is cleared, none is returned.

       set_options(Options) -> ok | {error, {bad_options, Options}}

              Types:

                 Options = [heart_option()]

              Valid options for set_options are:

                check_schedulers:
                  If enabled, a signal is sent to each scheduler to check its responsiveness.
                  The system check occurs before any heartbeat is sent to the port program. If
                  any scheduler is not responsive enough, the heart program does not receive
                  its heartbeat and thus eventually terminates the node.

              Returns with the value ok if the options are valid.
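
              Enabling the scheduler check and reading the options back can be sketched as
              follows. The module heart_options_example and the heart_not_started error are
              hypothetical. The whereis(heart) guard assumes that the heart process is
              registered under the name heart, which this page does not state.

```erlang
-module(heart_options_example).
-export([enable_scheduler_check/0]).

%% Enables check_schedulers and returns the options now in effect.
%% Without the guard, the heart calls fail with badarg on a node that
%% was not started with -heart.
enable_scheduler_check() ->
    case whereis(heart) of
        undefined ->
            {error, heart_not_started};
        _Pid ->
            ok = heart:set_options([check_schedulers]),
            heart:get_options()
    end.
```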

       get_options() -> {ok, Options} | none

              Types:

                 Options = [atom()]

              Returns {ok, Options}, where Options is a list of current options enabled for
              heart. If no options are enabled, none is returned.