Provided by: lam-runtime_7.1.4-3.1build1_amd64

NAME

       bhost - LAM boot schema (host file) format

SYNTAX

       #
       # comments
       #
       <machine> [cpu=<cpucount>] [user=<userid>]
       <machine> [cpu=<cpucount>] [user=<userid>]
        ...

DESCRIPTION

       A  boot  schema describes the machines that will combine to form a multicomputer running LAM.  It is used
       by recon(1) to verify initial conditions for running LAM, by lamboot(1) to start LAM, and  by  lamhalt(1)
       to terminate LAM (note that lamwipe(1) has been deprecated by the lamhalt(1) command).

       The particular syntax of a LAM boot schema is sometimes called the "host file" syntax.  It is line
       oriented.  Each line names one machine, typically by its full Internet domain name, optionally followed
       by the number of CPUs available on that machine and the userid with which to access it.
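
       As an illustration of the line-oriented format, the following Python sketch splits one host file line
       into a machine name and its key=value clauses.  It is not LAM's own parser: the function name
       parse_bhost_line is hypothetical, and the sketch assumes that a comment is any line whose first
       non-blank character is "#" and that clauses are separated by whitespace.

```python
def parse_bhost_line(line):
    """Parse one host file line into (machine, options).

    Returns None for blank lines and comment lines. This is an
    illustrative sketch, not the parser used by lamboot.
    """
    text = line.strip()
    # Assumption: a comment line starts with "#" after optional whitespace.
    if not text or text.startswith("#"):
        return None

    machine, *clauses = text.split()
    options = {}
    for clause in clauses:
        # Each clause is expected to look like key=value, e.g. cpu=2.
        key, sep, value = clause.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed clause: {clause!r}")
        options[key] = value
    return machine, options


if __name__ == "__main__":
    print(parse_bhost_line("beowulf1.cluster.example.com cpu=2"))
    print(parse_bhost_line("somewhere.else.example.com user=guest"))
```

       Values are kept as strings here; a caller would convert cpu to an integer before using it.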

       Common boot schemas for a particular site may be created by the system administrator and placed in the
       installation directory under etc/.  Their names typically start with the prefix bhost.  Individual
       users usually create their own boot schemas, especially if the configurations are simple.

NAME RESOLUTION

       Note that lamboot resolves all names listed in bhost on the node on which lamboot was invoked.  The
       lamboot(1) man page contains information about address resolution, examples of how to handle multiple
       network interface cards (NICs) in a node, etc.

EXAMPLE

       Here is an example four-node boot schema:

       #
       # example LAM host file
       #
       server.cluster.example.com schedule=no
       beowulf1.cluster.example.com cpu=2
       beowulf2.cluster.example.com
       beowulf2.cluster.example.com
       somewhere.else.example.com user=guest

       Note   that   the   "guest"   ID   is   significant,  since  the  user  has  an  alternate  login  ID  on
       somewhere.else.example.com.  Additionally note that beowulf1 has a CPU count of 2 listed (a CPU count  of
       1  is  assumed  if  it  is  not  given).   This  value  is  used  by  mpirun(1),  MPI_Comm_spawn(2),  and
       MPI_Comm_spawn_multiple(2) for the "C" (or CPU) notation that specifies how many ranks to start.  This is
       particularly useful for running on SMP machines.

       Note the schedule=no clause.  This means that LAM will boot a daemon on that node but, by default, will
       not launch any MPI processes on it.  This is handy when you want to control your MPI applications from
       one node (e.g., a server), but don't want to run any MPI processes on it.  In some environments this is
       the default (e.g., BProc).  See the LAM User's Guide for more details.

       beowulf2 is listed twice, but has no specific CPU count listed.  In this case, LAM will keep a running
       tally of the total number of CPUs for that host.  Hence, LAM will calculate that beowulf2 has two CPUs
       available for use.  Calculating the number of CPUs by counting occurrences of a hostname is useful in a
       batch environment where a hostfile may list the same hostname multiple times, indicating that the batch
       scheduler has allocated multiple CPUs for a single job (e.g., PBS operates this way).
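
       The running tally can be sketched in Python as follows.  This is a minimal illustration, not LAM's
       implementation: the function name tally_cpus is hypothetical, and the sketch assumes that every line
       contributes its cpu value (1 when none is given) and that the contributions for a repeated hostname
       are summed.

```python
def tally_cpus(entries):
    """Sum CPU counts per hostname.

    entries is an iterable of (hostname, cpu) pairs, one per host file
    line, where cpu is None when the line has no cpu= clause.
    """
    totals = {}
    for host, cpu in entries:
        # Assumption: a line without cpu= counts as one CPU.
        totals[host] = totals.get(host, 0) + (1 if cpu is None else cpu)
    return totals


# The lines of the example boot schema, reduced to (hostname, cpu).
entries = [
    ("server.cluster.example.com", None),
    ("beowulf1.cluster.example.com", 2),
    ("beowulf2.cluster.example.com", None),
    ("beowulf2.cluster.example.com", None),
    ("somewhere.else.example.com", None),
]
totals = tally_cpus(entries)

if __name__ == "__main__":
    for host, cpus in totals.items():
        print(host, cpus)
```

       Under these assumptions, the two lines for beowulf2 add up to the same count as the single cpu=2 line
       for beowulf1.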

       For the above-mentioned schema, the command "mpirun C foo" would start five instances of the foo program:
       two on beowulf1, two on beowulf2, and one on somewhere.else.  None is started on server, because it is
       marked schedule=no.
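
       The count of five can be reproduced with a short Python sketch.  It is illustrative only: the function
       name c_placement is hypothetical, and assigning instances to hosts in boot schema order is an
       assumption made for the example, not a statement about how mpirun(1) numbers ranks.

```python
def c_placement(hosts):
    """Return one hostname per process that "C" would start.

    hosts is a list of (hostname, cpus, schedulable) tuples in boot
    schema order. Hosts marked schedule=no (schedulable False) are
    skipped; every other host contributes one entry per CPU.
    """
    placement = []
    for host, cpus, schedulable in hosts:
        if schedulable:
            placement.extend([host] * cpus)
    return placement


# The example boot schema after CPU counts have been tallied.
hosts = [
    ("server.cluster.example.com", 1, False),   # schedule=no
    ("beowulf1.cluster.example.com", 2, True),  # cpu=2
    ("beowulf2.cluster.example.com", 2, True),  # listed twice
    ("somewhere.else.example.com", 1, True),    # default of one CPU
]
ranks = c_placement(hosts)

if __name__ == "__main__":
    for index, host in enumerate(ranks):
        print(index, host)
```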

FILES

       $LAMHOME/etc/bhost.def            default boot schema file

SEE ALSO

       LAM  User's  Guide,  lamboot(1),  lamhalt(1),  mpirun(1),  MPI_Comm_spawn(1), MPI_Comm_spawn_multiple(1),
       recon(1), lamwipe(1)