Provided by: gromacs-data_2020.1-1_all

NAME

       gmx-tune_pme - Time mdrun as a function of PME ranks to optimize settings

SYNOPSIS

          gmx tune_pme [-s [<.tpr>]] [-cpi [<.cpt>]] [-table [<.xvg>]]
                       [-tablep [<.xvg>]] [-tableb [<.xvg>]]
                       [-rerun [<.xtc/.trr/...>]] [-ei [<.edi>]] [-p [<.out>]]
                       [-err [<.log>]] [-so [<.tpr>]] [-o [<.trr/.cpt/...>]]
                       [-x [<.xtc/.tng>]] [-cpo [<.cpt>]]
                       [-c [<.gro/.g96/...>]] [-e [<.edr>]] [-g [<.log>]]
                       [-dhdl [<.xvg>]] [-field [<.xvg>]] [-tpi [<.xvg>]]
                       [-tpid [<.xvg>]] [-eo [<.xvg>]] [-px [<.xvg>]]
                       [-pf [<.xvg>]] [-ro [<.xvg>]] [-ra [<.log>]]
                       [-rs [<.log>]] [-rt [<.log>]] [-mtx [<.mtx>]]
                       [-swap [<.xvg>]] [-bo [<.trr/.cpt/...>]] [-bx [<.xtc>]]
                       [-bcpo [<.cpt>]] [-bc [<.gro/.g96/...>]] [-be [<.edr>]]
                       [-bg [<.log>]] [-beo [<.xvg>]] [-bdhdl [<.xvg>]]
                       [-bfield [<.xvg>]] [-btpi [<.xvg>]] [-btpid [<.xvg>]]
                       [-bdevout [<.xvg>]] [-brunav [<.xvg>]] [-bpx [<.xvg>]]
                       [-bpf [<.xvg>]] [-bro [<.xvg>]] [-bra [<.log>]]
                       [-brs [<.log>]] [-brt [<.log>]] [-bmtx [<.mtx>]]
                       [-bdn [<.ndx>]] [-bswap [<.xvg>]] [-xvg <enum>]
                       [-mdrun <string>] [-np <int>] [-npstring <enum>]
                       [-ntmpi <int>] [-r <int>] [-max <real>] [-min <real>]
                       [-npme <enum>] [-fix <int>] [-rmax <real>]
                       [-rmin <real>] [-[no]scalevdw] [-ntpr <int>]
                       [-steps <int>] [-resetstep <int>] [-nsteps <int>]
                       [-[no]launch] [-[no]bench] [-[no]check]
                       [-gpu_id <string>] [-[no]append] [-[no]cpnum]
                       [-deffnm <string>]

DESCRIPTION

       For  a  given  number  -np  or  -ntmpi of ranks, gmx tune_pme systematically times gmx mdrun with various
       numbers of PME-only ranks and determines which setting is fastest. It will also test whether  performance
       can  be  enhanced  by  shifting load from the reciprocal to the real space part of the Ewald sum.  Simply
       pass your .tpr file to gmx tune_pme together with other options for gmx mdrun as needed.

       gmx tune_pme needs to call gmx mdrun and therefore requires that you specify how to call mdrun with the
       argument to the -mdrun parameter. Depending on how you have built GROMACS, values such as 'gmx mdrun',
       'gmx_d mdrun', or 'mdrun_mpi' might be needed.
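
       For example, assuming an MPI-enabled build whose binary is named 'mdrun_mpi' (the rank count and the file
       name topol.tpr below are placeholders for your own setup), the call could look like:

       gmx tune_pme -np 16 -s topol.tpr -mdrun 'mdrun_mpi'   # illustrative rank count and input file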

       The program that runs MPI programs can be set in the environment variable MPIRUN (defaults to  'mpirun').
       Note that for certain MPI frameworks, you need to provide a machine- or hostfile. This can also be passed
       via the MPIRUN variable, e.g.

       export MPIRUN="/usr/local/mpirun -machinefile hosts"

       Note that in such cases it is normally necessary to compile and/or run gmx tune_pme without MPI support,
       so that it can call the MPIRUN program.

       Before  doing  the  actual  benchmark runs, gmx tune_pme will do a quick check whether gmx mdrun works as
       expected with the provided parallel settings if the -check option is  activated  (the  default).   Please
       call gmx tune_pme with the normal options you would pass to gmx mdrun and add -np for the number of ranks
       to perform the tests on, or -ntmpi for the number of threads. You can also add -r  to  repeat  each  test
       several times to get better statistics.
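
       As a sketch, a benchmark over 32 MPI ranks with three repeats per test, or alternatively over 8 thread-MPI
       ranks, might look like this (all counts and the input file name are illustrative only):

       gmx tune_pme -np 32 -r 3 -s topol.tpr -mdrun 'mdrun_mpi'     # MPI build, 3 repeats per test
       gmx tune_pme -ntmpi 8 -r 3 -s topol.tpr -mdrun 'gmx mdrun'   # thread-MPI build, no mpirun involved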

       gmx tune_pme can test various real space / reciprocal space workloads for you. With -ntpr you control how
       many extra .tpr files will be written with enlarged  cutoffs  and  smaller  Fourier  grids  respectively.
       Typically,  the  first  test (number 0) will be with the settings from the input .tpr file; the last test
       (number ntpr) will have the Coulomb cutoff specified by -rmax with a somewhat smaller  PME  grid  at  the
       same time. In this last test, the Fourier spacing is multiplied by rmax/rcoulomb. The remaining .tpr
       files will have equally-spaced Coulomb radii (and Fourier spacings) between these extremes. Note that you
       can  set  -ntpr  to 1 if you just seek the optimal number of PME-only ranks; in that case your input .tpr
       file will remain unchanged.
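
       For instance, to also scan the real-space/reciprocal-space balance, one might request four benchmark .tpr
       files spanning an illustrative Coulomb-radius range (sensible -rmin/-rmax values depend on your system):

       gmx tune_pme -np 64 -ntpr 4 -rmin 0.9 -rmax 1.1 -s topol.tpr -mdrun 'mdrun_mpi'   # radii in nm, illustrative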

       For the benchmark runs, the default of 1000 time steps should suffice for most MD systems. The dynamic
       load balancing needs about 100 time steps to adapt to local load imbalances; therefore the time step
       counters are only reset after a number of equilibration steps (1500 by default, see -resetstep). For large
       systems (>1M atoms), as well as for a higher accuracy of the measurements, you should set -resetstep to a
       higher value. From the 'DD' load imbalance entries in the md.log output file you can tell after how many
       steps the load is sufficiently balanced. Example call:

       gmx tune_pme -np 64 -s protein.tpr -launch

       After calling gmx mdrun several times, detailed performance information is available in the output file
       perf.out. Note that during the benchmarks a couple of temporary files are written (options -b*); these
       will be deleted automatically after each test.

       If  you  want  the  simulation to be started automatically with the optimized parameters, use the command
       line option -launch.

       Basic support for GPU-enabled mdrun exists. Give a string containing the IDs of the GPUs that you wish to
       use in the optimization in the -gpu_id command-line argument. This works exactly like mdrun -gpu_id: it
       does not imply a mapping, but merely declares the eligible set of GPU devices. gmx tune_pme will construct
       calls to mdrun that use this set appropriately. gmx tune_pme does not support -gputasks.
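
       A possible call on a node with two GPUs (device IDs 0 and 1; adjust the ID string and rank count to your
       hardware) might be:

       gmx tune_pme -np 8 -gpu_id 01 -s topol.tpr -mdrun 'gmx mdrun'   # GPU IDs and rank count are illustrative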

OPTIONS

       Options to specify input files:

       -s [<.tpr>] (topol.tpr)
              Portable xdr run input file

       -cpi [<.cpt>] (state.cpt) (Optional)
              Checkpoint file

       -table [<.xvg>] (table.xvg) (Optional)
              xvgr/xmgr file

       -tablep [<.xvg>] (tablep.xvg) (Optional)
              xvgr/xmgr file

       -tableb [<.xvg>] (table.xvg) (Optional)
              xvgr/xmgr file

       -rerun [<.xtc/.trr/...>] (rerun.xtc) (Optional)
              Trajectory: xtc trr cpt gro g96 pdb tng

       -ei [<.edi>] (sam.edi) (Optional)
              ED sampling input

       Options to specify output files:

       -p [<.out>] (perf.out)
              Generic output file

       -err [<.log>] (bencherr.log)
              Log file

       -so [<.tpr>] (tuned.tpr)
              Portable xdr run input file

       -o [<.trr/.cpt/...>] (traj.trr)
              Full precision trajectory: trr cpt tng

       -x [<.xtc/.tng>] (traj_comp.xtc) (Optional)
              Compressed trajectory (tng format or portable xdr format)

       -cpo [<.cpt>] (state.cpt) (Optional)
              Checkpoint file

       -c [<.gro/.g96/...>] (confout.gro)
              Structure file: gro g96 pdb brk ent esp

       -e [<.edr>] (ener.edr)
              Energy file

       -g [<.log>] (md.log)
              Log file

       -dhdl [<.xvg>] (dhdl.xvg) (Optional)
              xvgr/xmgr file

       -field [<.xvg>] (field.xvg) (Optional)
              xvgr/xmgr file

       -tpi [<.xvg>] (tpi.xvg) (Optional)
              xvgr/xmgr file

       -tpid [<.xvg>] (tpidist.xvg) (Optional)
              xvgr/xmgr file

       -eo [<.xvg>] (edsam.xvg) (Optional)
              xvgr/xmgr file

       -px [<.xvg>] (pullx.xvg) (Optional)
              xvgr/xmgr file

       -pf [<.xvg>] (pullf.xvg) (Optional)
              xvgr/xmgr file

       -ro [<.xvg>] (rotation.xvg) (Optional)
              xvgr/xmgr file

       -ra [<.log>] (rotangles.log) (Optional)
              Log file

       -rs [<.log>] (rotslabs.log) (Optional)
              Log file

       -rt [<.log>] (rottorque.log) (Optional)
              Log file

       -mtx [<.mtx>] (nm.mtx) (Optional)
              Hessian matrix

       -swap [<.xvg>] (swapions.xvg) (Optional)
              xvgr/xmgr file

       -bo [<.trr/.cpt/...>] (bench.trr)
              Full precision trajectory: trr cpt tng

       -bx [<.xtc>] (bench.xtc)
              Compressed trajectory (portable xdr format): xtc

       -bcpo [<.cpt>] (bench.cpt)
              Checkpoint file

       -bc [<.gro/.g96/...>] (bench.gro)
              Structure file: gro g96 pdb brk ent esp

       -be [<.edr>] (bench.edr)
              Energy file

       -bg [<.log>] (bench.log)
              Log file

       -beo [<.xvg>] (benchedo.xvg) (Optional)
              xvgr/xmgr file

       -bdhdl [<.xvg>] (benchdhdl.xvg) (Optional)
              xvgr/xmgr file

       -bfield [<.xvg>] (benchfld.xvg) (Optional)
              xvgr/xmgr file

       -btpi [<.xvg>] (benchtpi.xvg) (Optional)
              xvgr/xmgr file

       -btpid [<.xvg>] (benchtpid.xvg) (Optional)
              xvgr/xmgr file

       -bdevout [<.xvg>] (benchdev.xvg) (Optional)
              xvgr/xmgr file

       -brunav [<.xvg>] (benchrnav.xvg) (Optional)
              xvgr/xmgr file

       -bpx [<.xvg>] (benchpx.xvg) (Optional)
              xvgr/xmgr file

       -bpf [<.xvg>] (benchpf.xvg) (Optional)
              xvgr/xmgr file

       -bro [<.xvg>] (benchrot.xvg) (Optional)
              xvgr/xmgr file

       -bra [<.log>] (benchrota.log) (Optional)
              Log file

       -brs [<.log>] (benchrots.log) (Optional)
              Log file

       -brt [<.log>] (benchrott.log) (Optional)
              Log file

       -bmtx [<.mtx>] (benchn.mtx) (Optional)
              Hessian matrix

       -bdn [<.ndx>] (bench.ndx) (Optional)
              Index file

       -bswap [<.xvg>] (benchswp.xvg) (Optional)
              xvgr/xmgr file

       Other options:

       -xvg <enum> (xmgrace)
              xvg plot formatting: xmgrace, xmgr, none

       -mdrun <string>
              Command line to run a simulation, e.g. 'gmx mdrun' or 'mdrun_mpi'

       -np <int> (1)
              Number of ranks to run the tests on (must be > 2 for separate PME ranks)

       -npstring <enum> (np)
              Name  of the $MPIRUN option that specifies the number of ranks to use ('np', or 'n'; use 'none' if
              there is no such option): np, n, none

       -ntmpi <int> (1)
              Number of MPI-threads to run the tests on (turns MPI & mpirun off)

       -r <int> (2)
              Repeat each test this often

       -max <real> (0.5)
              Max fraction of PME ranks to test with

       -min <real> (0.25)
              Min fraction of PME ranks to test with

       -npme <enum> (auto)
              Within -min and -max, benchmark all possible values for -npme, or just a reasonable subset. Auto
              ignores -min and -max and chooses reasonable values around a guess for npme derived from the .tpr
              file: auto, all, subset

       -fix <int> (-2)
              If >= -1, do not vary the number of PME-only ranks; instead use this fixed value and only vary
              rcoulomb and the PME grid spacing.

       -rmax <real> (0)
              If >0, maximal rcoulomb for -ntpr>1 (rcoulomb upscaling results in fourier grid downscaling)

       -rmin <real> (0)
              If >0, minimal rcoulomb for -ntpr>1

       -[no]scalevdw (yes)
              Scale rvdw along with rcoulomb

       -ntpr <int> (0)
              Number  of .tpr files to benchmark. Create this many files with different rcoulomb scaling factors
              depending on -rmin and -rmax. If < 1, automatically choose the number of .tpr files to test

       -steps <int> (1000)
              Take timings for this many steps in the benchmark runs

       -resetstep <int> (1500)
              Let dlb equilibrate this many steps before timings are taken (reset cycle counters after this many
              steps)

       -nsteps <int> (-1)
              If non-negative, perform this many steps in the real run (this overrides nsteps from the .tpr;
              steps already done according to the .cpt file are added)

       -[no]launch (no)
              Launch the real simulation after optimization

       -[no]bench (yes)
              Run the benchmarks or just create the input .tpr files?

       -[no]check (yes)
              Before the benchmark runs, check whether mdrun works in parallel

       -gpu_id <string>
              List of unique GPU device IDs that are eligible for use

       -[no]append (yes)
              Append to previous output files when continuing from checkpoint instead of adding  the  simulation
              part number to all file names (for launch only)

       -[no]cpnum (no)
              Keep and number checkpoint files (launch only)

       -deffnm <string>
              Set the default filenames (launch only)

SEE ALSO

       gmx(1)

       More information about GROMACS is available at <http://www.gromacs.org/>.

COPYRIGHT

       2020, GROMACS development team