Provided by: gromacs-data_4.6.5-1build1_all

NAME

       g_tune_pme - time mdrun as a function of PME nodes to optimize settings

       VERSION 4.6.5

SYNOPSIS

       g_tune_pme  -p  perf.out  -err  bencherr.log  -so  tuned.tpr  -s  topol.tpr -o traj.trr -x
       traj.xtc -cpi state.cpt -cpo state.cpt -c confout.gro -e ener.edr -g md.log -dhdl dhdl.xvg
       -field   field.xvg  -table  table.xvg  -tabletf  tabletf.xvg  -tablep  tablep.xvg  -tableb
       table.xvg -rerun rerun.xtc -tpi tpi.xvg -tpid tpidist.xvg -ei  sam.edi  -eo  edsam.xvg  -j
       wham.gct  -jo bam.gct -ffout gct.xvg -devout deviatie.xvg -runav runaver.xvg -px pullx.xvg
       -pf pullf.xvg -ro rotation.xvg -ra rotangles.log -rs rotslabs.log -rt  rottorque.log  -mtx
       nm.mtx  -dn  dipole.ndx  -bo  bench.trr  -bx  bench.xtc  -bcpo bench.cpt -bc bench.gro -be
       bench.edr -bg bench.log -beo benchedo.xvg -bdhdl benchdhdl.xvg -bfield benchfld.xvg  -btpi
       benchtpi.xvg   -btpid   benchtpid.xvg   -bjo   bench.gct   -bffout  benchgct.xvg  -bdevout
       benchdev.xvg -brunav benchrnav.xvg -bpx benchpx.xvg  -bpf  benchpf.xvg  -bro  benchrot.xvg
       -bra  benchrota.log  -brs benchrots.log -brt benchrott.log -bmtx benchn.mtx -bdn bench.ndx
       -[no]h -[no]version -nice int -xvg enum -np int -npstring enum -ntmpi int -r int -max real
       -min  real  -npme  enum -fix int -rmax real -rmin real -[no]scalevdw -ntpr int -steps step
       -resetstep int -simsteps step -[no]launch -[no]bench -[no]append -[no]cpnum

DESCRIPTION

       For a given number  -np or  -ntmpi  of  processors/threads,  this  program  systematically
       times   mdrun  with  various  numbers  of  PME-only  nodes and determines which setting is
       fastest. It will also test whether performance can be enhanced by shifting load  from  the
       reciprocal  to  the  real  space  part  of  the Ewald sum.  Simply pass your  .tpr file to
       g_tune_pme together with other options for  mdrun as needed.

       Which executables are used can be set in the environment variables MPIRUN  and  MDRUN.  If
       these  are  not  present,  'mpirun'  and  'mdrun'  will be used as defaults. Note that for
       certain MPI frameworks you need to provide a machine- or hostfile. This can also be passed
       via the MPIRUN variable, e.g.

        export MPIRUN="/usr/local/mpirun -machinefile hosts"
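
       The fallback described above can be sketched in shell as follows. The helper name
       resolve_launchers is hypothetical and not part of GROMACS; the sketch also treats an
       empty variable like an unset one, which is an assumption.

```shell
# Hypothetical helper (not part of GROMACS): print the launcher and the
# mdrun executable that would be used. Falls back to the documented
# defaults 'mpirun' and 'mdrun' when the variables are unset or empty.
resolve_launchers() {
    printf '%s\n' "${MPIRUN:-mpirun}" "${MDRUN:-mdrun}"
}

resolve_launchers
```

       Because the whole value of MPIRUN is used, extra launcher arguments such as a
       machinefile travel along with the executable name.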

       Please  call  g_tune_pme with the normal options you would pass to  mdrun and add  -np for
       the number of processors to perform the tests on, or  -ntmpi for the  number  of  threads.
       You can also add  -r to repeat each test several times to get better statistics.
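
       A minimal sketch of assembling such a call is given below. The helper tune_cmd is
       hypothetical and only prints the command line so that it can be inspected before it is
       run; the "mpi" keyword for choosing between -np and -ntmpi is an assumption of the sketch.

```shell
# tune_cmd MODE COUNT REPEATS [mdrun options...]
# Hypothetical helper: prints a g_tune_pme command line, does not run it.
tune_cmd() {
    mode=$1 count=$2 repeats=$3
    shift 3
    if [ "$mode" = mpi ]; then
        flag=-np        # MPI processes, started through $MPIRUN
    else
        flag=-ntmpi     # thread-MPI threads, no mpirun involved
    fi
    # Remaining arguments are passed through as ordinary mdrun options.
    echo "g_tune_pme $flag $count -r $repeats $*"
}

tune_cmd mpi 64 3 -s protein.tpr
```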

       g_tune_pme can test various real space / reciprocal space workloads for you. With -ntpr
       you control how many extra .tpr files will be written with enlarged cutoffs and smaller
       Fourier grids respectively. Typically, the first test (number 0) will be with the
       settings from the input .tpr file; the last test (number ntpr) will have the Coulomb
       cutoff specified by -rmax with a somewhat smaller PME grid at the same time. In this last
       test, the Fourier spacing is multiplied by rmax/rcoulomb. The remaining .tpr files
       will have equally-spaced Coulomb radii (and Fourier spacings) between these extremes.
       Note that you can set -ntpr to 1 if you just seek the optimal number of PME-only nodes;
       in that case your input .tpr file will remain unchanged.
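
       The scaling can be illustrated with a small calculation. This is a sketch under stated
       assumptions, not the program's actual code: the helper scan_cutoffs is hypothetical,
       tests are indexed from 0 to NTPR-1, and the radii are interpolated linearly between the
       input rcoulomb and rmax. The Fourier spacing is scaled by the same factor as the radius.

```shell
# scan_cutoffs RCOULOMB FOURIERSPACING RMAX NTPR
# Hypothetical helper: prints "index rcoulomb fourierspacing" per test.
scan_cutoffs() {
    awk -v rc="$1" -v fs="$2" -v rmax="$3" -v n="$4" 'BEGIN {
        for (i = 0; i < n; i++) {
            # Equally spaced radii from rc (test 0) up to rmax (last test).
            r = (n > 1) ? rc + (rmax - rc) * i / (n - 1) : rc
            # Spacing grows with the radius, so the PME grid gets coarser.
            printf "%d %.3f %.4f\n", i, r, fs * r / rc
        }
    }'
}

scan_cutoffs 1.0 0.12 1.2 3
```

       For an input with rcoulomb 1.0 and a Fourier spacing of 0.12, the last line of the
       output shows the spacing multiplied by rmax/rcoulomb = 1.2.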

       For the benchmark runs, the default of 1000 time steps should suffice for most MD systems.
       The dynamic load balancing needs about 100 time steps to adapt to local load imbalances,
       so the time step counters are by default reset after 100 steps. For large systems
       (>1M atoms), as well as for a higher accuracy of the measurements, you should set
       -resetstep to a higher value. From the 'DD' load imbalance entries in the md.log output
       file you can tell after how many steps the load is sufficiently balanced. Example call:

        g_tune_pme -np 64 -s protein.tpr -launch

       After calling mdrun several times, detailed performance information is available in the
       output file perf.out. Note that during the benchmarks, a couple of temporary files are
       written (options -b*); these are deleted automatically after each test.

       If you want the simulation to be started automatically with the optimized parameters,  use
       the command line option  -launch.

FILES

       -p perf.out Output
        Generic output file

       -err bencherr.log Output
        Log file

       -so tuned.tpr Output
        Run input file: tpr tpb tpa

       -s topol.tpr Input
        Run input file: tpr tpb tpa

       -o traj.trr Output
        Full precision trajectory: trr trj cpt

       -x traj.xtc Output, Opt.
        Compressed trajectory (portable xdr format)

       -cpi state.cpt Input, Opt.
        Checkpoint file

       -cpo state.cpt Output, Opt.
        Checkpoint file

       -c confout.gro Output
        Structure file: gro g96 pdb etc.

       -e ener.edr Output
        Energy file

       -g md.log Output
        Log file

       -dhdl dhdl.xvg Output, Opt.
        xvgr/xmgr file

       -field field.xvg Output, Opt.
        xvgr/xmgr file

       -table table.xvg Input, Opt.
        xvgr/xmgr file

       -tabletf tabletf.xvg Input, Opt.
        xvgr/xmgr file

       -tablep tablep.xvg Input, Opt.
        xvgr/xmgr file

       -tableb table.xvg Input, Opt.
        xvgr/xmgr file

       -rerun rerun.xtc Input, Opt.
        Trajectory: xtc trr trj gro g96 pdb cpt

       -tpi tpi.xvg Output, Opt.
        xvgr/xmgr file

       -tpid tpidist.xvg Output, Opt.
        xvgr/xmgr file

       -ei sam.edi Input, Opt.
        ED sampling input

       -eo edsam.xvg Output, Opt.
        xvgr/xmgr file

       -j wham.gct Input, Opt.
        General coupling stuff

       -jo bam.gct Output, Opt.
        General coupling stuff

       -ffout gct.xvg Output, Opt.
        xvgr/xmgr file

       -devout deviatie.xvg Output, Opt.
        xvgr/xmgr file

       -runav runaver.xvg Output, Opt.
        xvgr/xmgr file

       -px pullx.xvg Output, Opt.
        xvgr/xmgr file

       -pf pullf.xvg Output, Opt.
        xvgr/xmgr file

       -ro rotation.xvg Output, Opt.
        xvgr/xmgr file

       -ra rotangles.log Output, Opt.
        Log file

       -rs rotslabs.log Output, Opt.
        Log file

       -rt rottorque.log Output, Opt.
        Log file

       -mtx nm.mtx Output, Opt.
        Hessian matrix

       -dn dipole.ndx Output, Opt.
        Index file

       -bo bench.trr Output
        Full precision trajectory: trr trj cpt

       -bx bench.xtc Output
        Compressed trajectory (portable xdr format)

       -bcpo bench.cpt Output
        Checkpoint file

       -bc bench.gro Output
        Structure file: gro g96 pdb etc.

       -be bench.edr Output
        Energy file

       -bg bench.log Output
        Log file

       -beo benchedo.xvg Output, Opt.
        xvgr/xmgr file

       -bdhdl benchdhdl.xvg Output, Opt.
        xvgr/xmgr file

       -bfield benchfld.xvg Output, Opt.
        xvgr/xmgr file

       -btpi benchtpi.xvg Output, Opt.
        xvgr/xmgr file

       -btpid benchtpid.xvg Output, Opt.
        xvgr/xmgr file

       -bjo bench.gct Output, Opt.
        General coupling stuff

       -bffout benchgct.xvg Output, Opt.
        xvgr/xmgr file

       -bdevout benchdev.xvg Output, Opt.
        xvgr/xmgr file

       -brunav benchrnav.xvg Output, Opt.
        xvgr/xmgr file

       -bpx benchpx.xvg Output, Opt.
        xvgr/xmgr file

       -bpf benchpf.xvg Output, Opt.
        xvgr/xmgr file

       -bro benchrot.xvg Output, Opt.
        xvgr/xmgr file

       -bra benchrota.log Output, Opt.
        Log file

       -brs benchrots.log Output, Opt.
        Log file

       -brt benchrott.log Output, Opt.
        Log file

       -bmtx benchn.mtx Output, Opt.
        Hessian matrix

       -bdn bench.ndx Output, Opt.
        Index file

OTHER OPTIONS

       -[no]h bool no
        Print help info and quit

       -[no]version bool no
        Print version info and quit

       -nice int 0
        Set the nicelevel

       -xvg enum xmgrace
        xvg plot formatting:  xmgrace,  xmgr or  none

       -np int 1
        Number of nodes to run the tests on (must be > 2 for separate PME nodes)

       -npstring enum -np
        Specify the number of processors to  $MPIRUN using this string:  -np,  -n or  none

       -ntmpi int 1
        Number of MPI-threads to run the tests on (turns MPI & mpirun off)

       -r int 2
        Repeat each test this often

       -max real 0.5
        Max fraction of PME nodes to test with

       -min real 0.25
        Min fraction of PME nodes to test with

       -npme enum auto
        Within  -min  and  -max,  benchmark  all possible values for  -npme, or just a reasonable
       subset. Auto neglects -min and -max and chooses reasonable values around a guess for  npme
       derived from the .tpr:  auto,  all or  subset

       -fix int -2
        If >= -1, do not vary the number of PME-only nodes, instead use this fixed value and only
       vary rcoulomb and the PME grid spacing.

       -rmax real 0
        If > 0, maximal rcoulomb for -ntpr > 1 (rcoulomb upscaling results in Fourier grid
       downscaling)

       -rmin real 0
        If > 0, minimal rcoulomb for -ntpr > 1

       -[no]scalevdw bool yes
        Scale rvdw along with rcoulomb

       -ntpr int 0
        Number of .tpr files to benchmark. Create this many files with different rcoulomb
       scaling factors depending on -rmin and -rmax. If < 1, automatically choose the number of
       .tpr files to test

       -steps step 1000
        Take timings for this many steps in the benchmark runs

       -resetstep int 100
        Let  dlb equilibrate this many steps before timings are taken (reset cycle counters after
       this many steps)

       -simsteps step -1
        If non-negative, perform this many steps in the real run (overwrites nsteps  from   .tpr,
       add  .cpt steps)

       -[no]launch bool no
        Launch the real simulation after optimization

       -[no]bench bool yes
        Run the benchmarks or just create the input  .tpr files?

       -[no]append bool yes
        Append  to  previous  output  files when continuing from checkpoint instead of adding the
       simulation part number to all file names (for launch only)

       -[no]cpnum bool no
        Keep and number checkpoint files (launch only)
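
       As an illustration of -min and -max, the following sketch computes the range of
       PME-only node counts that the two fractions imply for a given -np. The helper pme_range
       is hypothetical, and rounding down to whole nodes is an assumption; the program may
       choose its candidates differently, and with -npme auto the two fractions are ignored.

```shell
# pme_range NP MINFRAC MAXFRAC
# Hypothetical helper: prints the smallest and largest PME-only node
# count implied by the fractions (rounded down, an assumption).
pme_range() {
    awk -v np="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        printf "%d %d\n", int(lo * np), int(hi * np)
    }'
}

pme_range 64 0.25 0.5
```

       With the default fractions 0.25 and 0.5 and 64 nodes, this gives a range of 16 to 32
       PME-only nodes.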

SEE ALSO

       gromacs(7)

       More information about GROMACS is available at <http://www.gromacs.org/>.

                                          Mon 2 Dec 2013                            g_tune_pme(1)