Provided by: dieharder_3.31.1-7_i386


       dieharder - A testing and benchmarking tool for random number
       generators


       dieharder [-a] [-d dieharder test number] [-f filename] [-B]
                 [-D output flag [-D output flag] ... ] [-F] [-c separator]
                 [-g generator number or -1] [-h] [-k ks_flag] [-l]
                 [-L overlap] [-m multiply_p] [-n ntuple]
                 [-p number of p samples] [-P Xoff]
                 [-o filename] [-s seed strategy] [-S random number seed]
                 [-t number of test samples] [-v verbose flag]
                 [-W weak] [-X fail] [-Y Xtrategy]
                 [-x xvalue] [-y yvalue] [-z zvalue]

dieharder OPTIONS

       -a runs all the tests with standard/default options to create a
              user-controllable report.  To  control  the  formatting  of  the
              report,  see  -D below.  To control the power of the test (which
              uses default values for tsamples that cannot generally be varied
              and psamples which generally can) see -m below as a "multiplier"
              of the default number of psamples (used only in a -a run).

       -d test number -  selects specific diehard test.

       -f filename - generators 201 or 202 permit either raw binary or
              formatted ASCII numbers to be read in from a file  for  testing.
              Generator 200 reads in raw binary numbers from stdin.  Note
              well: many tests with default parameters require a lot of rands!
              To  see  a  sample  of the (required) header for ASCII formatted
              input, run

                       dieharder -o -f example.input -t 10

              and then examine the  contents  of  example.input.   Raw  binary
              input  reads  32  bit  increments  of the specified data stream.
              stdin_input_raw accepts a pipe from a raw binary stream.

       -B binary mode (used with -o below) causes output rands to be written
              in raw binary, not formatted ascii.

       -D output flag - permits fields to be selected for inclusion in
              dieharder  output.   Each flag can be entered as a binary number
              that turns on a specific output field or header or by flag name;
              flags  are aggregated.  To see all currently known flags use the
              -F command.

       -F - lists all known flags by name and number.

       -c table separator - where separator is e.g. ',' (CSV) or ' '
              (whitespace).

       -g generator number - selects a specific generator for testing.  Using
              -g -1 causes all known generators to be printed out to the
              display.

       -h prints context-sensitive help -- usually Usage (this message) or a
              test synopsis if entered as e.g. dieharder -d 3 -h.

       -k ks_flag - ks_flag

              0 is fast but slightly sloppy for psamples > 4999 (default).

              1 is MUCH slower but more accurate for larger numbers of
              psamples.

              2 is slower still, but (we hope) accurate to machine precision
              for any number of psamples up to some as yet unknown numerical
              upper limit (it has been tested out to at least hundreds of
              thousands).

              3 is kuiper ks, fast, quite inaccurate for small samples,
              deprecated.

       -l list all known tests.

       -L overlap

              1 (use overlap, default)

              0 (don't use overlap)

              in  operm5  or  other  tests  that  support overlapping and non-
              overlapping sample modes.

       -m multiply_p - multiply default # of psamples in -a(ll) runs to crank
              up the resolution of failure.

       -n ntuple - set ntuple length for tests on short bit strings that
              permit the length to be varied (e.g. rgb bitdist).

       -o filename - output -t count random numbers from current generator to
              file.

       -p count - sets the number of p-value samples per test (default 100).

       -P Xoff - sets the number of psamples that will accumulate before
              deciding that a generator is "good" and really, truly passes
              even a -Y 2
              T2D run.  Currently the default is 100000; eventually it will be
              set from AES-derived  T2D  test  failure  thresholds  for  fully
              automated reliable operation, but for now it is more a "boredom"
              threshold set by how long one might reasonably want to  wait  on
              any given test run.

       -S seed - where seed is a uint.  Overrides the default random seed
              selection.  Ignored for file or stdin input.

       -s strategy - if strategy is the (default) 0, dieharder reseeds (or
              rewinds)  once at the beginning when the random number generator
              is selected and then never again.  If strategy is  nonzero,  the
              generator  is reseeded or rewound at the beginning of EACH TEST.
              If -S seed was specified, or a file is used,  this  means  every
              test  is  applied  to  the  same  sequence  (which is useful for
              validation and testing of dieharder, but not a good way to  test
              rngs).  Otherwise a new random seed is selected for each test.

       -t count - sets the number of random entities used in each test, where
              possible.   Be  warned  --  some  tests have fixed sample sizes;
              others are variable but have practical  minimum  sizes.   It  is
              suggested  you  begin  with the values used in -a and experiment
              carefully on a test by test basis.

       -W weak - sets the "weak" threshold to make the test(s) more or less
              forgiving during e.g. a  test-to-destruction  run.   Default  is
              currently 0.005.

       -X fail - sets the "fail" threshold to make the test(s) more or less
              forgiving  during  e.g.  a  test-to-destruction run.  Default is
              currently 0.000001, which is basically "certain failure  of  the
              null hypothesis", the desired mode of reproducible generator
              failure.

       -Y Xtrategy - the Xtrategy flag controls  the  new  "test  to  failure"
              modes.  These flags and their modes act as follows:

                0  -  just run dieharder with the specified number of tsamples
              and psamples, do not dynamically modify a run based on  results.
              This is the way it has always run, and is the default.

                1  - "resolve ambiguity" (RA) mode.  If a test returns "weak",
              this is an undesired result.  What does that  mean,  after  all?
              If  you  run  a  long  test series, you will see occasional weak
              returns for a perfect generator because p is uniformly
              distributed  and will appear in any finite interval from time to
              time.  Even if a test run returns more than one weak result, you
              cannot  be  certain that the generator is failing.  RA mode adds
              psamples (usually in blocks of 100) until the test  result  ends
              up solidly not weak or proceeds to unambiguous failure.  This is
              morally equivalent to running the test several times to see if a
              weak result is reproducible, but eliminates the bias of personal
              judgement in the process since the default failure threshold  is
              very small and very unlikely to be reached by random chance even
              in many runs.

              This option should only be used with -k 2.

                2 - "test to destruction" mode.  Sometimes you  just  want  to
              know where or if a generator will ever fail a test (or test
              series).  -Y 2 causes psamples to be added 100 at a time until a
              test  returns an overall pvalue lower than the failure threshold
              or a specified maximum number of psamples (see -P) is reached.

              Note well!  In this mode one may well fail due to the  alternate
              null  hypothesis  --  the  test  itself is a bad test and fails!
              Many dieharder tests, despite our best efforts, are  numerically
              unstable  or  have only approximately known target statistics or
              are straight up asymptotic results, and will eventually return a
              failing result even for a gold-standard generator (such as AES),
              or for the hypercautious the XOR generator with AES,  threefish,
              kiss,  all  loaded  at once and xor'd together.  It is therefore
              safest to use this mode comparatively, executing a T2D run on
              AES to get an idea of the test failure threshold(s) (something I
              will eventually do and publish on the web so  everybody  doesn't
              have  to do it independently) and then running it on your target
              generator.  Failure with numbers of psamples within an order  of
              magnitude  of  the  AES thresholds should probably be considered
              possible test failures, not  generator  failures.   Failures  at
              levels significantly less than the known gold standard generator
              failure thresholds are, of course, probably failures of the
              generator.

              This option should only be used with -k 2.

       -v verbose flag -- controls the verbosity of the output for debugging
              only.   Probably of little use to non-developers, and developers
              can read the enum(s) in dieharder.h and the test sources to  see
              which flag values turn on output on which routines.  A value of
              1 results in a highly detailed trace of program activity.

       -x,-y,-z number - Some tests have parameters that can safely be varied
              from their default value.  For example, in the diehard birthdays
              test, one can vary the number of "dates" drawn from the "year"
              of some length, which can also be varied.  -x 2048 -y 30 alters
              these two values but should still run fine.  These parameters
              should be documented internally
              (where they exist) in the e.g. -d 0 -h visible notes.

              NOTE WELL: The assessment(s) for  the  rngs  may,  in  fact,  be
              completely incorrect or misleading.  There are still "bad tests"
              in dieharder, although we are working to fix  and  improve  them
              (and  try to document them in the test descriptions visible with
              -d testnumber -h).  In particular, 'Weak' pvalues should occur
              one  test  in two hundred, and 'Failed' pvalues should occur one
              test in a million with the default thresholds -  that's  what  p
              MEANS.  Use them at your Own Risk!  Be Warned!

              Or  better  yet,  use the new -Y 1 and -Y 2 resolve ambiguity or
              test to destruction modes above, comparing to  similar  runs  on
              one  of  the as-good-as-it-gets cryptographic generators, AES or



       Welcome to the current snapshot of the dieharder random number  tester.
       It encapsulates all of the GNU Scientific Library (GSL) random number
       generators (rngs) as  well  as  a  number  of  generators  from  the  R
       statistical  library,  hardware  sources  such  as  /dev/*random, "gold
       standard"  cryptographic  quality  generators   (useful   for   testing
       dieharder  and for purposes of comparison to new generators) as well as
       generators contributed by users or  found  in  the  literature  into  a
       single harness that can time them and subject them to various tests for
       randomness.  These tests are variously drawn  from  George  Marsaglia's
       "Diehard  battery  of  random  number tests", the NIST Statistical Test
       Suite, and again from other sources such as  personal  invention,  user
       contribution, other (open source) test suites, or the literature.

       The  primary  point  of  dieharder  is to make it easy to time and test
       (pseudo)random number generators, including both software and  hardware
       rngs,  with  a  fully  open  source  tool.   In  addition  to providing
       "instant" access to testing  of  all  built-in  generators,  users  can
       choose  one of three ways to test their own random number generators or
       sources:  a unix pipe of a raw binary (presumed  random)  bitstream;  a
       file  containing  a (presumed random) raw binary bitstream or formatted
       ascii uints or floats; and embedding your generator in dieharder's GSL-
       compatible   rng  harness  and  adding  it  to  the  list  of  built-in
       generators.  The stdin and file input methods are  described  below  in
       their  own  section,  as  is  suggested  "best practice" for newbies to
       random number generator testing.
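
       As a rough illustration of the pipe method, the sketch below packs
       the output of a generator into raw 32 bit words.  The toy_lcg
       generator, the helper names, and the little-endian byte order are
       all assumptions made for this example; substitute the generator
       you actually want to test.

```python
import struct


def toy_lcg(seed):
    """Hypothetical stand-in for the generator under test (a weak LCG)."""
    state = seed & 0xFFFFFFFF
    while True:
        state = (1664525 * state + 1013904223) & 0xFFFFFFFF
        yield state


def raw_words(gen, count):
    """Pack `count` values from `gen` as raw unsigned 32 bit integers."""
    values = [next(gen) for _ in range(count)]
    # "<I" is little-endian uint32; the byte order is an assumption here.
    return struct.pack("<%dI" % count, *values)


def write_stream(out, gen, blocks, words_per_block=4096):
    """Write `blocks` blocks of raw words to a binary file-like object."""
    for _ in range(blocks):
        out.write(raw_words(gen, words_per_block))
```

       A script that calls write_stream(sys.stdout.buffer, toy_lcg(1), n)
       could then be piped into dieharder -g 200 -a, since generator 200
       reads raw binary numbers from stdin.  Many tests need a lot of
       rands, so n may have to be large.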

       An important motivation for using dieharder is  that  the  entire  test
       suite is fully GNU General Public License (GPL) open source code and hence
       rather than being prohibited from "looking  underneath  the  hood"  all
       users  are  openly  encouraged to critically examine the dieharder code
       for errors, add new tests or generators or user interfaces, or  use  it
       freely  as is to test their own favorite candidate rngs subject only to
       the constraints of the GPL.  As a result  of  its  openness,  literally
       hundreds  of  improvements and bug fixes have been contributed by users
       to date, resulting in a far stronger and more reliable test suite  than
       would  have  been  possible with closed and locked down sources or even
       open sources (such as STS) that lack the dynamical  feedback  mechanism
       permitting corrections to be shared.

       Even  small  errors  in test statistics permit the alternative (usually
       unstated) null hypothesis to become an important factor in rng  testing
       -- the unwelcome possibility that your generator is just fine but it is
       the test that is failing.  One extremely useful feature of dieharder is
       that  it  is  at  least  moderately  self  validating.  Using the "gold
       standard" aes and threefish cryptographic generators, you  can  observe
       how  these  generators  perform  on  dieharder runs to the same general
       degree of accuracy that you wish to  use  on  the  generators  you  are
       testing.   In  general,  dieharder  tests that consistently fail at any
       given level of precision (selected with e.g. -a -m 10) on both  of  the
       gold  standard  rngs (and/or the better GSL generators, mt19937, gfsr4,
       taus) are probably unreliable at that precision and it would hardly  be
       surprising if they failed your generator as well.

       Experts  in  statistics are encouraged to give the suite a try, perhaps
       using any of the example calls below at first and then using it  freely
       on  their  own  generators  or as a harness for adding their own tests.
       Novices (to either statistics or random number generator  testing)  are
       strongly encouraged to read the next section on p-values and the null
       hypothesis and to run the test suite a few times with a more verbose
       output report to learn how the whole thing works.


       Examples  for  how  to  set  up  pipe  or  file  input are given below.
       However, it is recommended that a user play with some of the  built  in
       generators  to gain familiarity with dieharder reports and tests before
       tackling their own favorite generator or file full of possibly random
       numbers.

       To  see  dieharder's  default  standard  test  report  for  its default
       generator (mt19937) simply run:

          dieharder -a

       To increase the resolution of possible failures of the standard  -a(ll)
       test,  use  the -m "multiplier" for the test default numbers of pvalues
       (which are selected more to make a full test run take  an  hour  or  so
       instead of days than because it is truly an exhaustive test sequence):

          dieharder -a -m 10

       To test a different generator (say the gold  standard  AES_OFB)  simply
       specify the generator on the command line with a flag:

          dieharder -g 205 -a -m 10

       Arguments can be in any order.  The generator can also be selected by
       name:

          dieharder -g AES_OFB -a

       To apply only the diehard opso test to the AES_OFB  generator,  specify
       the test by name or number:

          dieharder -g 205 -d 5

       or

          dieharder -g 205 -d diehard_opso

       Nearly  every  aspect  or  field in dieharder's output report format is
       user-selectable by means of display option  flags.   In  addition,  the
       field  separator  character  can  be  selected  by the user to make the
       output particularly easy for them to parse (-c ' ') or  import  into  a
       spreadsheet (-c ',').  Try:

          dieharder -g 205 -d diehard_opso -c ',' -D test_name -D pvalues

       to see an extremely terse, easy to import report or

          dieharder -g 205 -d diehard_opso -c ' ' -D default -D histogram -D description

       to see a verbose report good for a  "beginner"  that  includes  a  full
       description of each test itself.

       Finally, the dieharder binary is remarkably autodocumenting even if the
       man page is not available. All users should try the following  commands
       to see what they do:

          dieharder -h

       (prints the command synopsis like the one above).

          dieharder -a -h
          dieharder -d 6 -h

       (prints the test descriptions only for -a(ll) tests or for the specific
       test indicated).

          dieharder -l

       (lists all known tests, including how reliable rgb thinks that they are
       as things stand).

          dieharder -g -1

       (lists all known rngs).

          dieharder -F

       (lists all the currently known display/output control flags used with
       -D).

       Both beginners and experts should be aware that the assessment provided
       by  dieharder  in  its  standard  report  should be regarded with great
       suspicion.  It is entirely possible for a generator to "pass" all tests
       as  far  as  their  individual  p-values  are concerned and yet to fail
       utterly when considering them all together.  Similarly, it is  probable
       that  a rng will at the very least show up as "weak" on 0, 1 or 2 tests
       in a typical -a(ll) run, and may even "fail" one test in roughly one
       such run in 10.  To understand why this is so, it is necessary to understand
       something of rng testing, p-values, and the null hypothesis!


       dieharder returns "p-values".  To understand what a p-value is and  how
       to use it, it is essential to understand the null hypothesis, H0.

       The  null  hypothesis  for  random  number  generator  testing is "This
       generator is a perfect random number generator, and for any  choice  of
       seed produces an infinitely long, unique sequence of numbers that have
       all the expected statistical  properties  of  random  numbers,  to  all
       orders".   Note  well  that we know that this hypothesis is technically
       false for all software generators as they are periodic and do not  have
       the  correct  entropy  content  for  this  statement  to  ever be true.
       However, many hardware generators  fail  a  priori  as  well,  as  they
       contain  subtle  bias  or correlations due to the deterministic physics
       that underlies them.  Nature is often unpredictable but  it  is  rarely
       random and the two words don't (quite) mean the same thing!

       The  null  hypothesis  can be practically true, however.  Both software
       and hardware generators can be "random"  enough  that  their  sequences
       cannot  be  distinguished from random ones, at least not easily or with
       the available tools (including dieharder!) Hence the null hypothesis is
       a practical, not a theoretically pure, statement.

       To test H0, one uses the rng in question to generate a sequence of
       presumably random numbers.  Using these numbers one  can  generate  any
       one  of a wide range of test statistics -- empirically computed numbers
       that are considered random samples that may or  may  not  be  covariant
       subject  to  H0,  depending  on whether overlapping sequences of random
       numbers are used to generate successive samples  while  generating  the
       statistic(s), drawn from a known distribution.  From a knowledge of the
       target distribution of the statistic(s) and the  associated  cumulative
       distribution  function  (CDF)  and  the empirical value of the randomly
       generated statistic(s), one can read off the probability  of  obtaining
       the  empirical result if the sequence was truly random, that is, if the
       null hypothesis is true and the  generator  in  question  is  a  "good"
       random  number  generator!   This  probability is the "p-value" for the
       particular test run.

       For example, to test a coin (or a sequence of  bits)  we  might  simply
       count the number of heads and tails in a very long string of flips.  If
       we assume that the coin is a "perfect coin", we expect  the  number  of
       heads and tails to be binomially distributed and can easily compute the
       probability of getting any particular number of heads and tails.  If we
       compare  our recorded number of heads and tails from the test series to
       this distribution and find that the probability of getting the count we
       obtained  is very low with, say, way more heads than tails we'd suspect
       the coin wasn't a perfect coin.  dieharder applies this very test (made
       mathematically  precise)  and  many  others  that  operate on this same
       principle to the string of random bits produced by the rng being tested
       to provide a picture of how "random" the rng is.
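
       The coin example can be made concrete with a few lines of Python.
       This is only a sketch of the idea, not dieharder's own code: the
       function name is invented here, and it computes an exact two-sided
       p-value, so that too many tails counts against the coin just as
       much as too many heads.

```python
from math import comb


def binomial_two_sided_p(heads, flips):
    """Probability, for a fair coin, of a head count at least as far
    from flips/2 as the observed one (in either direction)."""
    deviation = abs(heads - flips / 2)
    total = 0
    for k in range(flips + 1):
        if abs(k - flips / 2) >= deviation:
            total += comb(flips, k)  # number of sequences with k heads
    return total / 2 ** flips


# 60 heads in 100 flips is unusual but not damning.
p = binomial_two_sided_p(60, 100)
```

       A perfectly balanced count such as 5 heads in 10 flips gives p = 1,
       while 10 heads in 10 flips gives p = 2/1024.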

       Note  that  the  usual dogma is that if the p-value is low -- typically
       less than 0.05 -- one "rejects" the null hypothesis.  In a word, it  is
       improbable that one would get the result obtained if the generator is a
       good one.  If it  is  any  other  value,  one  does  not  "accept"  the
       generator  as good, one "fails to reject" the generator as bad for this
       particular test.  A "good random number generator" is hence one that we
       haven't been able to make fail yet!

       This  criterion  is, of course, naive in the extreme and cannot be used
       with dieharder!  It makes just as much sense to reject a generator that
       has p-values of 0.95 or more!  Both of these p-value ranges are equally
       unlikely on any given test run, and should be returned for (on average)
       5%  of all test runs by a perfect random number generator.  A generator
       that fails to produce p-values less than 0.05 5%  of  the  time  it  is
       tested  with different seeds is a bad random number generator, one that
       fails the test of the null hypothesis.  Since  dieharder  returns  over
       100  pvalues  by  default per test, one would expect any perfectly good
       rng to "fail" such a naive test around five times by this criterion  in
       a single dieharder run!
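
       The arithmetic behind this remark is short enough to check
       directly.  The helper names below are invented for illustration,
       and the calculation assumes the p-values are independent and
       uniformly distributed, as they would be for a perfect generator.

```python
def expected_low(n_pvalues, alpha=0.05):
    """Expected number of p-values below alpha among n_pvalues."""
    return n_pvalues * alpha


def prob_at_least_one_low(n_pvalues, alpha=0.05):
    """Chance that at least one of n_pvalues falls below alpha."""
    return 1.0 - (1.0 - alpha) ** n_pvalues


naive_failures = expected_low(100)           # about five per run
almost_certain = prob_at_least_one_low(100)  # greater than 0.99
```

       In other words, a perfect generator would almost never survive a
       100 p-value run under the naive 5% rule.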

       The  p-values  themselves,  as  it  turns out, are test statistics!  By
       their nature, p-values should be uniformly  distributed  on  the  range
       0-1.   In  100+  test  runs  with  independent seeds, one should not be
       surprised to obtain 0, 1, 2, or even  (rarely)  3  p-values  less  than
       0.01.   On  the other hand obtaining 7 p-values in the range 0.24-0.25,
       or seeing that 70 of the p-values are greater than 0.5 should make  the
       generator  highly  suspect!   How  can  a user determine when a test is
       producing "too many" of any particular value range for p?  Or too few?

       Dieharder does it for you, automatically.  One can in  fact  convert  a
       set  of  p-values into a p-value by comparing their distribution to the
       expected one, using a  Kolmogorov-Smirnov  test  against  the  expected
       uniform distribution of p.
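
       A minimal sketch of that conversion follows.  It is not
       dieharder's implementation: it uses the asymptotic Kolmogorov
       series with a common small-sample correction, which is an
       approximation chosen here for brevity, and the function name is
       invented for this example.

```python
import math
import random


def ks_uniform_pvalue(pvalues):
    """Approximate KS p-value for the hypothesis that the given
    p-values are uniformly distributed on [0, 1]."""
    xs = sorted(pvalues)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # Largest gap between the empirical CDF and the uniform CDF.
        d = max(d, (i + 1) / n - x, x - i / n)
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here; p is ~1 anyway
    total = 0.0
    for j in range(1, 101):
        term = 2.0 * (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return min(1.0, max(0.0, total))


rng = random.Random(12345)
uniform_ps = [rng.random() for _ in range(1000)]
skewed_ps = [rng.random() ** 2 for _ in range(1000)]  # too many low values
p_good = ks_uniform_pvalue(uniform_ps)
p_bad = ks_uniform_pvalue(skewed_ps)  # effectively zero
```

       The skewed sample stands in for a bad generator: its p-values
       pile up near zero, and the resulting KS p-value collapses.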

       These  p-values  obtained  from looking at the distribution of p-values
       should in turn be uniformly  distributed  and  could  in  principle  be
       subjected  to still more KS tests in aggregate.  The distribution of p-
       values for a good generator should be idempotent, even across different
       test statistics and multiple runs.

       A  failure  of the distribution of p-values at any level of aggregation
       signals trouble.  In fact, if  the  p-values  of  any  given  test  are
       subjected  to  a KS test, and those p-values are then subjected to a KS
       test, as we add more p-values to either level we  will  either  observe
       idempotence  of  the  resulting  distribution of p to uniformity, or we
       will observe idempotence to a single p-value of zero!  That is, a  good
       generator  will  produce a roughly uniform distribution of p-values, in
       the specific sense that the p-values of the distributions  of  p-values
       are  themselves  roughly  uniform  and  so on ad infinitum, while a bad
       generator will produce a non-uniform distribution of p-values,  and  as
       more  p-values drawn from the non-uniform distribution are added to its
       KS test, at some point the failure will be absolutely unmistakable as
       the resulting p-value approaches 0 in the limit.  Trouble indeed!

       The question is, trouble with what?  Random number tests are themselves
       complex computational objects, and there is a  probability  that  their
       code  is  incorrectly framed or that roundoff or other numerical -- not
       methodical  --  errors  are  contributing  to  a  distortion   of   the
       distribution  of  some  of  the p-values obtained.  This is not an idle
       observation; when one works on writing random number generator  testing
       programs,  one  is  always testing the tests themselves with "good" (we
       hope) random number generators so that egregious failures of  the  null
       hypothesis  signal  not  a bad generator but an error in the test code.
       The null hypothesis above is correctly framed from a theoretical  point
       of  view,  but  from a real and practical point of view it should read:
       "This generator is a perfect  random  number  generator,  and  for  any
       choice of seed produces an infinitely long, unique sequence of numbers
       that have all the expected statistical properties of random numbers, to
       all  orders  and  this  test  is  a  perfect test and returns precisely
       correct p-values from the test  computation."   Observed  "failure"  of
       this  joint null hypothesis H0' can come from failure of either or both
       of these disjoint components, and comes from the  second  as  often  or
       more  often  than  the first during the test development process.  When
       one cranks up the "resolution" of the test (discussed next) to where  a
       generator  starts  to  fail  some test one realizes, or should realize,
       that development never ends and  that  new  test  regimes  will  always
       reveal new failures not only of the generators but of the code.

       With  that  said, one of dieharder's most significant advantages is the
       control that it gives you over a critical  test  parameter.   From  the
       remarks  above, we can see that we should feel very uncomfortable about
       "failing" any given random number generator on the basis of  a  5%,  or
       even  a  1%,  criterion,  especially  when  we  apply a test suite like
       dieharder that returns over 100 (and climbing) distinct  test  p-values
       as of the last snapshot.  We want failure to be unambiguous and
       reproducible!

       To accomplish this, one can simply crank up its resolution.  If we  ran
       any  given  test against a random number generator and it returned a p-
       value of (say) 0.007328, we'd be perfectly justified in wondering if it
       is  really  a good generator.  However, the probability of getting this
       result isn't really all that small -- when one uses dieharder for hours
       at a time numbers like this will definitely happen quite frequently and
       mean nothing.  If one runs the same test again (with a  different  seed
       or  part  of the random sequence) and gets a p-value of 0.009122, and a
       third time and gets 0.002669 -- well, that's three 1% (or  less)  shots
       in  a  row and that should happen only one in a million times.  One way
       to clearly resolve failures, then, is to  increase  the  number  of  p-
       values  generated in a test run.  If the actual distribution of p being
       returned by the test is not uniform, a KS test will eventually return a
       p-value  that  is  not some ambiguous 0.035517 but is instead 0.000000,
       with the latter produced time after time as we rerun.
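
       The "one in a million" figure is just the product of three
       independent 1% events.  The snippet below checks it with the three
       p-values quoted above; the helper name is invented for this
       example, and independence of the runs is assumed.

```python
def prob_all_below(threshold, runs):
    """Chance that `runs` independent uniform p-values all fall at or
    below `threshold`."""
    return threshold ** runs


observed = [0.007328, 0.009122, 0.002669]
all_low = all(p <= 0.01 for p in observed)  # True for these values
chance = prob_all_below(0.01, len(observed))  # about 1e-6
```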

       For this reason, dieharder is extremely conservative about announcing
       rng "weakness" or "failure" relative to any given test.  Its internal
       criteria for these things are currently p < 0.5% or p > 99.5% weakness
       (at the 1% level total) and a considerably more stringent criterion for
       failure: p < 0.05% or p >  99.95%.   Note  well  that  the  ranges  are
       symmetric -- too high a value of p is just as bad (and unlikely) as too
       low, and it is critical to flag it, because it is quite possible for  a
       rng  to be too good, on average, and not to produce enough low p-values
       on the full spectrum of dieharder  tests.   This  is  where  the  final
       kstest is of paramount importance, and where the "histogram" option can
       be very useful to help you visualize the failure in the distribution of
       p -- run e.g.:

         dieharder [whatever] -D default -D histogram

       and you will see a crude ascii histogram of the pvalues that failed (or
       passed) any given level of test.
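
       The symmetric criterion can be sketched as follows.  This is an
       illustration only: the function name and the returned labels are
       invented here, and the default thresholds are simply the
       percentages quoted in the paragraph above, passed as parameters
       because -W and -X can change them.

```python
def assess(p, weak=0.005, fail=0.0005):
    """Classify a p-value using symmetric tails: values too close to
    either 0 or 1 are treated as equally suspicious."""
    tail = min(p, 1.0 - p)  # distance to the nearer end of [0, 1]
    if tail < fail:
        return "FAILED"
    if tail < weak:
        return "WEAK"
    return "PASSED"


labels = [assess(p) for p in (0.5, 0.003, 0.998, 0.9999, 0.0001)]
# labels is ["PASSED", "WEAK", "WEAK", "FAILED", "FAILED"]
```

       Note that 0.998 is flagged just as 0.003 is; a one-sided check of
       p against a small threshold would miss it.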

       Scattered reports of weakness or  marginal  failure  in  a  preliminary
       -a(ll)  run should therefore not be immediate cause for alarm.  Rather,
       they are tests to repeat, to watch out for, to push the rng  harder  on
       using  the -m option to -a or simply increasing -p for a specific test.
       Dieharder permits one to increase the number of p-values generated  for
       any  test,  subject  only  to the availability of enough random numbers
       (for file based tests) and time, to make failures unambiguous.  A  test
       that  is  truly  weak  at -p 100 will almost always fail egregiously at
       some larger value of psamples, be it -p 1000 or  -p  100000.   However,
       because dieharder is a research tool and is under perpetual development
       and testing, it is strongly suggested  that  one  always  consider  the
       alternative  null  hypothesis  --  that the failure is a failure of the
       test code in dieharder itself in some limit of  large  numbers  --  and
       take  at  least  some  steps (such as running the same test at the same
       resolution on a "gold standard" generator) to ensure that  the  failure
       is indeed probably in the rng and not the dieharder code.

       Lacking  a  source  of  perfect  random  numbers to use as a reference,
       validating the tests themselves is not easy and always leaves one  with
       some  ambiguity  (even  aes or threefish).  During development the best
       one can usually do is to rely heavily on these "presumed  good"  random
       number  generators.   There  are  a  number  of generators that we have
       theoretical reasons to expect to be extraordinarily good  and  to  lack
       correlations out to some known underlying dimensionality, and that also
       test out extremely well quite  consistently.   By  using  several  such
       generators  and  not  just one, one can hope that those generators have
       (at the very least) different correlations and should not all uniformly
       fail a test in the same way and with the same number of p-values.  When
       all of these generators consistently fail a test at a  given  level,  I
       tend  to  suspect  that  the  problem  is  in  the  test  code, not the
       generators, although it is very  difficult  to  be  certain,  and  many
       errors in dieharder's code have been discovered and ultimately fixed in
       just this way by myself or others.

       One advantage of dieharder is that it  has  a  number  of  these  "good
       generators"  immediately available for comparison runs, courtesy of the
       Gnu Scientific Library and user contribution (notably David Bauer,  who
       kindly encapsulated aes and threefish).  I use AES_OFB, Threefish_OFB,
       mt19937_1999, gfsr4, ranldx2 and taus2 (as well as "true random"
       numbers) for this purpose, and I try to ensure that dieharder will
       "pass" in particular the -g 205 -S 1 -s 1 generator at any reasonable
       p-value resolution out to -p 1000 or farther.

       Tests (such as the diehard operm5 and sums test) that consistently fail
       at these high resolutions are flagged as being  "suspect"  --  possible
       failures  of  the  alternative null hypothesis -- and they are strongly
       deprecated!  Their results should not be used  to  test  random  number
       generators  pending  agreement  in  the  statistics  and  random number
       community that those tests are  in  fact  valid  and  correct  so  that
       observed  failures  can indeed safely be attributed to a failure of the
       intended null hypothesis.

       As I keep emphasizing (for good reason!), dieharder is community
       supported.  I therefore openly ask the users of dieharder who are
       expert in statistics to help me fix the code or algorithms being
       implemented.  I would like to see this test suite ultimately be
       validated by the general statistics community in hard use  in  an  open
       environment,  where  every  possible  failure  of the testing mechanism
       itself is subject to scrutiny and eventual correction.  In this way  we
       will  eventually  achieve  a  very powerful suite of tools indeed, ones
       that may well give us very specific information not just about  failure
       but about the mode of failure as well: just how the sequence tested
       deviates from randomness.

       Thus far, dieharder has benefitted  tremendously  from  the  community.
       Individuals have openly contributed tests, new generators to be tested,
       and fixes for existing tests that were revealed by their own work  with
       the  testing  instrument.   Efforts are underway to make dieharder more
       portable so that it will build on more platforms  and  faster  so  that
       more thorough testing can be done.  Please feel free to participate.


       The  simplest  way  to  use  dieharder  with an external generator that
       produces raw binary (presumed random) bits is to pipe  the  raw  binary
       output  from  this  generator (presumed to be a binary stream of 32 bit
       unsigned integers) directly into dieharder, e.g.:

         cat /dev/urandom | ./dieharder -a -g 200

       Go ahead and try this example.  It will run the entire dieharder  suite
       of tests on the stream produced by the Linux built-in generator
       /dev/urandom (using /dev/random is not recommended as it is too slow to
       test in a reasonable amount of time).

       Alternatively,  dieharder can be used to test files of numbers produced
       by a candidate random number generator:

         dieharder -a -g 201 -f random.org_bin

       for raw binary input or

         dieharder -a -g 202 -f

       for formatted ascii input.

       A formatted ascii input file can accept either uints (integers in the
       range 0 to 2^32-1, one per line) or decimal uniform deviates with at
       least ten significant digits (that can be multiplied by UINT_MAX =
       2^32-1 to produce a uint without dropping precision), also one per line.
       Floats with fewer digits will almost  certainly  fail  bitlevel  tests,
       although they may pass some of the tests that act on uniform deviates.
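
       The digit requirement can be checked with a small round-trip
       experiment.  The Python sketch below writes a 32 bit value as a
       uniform deviate with a chosen number of decimal places, reads it
       back, and reports whether the original bits survive.  The helper
       names are hypothetical, decimal places are used as a simplification
       of "significant digits", and the rounding used here is an assumption
       rather than a description of how dieharder converts its input.

```python
SCALE = 2 ** 32  # number of distinct 32 bit unsigned values


def roundtrip(value, digits):
    """Format `value` as a deviate with `digits` decimal places, then convert back."""
    text = f"{value / SCALE:.{digits}f}"
    return round(float(text) * SCALE)


def highest_changed_bit(value, digits):
    """Return the 1-based position of the highest bit altered by the round trip (0 if none)."""
    return (value ^ roundtrip(value, digits)).bit_length()


if __name__ == "__main__":
    for digits in (6, 8, 10):
        changed = highest_changed_bit(123456789, digits)
        print(f"{digits} decimal places: highest changed bit = {changed}")
```

       With ten decimal places the rounding error, scaled by 2^32, stays
       below one half, so the integer is recovered exactly; with six it
       does not, which is why the low-order bits of short floats cannot be
       expected to look random.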

       Finally,  one  can  fairly  easily wrap any generator in the same (GSL)
       random number harness used internally by dieharder and simply  test  it
       the  same  way  one  would  any  other internal generator recognized by
       dieharder.  This is strongly recommended where it is possible,  because
       dieharder  needs  to  use  a lot of random numbers to thoroughly test a
       generator.  A built in generator can simply let dieharder determine how
       many it needs and generate them on demand, whereas a file that is too
       small will "rewind" and render the test results where a rewind occurs
       suspect.

       Note  well  that file input rands are delivered to the tests on demand,
       but if the test needs more than are available  it  simply  rewinds  the
       file  and  cycles  through  it  again,  and again, and again as needed.
       Obviously this significantly reduces the sample space and can  lead  to
       completely  incorrect  results  for the p-value histograms unless there
       are enough rands to run EACH test without repetition (it is harmless to
       reuse the sequence for different tests).  Let the user beware!
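
       A quick way to see whether a file is large enough is to compare the
       number of rands it holds with the number a test will consume.  The
       Python sketch below does that arithmetic.  The function names are
       hypothetical, and the number of rands a given test requires is not
       stated here, so it is left as an input the user must supply.

```python
def rands_in_binary_file(size_bytes):
    """Number of whole 32 bit values in a raw binary file of `size_bytes` bytes."""
    return size_bytes // 4


def rewinds_needed(rands_in_file, rands_required):
    """How many times the file is rewound before `rands_required` numbers are delivered."""
    if rands_in_file <= 0:
        raise ValueError("the file must contain at least one number")
    # Integer ceiling division: full or partial passes through the file.
    passes = -(-rands_required // rands_in_file)
    return max(passes - 1, 0)


if __name__ == "__main__":
    available = rands_in_binary_file(400_000)  # a 400000 byte file
    print(available, rewinds_needed(available, 1_000_000))
```

       Any result other than zero means the test would see repeated data.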


       A  frequently asked question from new users wishing to test a generator
       they are working on for fun or profit (or both) is "How  should  I  get
       its  output  into  dieharder?"   This  is  a  nontrivial  question,  as
       dieharder consumes enormous numbers of random numbers in  a  full  test
       cycle,  and  then  there are features like -m 10 or -m 100 that let one
       effortlessly demand 10 or 100 times as many to stress a  new  generator
       even more.

       Even  with  large file support in dieharder, it is difficult to provide
       enough random numbers in a file to really make dieharder happy.  It  is
       therefore strongly suggested that you either:

       a)  Edit the output stage of your random number generator and get it to
       write its production to stdout as a  random  bit  stream  --  basically
       create  32  bit  unsigned  random  integers  and write them directly to
       stdout as e.g. char data or raw binary.  Note that this is not the same
       as  writing  raw floating point numbers (that will not be random at all
       as a bitstream) and that "endianness" of the uints  should  not  matter
       for  the  null  hypothesis  of  a "good" generator, as random bytes are
       random in any order.  Crank the  generator  and  feed  this  stream  to
       dieharder in a pipe as described above.

       b) Use the samples of GSL-wrapped dieharder rngs to similarly wrap your
       generator (or calls to your generator's  hardware  interface).   Follow
       the  examples in the ./dieharder source directory to add it as a "user"
       generator in the  command  line  interface,  rebuild,  and  invoke  the
       generator  as  a  "native" dieharder generator (it should appear in the
       list produced by -g -1 when done correctly).  The advantage of doing it
       this  way  is  that  you  can  then  (if  your  new generator is highly
       successful) contribute it back to the dieharder project  if  you  wish!
       Not to mention the fact that it makes testing it very easy.

       Most  users  will probably go with option a) at least initially, but be
       aware that b)  is  probably  easier  than  you  think.   The  dieharder
       maintainers  may  be  able  to  give you a hand with it if you get into
       trouble, but no promises.
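
       Option a) can be sketched in a few lines of Python.  The example
       packs each output of a generator as a raw 32 bit unsigned integer and
       writes it to a binary stream.  The function names are hypothetical,
       and Python's random module is only a placeholder for the generator
       under test.

```python
import random
import struct


def pack_uint32s(values):
    """Pack integers in the range [0, 2**32) as raw little-endian 32 bit words."""
    values = list(values)
    return struct.pack(f"<{len(values)}I", *values)


def stream(out, next_uint32, words_per_chunk=4096, chunks=None):
    """Write raw 32 bit words to the binary stream `out`.

    `next_uint32` is any zero-argument callable returning the next output of
    the generator under test.  With chunks=None the loop runs until the
    reading process closes the pipe.  Returns the number of chunks written.
    """
    written = 0
    try:
        while chunks is None or written < chunks:
            out.write(pack_uint32s(next_uint32() for _ in range(words_per_chunk)))
            written += 1
    except BrokenPipeError:
        pass  # the reader exited, so stop quietly
    return written


def placeholder_generator(seed=1):
    """Return a callable producing 32 bit values; substitute the real generator here."""
    rng = random.Random(seed)
    return lambda: rng.getrandbits(32)
```

       Saved in a script (say a hypothetical mygen.py) that ends with
       stream(sys.stdout.buffer, placeholder_generator()) after an import
       of sys, the output can be piped to dieharder -g 200 -a exactly as
       with /dev/urandom above.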


       A warning for those who are testing files of random numbers.  dieharder
       is  a  tool  that  tests  random number generators, not files of random
       numbers!  It is extremely inappropriate to try to "certify" a  file  of
       random  numbers  as being random just because it fails to "fail" any of
       the dieharder tests in e.g. a dieharder -a run.  To put it bluntly,  if
       one rejects all such files that fail any test at the 0.05 level (or any
       other), the one thing one can be  certain  of  is  that  the  files  in
       question  are  not  random,  as  a truly random sequence would fail any
       given test at the 0.05 level 5% of the time!
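
       The arithmetic behind this warning is short enough to write down.
       The Python sketch below computes how often a perfect generator would
       be expected to "fail" at a given level.  It assumes the tests are
       independent, which is a simplification, and the function names and
       the count of 100 tests are chosen only for illustration.

```python
def expected_failures(n_tests, level=0.05):
    """Expected number of tests with p < level when the generator is perfect."""
    return n_tests * level


def chance_of_any_failure(n_tests, level=0.05):
    """Probability that at least one of n independent tests has p < level."""
    return 1.0 - (1.0 - level) ** n_tests


if __name__ == "__main__":
    n = 100  # illustrative number of independent tests
    print(expected_failures(n), chance_of_any_failure(n))
```

       Under these assumptions a perfect generator is all but certain to
       produce at least one result below 0.05 somewhere in 100 tests.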

       To put it another way, any file of numbers produced by a generator that
       "fails to fail" the dieharder suite should be considered "random", even
       if it contains sequences that might well "fail" any given test at  some
       specific cutoff.  One has to presume that, when the generator itself
       passed the broader tests, it was determined that the p-values for the
       test involved were globally correctly distributed, so that e.g. failure at
       the 0.01 level occurs neither more nor less than 1%  of  the  time,  on
       average,  over  many  many  tests.   If one particular file generates a
       failure at this level, one can therefore safely presume that  it  is  a
       random  file  pulled from many thousands of similar files the generator
       might create that have the correct  distribution  of  p-values  at  all
       levels of testing and aggregation.

       To  sum  up,  use  dieharder to validate your generator (via input from
       files or an embedded stream).  Then by all means use your generator  to
       produce files or streams of random numbers.  Do not use dieharder as an
       accept/reject tool to validate the files themselves!


       To demonstrate all tests, run on the default GSL rng, enter:

         dieharder -a

       To demonstrate a test of an external generator of a raw  binary  stream
       of bits, use the stdin (raw) interface:

         cat /dev/urandom | dieharder -g 200 -a

       To use it with an ascii formatted file:

         dieharder -g 202 -f testrands.txt -a

       testrands.txt should begin with a header such as the following, with
       the numbers themselves after it, one per line:

        # generator mt19937_1999  seed = 1274511046
        type: d
        count: 100000
        numbit: 32
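
       A file with a header of this shape could be produced with a sketch
       like the one below.  The function name is hypothetical and the
       header fields are copied from the example above; the exact spacing
       dieharder expects is not specified here, so compare the result with
       the sample file written by dieharder -o -f example.input -t 10
       before relying on it.

```python
def format_ascii_rands(values, generator_name, seed, numbit=32):
    """Return the text of an ascii input file: header lines, then one uint per line."""
    values = list(values)
    limit = 1 << numbit
    for value in values:
        if not 0 <= value < limit:
            raise ValueError(f"{value} does not fit in {numbit} bits")
    header = [
        f"# generator {generator_name}  seed = {seed}",
        "type: d",
        f"count: {len(values)}",
        f"numbit: {numbit}",
    ]
    body = [str(value) for value in values]
    return "\n".join(header + body) + "\n"


if __name__ == "__main__":
    # Three placeholder values, not real generator output.
    print(format_ascii_rands([1, 2, 3], "mt19937_1999", 1274511046), end="")
```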


       To use it with a binary file

         dieharder -g 201 -f testrands.bin -a

       or

         cat testrands.bin | dieharder -g 200 -a

       For an example that demonstrates the use of "prefixes" on the output
       lines, which make it relatively easy to filter off the different parts
       of the output report and chop them up into numbers that can be used in
       other programs or in spreadsheets, try:

         dieharder -a -c ',' -D default -D prefix


       As of version 3.x.x, dieharder  has  a  single  output  interface  that
       produces  tabular  data  per  test, with common information in headers.
       The display control options and flags can  be  used  to  customize  the
       output to your individual specific needs.

       The  options are controlled by binary flags.  The flags, and their text
       versions, are displayed if you enter:

         dieharder -F

       by itself on a line.

       The flags can be entered all at once  by  adding  up  all  the  desired
       option  flags.   For example, a very sparse output could be selected by
       adding the flags for the test_name (8) and the associated pvalues (128)
       to get 136:

         dieharder -a -D 136

       Since  the flags are cumulated from zero (unless no flag is entered and
       the default is used) you could accomplish the same display via:

         dieharder -a -D 8 -D pvalues
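
       The two forms are equivalent because each flag occupies its own bit.
       The Python sketch below shows the arithmetic using only the two flag
       values quoted above; the constant and function names are
       hypothetical, and the full list of flags should be taken from
       dieharder -F.

```python
TEST_NAME = 8   # value quoted above for the test_name flag
PVALUES = 128   # value quoted above for the pvalues flag


def combine_flags(*flags):
    """Combine single-bit flags; for distinct bits this equals their sum."""
    combined = 0
    for flag in flags:
        combined |= flag
    return combined


def is_set(value, flag):
    """Return True if `flag` is switched on in the combined `value`."""
    return value & flag == flag


if __name__ == "__main__":
    print(combine_flags(TEST_NAME, PVALUES))
```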

       Note that you can enter flags by value or by name, in any  combination.
       Because people use dieharder to obtain values and then wish to export
       them into spreadsheets (comma separated values) or into filter scripts,
       you can change the field separator character.  For example:

         dieharder -a -c ',' -D default -D -1 -D -2

       produces  output  that  is ideal for importing into a spreadsheet (note
       that one can subtract field values from the base set of fields provided
       by the default option as long as it is given first).

       An  interesting  option  is  the -D prefix flag, which turns on a field
       identifier prefix to make it easy to filter  out  particular  kinds  of
       data.   However,  it  is equally easy to turn on any particular kind of
       output to the exclusion of others directly by means of the flags.

       Two other flags of interest  to  novices  to  random  number  generator
       testing  are  the  -D histogram (turns on a histogram of the underlying
       pvalues, per test)  and  -D  description  (turns  on  a  complete  test
       description, per test).  These flags turn the output table into more of
       a series of "reports" of each test.


       dieharder is entirely original code and can be  modified  and  used  at
       will by any user, provided that:

         a) The original copyright notices are maintained and that the source,
       including all modifications, is made publicly available at the time
       of  any derived publication.  This is open source software according to
       the  precepts  and  spirit  of  the  Gnu  Public  License.    See   the
       accompanying file COPYING, which also must accompany any
       redistribution.

         b) The primary author of the code (Robert G. Brown) is  appropriately
       acknowledged and referenced in any derived publication.  It is strongly
       suggested that George Marsaglia and the Diehard suite and  the  various
       authors  of  the  Statistical  Test  Suite  be  similarly acknowledged,
       although this suite shares no actual code with these random number test
       suites.

         c)   Full   responsibility   for   the   accuracy,  suitability,  and
       effectiveness of the program rests with the users and/or modifiers.  As
       is clearly stated in the accompanying copyright.h:



       The  author of this suite gratefully acknowledges George Marsaglia (the
       author of the diehard test suite)  and  the  various  authors  of  NIST
       Special  Publication 800-22 (which describes the Statistical Test Suite
       for  testing   pseudorandom   number   generators   for   cryptographic
       applications),  for excellent descriptions of the tests therein.  These
       descriptions enabled this suite to be developed under the GPL.

       The author also wishes to reiterate that the academic  correctness  and
       accuracy   of   the   implementation   of   these  tests  is  his  sole
       responsibility and not that of  the  authors  of  the  Diehard  or  STS
       suites.   This is especially true where he has seen fit to modify those
       tests from their strict original descriptions.


       GPL 2b; see the file  COPYING  that  accompanies  the  source  of  this
       program.  This is the "standard Gnu General Public License version 2 or
       any  later  version",  with  the  one   minor   (humorous)   "Beverage"
       modification listed below.  Note that this modification is probably not
       legally defensible and can be followed really pretty much according  to
       the honor rule.

       As  to my personal preferences in beverages, red wine is great, beer is
       delightful, and Coca Cola or coffee or tea or even milk  acceptable  to
       those  who for religious or personal reasons wish to avoid stressing my

       The Beverage Modification to the GPL:

       Any satisfied user of this software shall,  upon  meeting  the  primary
       author(s)  of  this  software  for the first time under the appropriate
       circumstances, offer to buy him  or  her  or  them  a  beverage.   This
       beverage may or may not be alcoholic, depending on the personal ethical
       and moral views of the offerer.  The beverage cost need not exceed  one
       U.S.  dollar  (although  it certainly may at the whim of the offerer:-)
       and may be accepted or declined with no further obligation on the  part
       of  the  offerer.   It  is  not necessary to repeat the offer after the
       first meeting, but it can't hurt...