Ubuntu Manpage: Test::Harness::Beyond


       Test::Harness::Beyond - Beyond make test


       Test::Harness is responsible for running test scripts, analysing their output and
       reporting success or failure. When I type make test (or ./Build test) for a module,
       Test::Harness is usually used to run the tests (not all modules use Test::Harness but the
       majority do).

       To start exploring some of the features of Test::Harness I need to switch from make test
       to the prove command (which ships with Test::Harness). For the following examples I'll
       also need a recent version of Test::Harness installed; 3.14 is current as I write.

       For the examples I'm going to assume that we're working with a 'normal' Perl module
       distribution. Specifically I'll assume that typing make or ./Build causes the built,
       ready-to-install module code to be available below ./blib/lib and ./blib/arch and that
       there's a directory called 't' that contains our tests. Test::Harness isn't hardwired to
       that configuration but it  saves me from explaining which files live where for each
       example.

       Back to prove; like make test it runs a test suite - but it provides far more control over
       which tests are executed, in what order and how their results are reported. Typically make
       test runs all the test scripts below the 't' directory. To do the same thing with prove I
       type:

         prove -rb t

       The switches here are -r to recurse into any directories below 't' and -b which adds
       ./blib/lib and ./blib/arch to Perl's include path so that the tests can find the code they
       will be testing. If I'm testing a module of which an earlier version is already installed
       I need to be careful about the include path to make sure I'm not running my tests against
       the installed version rather than the new one that I'm working on.

       Unlike make test, typing prove doesn't automatically rebuild my module. If I forget to
       make before prove I will be testing against older versions of those files - which
       inevitably leads to confusion.  I either get into the habit of typing

         make && prove -rb t

       or - if I have no XS code that needs to be built I use the modules below lib instead

         prove -Ilib -r t

       So far I've shown you nothing that make test doesn't do. Let's fix that.

   Saved State
       If I have failing tests in a test suite that consists of more than a handful of scripts
       and takes more than a few seconds to run it rapidly becomes tedious to run the whole test
       suite repeatedly as I track down the problems.

       I can tell prove just to run the tests that are failing like this:

         prove -b t/this_fails.t t/so_does_this.t

       That speeds things up but I have to make a note of which tests are failing and make sure
       that I run those tests. Instead I can use prove's --state switch and have it keep track of
       failing tests for me. First I do a complete run of the test suite and tell prove to save
       the results:

         prove -rb --state=save t

       That stores a machine readable summary of the test run in a file called '.prove' in the
       current directory. If I have failures I can then run just the failing scripts like this:

         prove -b --state=failed

       I can also tell prove to save the results again so that it updates its idea of which tests
       failed:

         prove -b --state=failed,save

       As soon as one of my failing tests passes it will be removed from the list of failed
       tests. Eventually I fix them all and prove can find no failing tests to run:

         Files=0, Tests=0, 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
         Result: NOTESTS

       As I work on a particular part of my module it's most likely that the tests that cover
       that code will fail. I'd like to run the whole test suite but have it prioritize these
       'hot' tests. I can tell prove to do this:

         prove -rb --state=hot,save t

       All the tests will run but those that failed most recently will be run first. If no tests
       have failed since I started saving state all tests will run in their normal order. This
       combines full test coverage with early notification of failures.

       The --state switch supports a number of options; for example to run failed tests first
       followed by all remaining tests ordered by the timestamps of the test scripts - and save
       the results - I can use

         prove -rb --state=failed,new,save t

       See the prove documentation (type prove --man) for the full list of state options.

       When I tell prove to save state it writes a file called '.prove' ('_prove' on Windows) in
       the current directory. It's a YAML document so it's quite easy to write tools of your own
       that work on the saved test state - but the format isn't officially documented so it might
       change without (much) warning in the future.

   Parallel Testing
       If my tests take too long to run I may be able to speed them up by running multiple test
       scripts in parallel. This is particularly effective if the tests are I/O bound or if I
       have multiple CPU cores. I tell prove to run my tests in parallel like this:

         prove -rb -j 9 t

       The -j switch enables parallel testing; the number that follows it is the maximum number
       of tests to run in parallel. Sometimes tests that pass when run sequentially will fail
       when run in parallel. For example if two different test scripts use the same temporary
       file or attempt to listen on the same socket I'll have problems running them in parallel.
       If I see unexpected failures I need to check my tests to work out which of them are
       trampling on the same resource and rename temporary files or add locks as appropriate.

       To get the most performance benefit I want to have the test scripts that take the longest
       to run start first - otherwise I'll be waiting for the one test that takes nearly a minute
       to complete after all the others are done. I can use the --state switch to run the tests
       in slowest to fastest order:

         prove -rb -j 9 --state=slow,save t

   Non-Perl Tests
       The Test Anything Protocol (http://testanything.org/) isn't just for Perl. Just about any
       language can be used to write tests that output TAP. There are TAP based testing libraries
       for C, C++, PHP, Python and many others. If I can't find a TAP library for my language of
       choice it's easy to generate valid TAP. It looks like this:

         1..3
         ok 1 - init OK
         ok 2 - opened file
         not ok 3 - appended to file

       The first line is the plan - it specifies the number of tests I'm going to run so that
       it's easy to check that the test script didn't exit before running all the expected tests.
       The following lines are the test results - 'ok' for pass, 'not ok' for fail. Each test has
       a number and, optionally, a description. And that's it. Any language that can produce
       output like that on STDOUT can be used to write tests.

       Recently I've been rekindling a two-decades-old interest in Forth.  Evidently I have a
       masochistic streak that even Perl can't satisfy.  I want to write tests in Forth and run
       them using prove (you can find my gforth TAP experiments at
       https://svn.hexten.net/andy/Forth/Testing/). I can use the --exec switch to tell prove to
       run the tests using gforth like this:

         prove -r --exec gforth t

       Alternately, if the language used to write my tests allows a shebang line I can use that
       to specify the interpreter. Here's a test written in PHP:

         #!/usr/bin/php
         <?php
           print "1..2\n";
           print "ok 1\n";
           print "not ok 2\n";
         ?>

       If I save that as t/phptest.t the shebang line will ensure that it runs correctly along
       with all my other tests.

   Mixing it up
       Subtle interdependencies between test programs can mask problems - for example an earlier
       test may neglect to remove a temporary file that affects the behaviour of a later test. To
       find this kind of problem I use the --shuffle and --reverse options to run my tests in
       random or reversed order.

   Rolling My Own
       If I need a feature that prove doesn't provide I can easily write my own.

       Typically you'll want to change how TAP gets input into and output from the parser.
       App::Prove supports arbitrary plugins, and TAP::Harness supports custom formatters and
       source handlers that you can load using either prove or Module::Build; there are many
       examples to base mine on.  For more details see App::Prove, TAP::Parser::SourceHandler,
       and TAP::Formatter::Base.

       If writing a plugin is not enough, you can write your own test harness; one of the motives
       for the 3.00 rewrite of Test::Harness was to make it easier to subclass and extend.

       The Test::Harness module is a compatibility wrapper around TAP::Harness.  For new
       applications I should use TAP::Harness directly. As we'll see, prove uses TAP::Harness.

       When I run prove it processes its arguments, figures out which test scripts to run and
       then passes control to TAP::Harness to run the tests, parse, analyse and present the
       results. By subclassing TAP::Harness I can customise many aspects of the test run.

       I want to log my test results in a database so I can track them over time. To do this I
       override the summary method in TAP::Harness.  I start with a simple prototype that dumps
       the results as a YAML document:

         package My::TAP::Harness;

         use base 'TAP::Harness';
         use YAML;

         sub summary {
           my ( $self, $aggregate ) = @_;
           print Dump( $aggregate );
           $self->SUPER::summary( $aggregate );
         }

         1;

       I need to tell prove to use my My::TAP::Harness. If My::TAP::Harness is on Perl's @INC
       include path I can

         prove --harness=My::TAP::Harness -rb t

       If I don't have My::TAP::Harness installed on @INC I need to provide the correct path to
       perl when I run prove:

         perl -Ilib `which prove` --harness=My::TAP::Harness -rb t

       I can incorporate these options into my own version of prove. It's pretty simple. Most of
       the work of prove is handled by App::Prove.  The important code in prove is just:

         use App::Prove;

         my $app = App::Prove->new;
         $app->process_args(@ARGV);
         exit( $app->run ? 0 : 1 );

       If I write a subclass of App::Prove I can customise any aspect of the test runner while
       inheriting all of prove's behaviour. Here's myprove:

         #!/usr/bin/env perl use lib qw( lib );      # Add ./lib to @INC
         use App::Prove;

         my $app = App::Prove->new;

         # Use custom TAP::Harness subclass
         $app->harness( 'My::TAP::Harness' );

         $app->process_args( @ARGV ); exit( $app->run ? 0 : 1 );

       Now I can run my tests like this

         ./myprove -rb t

   Deeper Customisation
       Now that I know how to subclass and replace TAP::Harness I can replace any other part of
       the harness. To do that I need to know which classes are responsible for which
       functionality. Here's a brief guided tour; the default class for each component is shown
       in parentheses. Normally any replacements I write will be subclasses of these default
       classes.

       When I run my tests TAP::Harness creates a scheduler (TAP::Parser::Scheduler) to work out
       the running order for the tests, an aggregator (TAP::Parser::Aggregator) to collect and
       analyse the test results and a formatter (TAP::Formatter::Console) to display those
       results.

       If I'm running my tests in parallel there may also be a multiplexer
       (TAP::Parser::Multiplexer) - the component that allows multiple tests to run
       simultaneously.

       Once it has created those helpers TAP::Harness starts running the tests. For each test it
       creates a new parser (TAP::Parser) which is responsible for running the test script and
       parsing its output.

       To replace any of these components I call one of these harness methods with the name of
       the replacement class:

         aggregator_class
         formatter_class
         multiplexer_class
         parser_class
         scheduler_class

       For example, to replace the aggregator I would

         $harness->aggregator_class( 'My::Aggregator' );

       Alternately I can supply the names of my substitute classes to the TAP::Harness
       constructor:

         my $harness = TAP::Harness->new(
           { aggregator_class => 'My::Aggregator' }
         );

       If I need to reach even deeper into the internals of the harness I can replace the classes
       that TAP::Parser uses to execute test scripts and tokenise their output. Before running a
       test script TAP::Parser creates a grammar (TAP::Parser::Grammar) to decode the raw TAP
       into tokens, a result factory (TAP::Parser::ResultFactory) to turn the decoded TAP results
       into objects and, depending on whether it's running a test script or reading TAP from a
       file, scalar or array a source or an iterator (TAP::Parser::IteratorFactory).

       Each of these objects may be replaced by calling one of these parser methods:

         source_class
         perl_source_class
         grammar_class
         iterator_factory_class
         result_factory_class

   Callbacks
       As an alternative to subclassing the components I need to change I can attach callbacks to
       the default classes. TAP::Harness exposes these callbacks:

         parser_args      Tweak the parameters used to create the parser
         made_parser      Just made a new parser
         before_runtests  About to run tests
         after_runtests   Have run all tests
         after_test       Have run an individual test script

       TAP::Parser also supports callbacks; bailout, comment, plan, test, unknown, version and
       yaml are called for the corresponding TAP result types, ALL is called for all results,
       ELSE is called for all results for which a named callback is not installed and EOF is
       called once at the end of each TAP stream.

       To install a callback I pass the name of the callback and a subroutine reference to
       TAP::Harness or TAP::Parser's callback method:

         $harness->callback( after_test => sub {
           my ( $script, $desc, $parser ) = @_;
         } );

       I can also pass callbacks to the constructor:

         my $harness = TAP::Harness->new({
           callbacks => {
                   after_test => sub {
               my ( $script, $desc, $parser ) = @_;
               # Do something interesting here
                   }
           }
         });

       When it comes to altering the behaviour of the test harness there's more than one way to
       do it. Which way is best depends on my requirements. In general if I only want to observe
       test execution without changing the harness' behaviour (for example to log test results to
       a database) I choose callbacks. If I want to make the harness behave differently
       subclassing gives me more control.

   Parsing TAP
       Perhaps I don't need a complete test harness. If I already have a TAP test log that I need
       to parse all I need is TAP::Parser and the various classes it depends upon. Here's the
       code I need to run a test and parse its TAP output

         use TAP::Parser;

         my $parser = TAP::Parser->new( { source => 't/simple.t' } );
         while ( my $result = $parser->next ) {
           print $result->as_string, "\n";
         }

       Alternately I can pass an open filehandle as source and have the parser read from that
       rather than attempting to run a test script:

         open my $tap, '<', 'tests.tap'
           or die "Can't read TAP transcript ($!)\n";
         my $parser = TAP::Parser->new( { source => $tap } );
         while ( my $result = $parser->next ) {
           print $result->as_string, "\n";
         }

       This approach is useful if I need to convert my TAP based test results into some other
       representation. See TAP::Convert::TET (http://search.cpan.org/dist/TAP-Convert-TET/) for
       an example of this approach.

   Getting Support
       The Test::Harness developers hang out on the tapx-dev mailing list[1]. For discussion of
       general, language independent TAP issues there's the tap-l[2] list. Finally there's a wiki
       dedicated to the Test Anything Protocol[3]. Contributions to the wiki, patches and
       suggestions are all welcome.

       [1] <http://www.hexten.net/mailman/listinfo/tapx-dev> [2]
       <http://testanything.org/mailman/listinfo/tap-l> [3] <http://testanything.org/>

NAME

Beyond make test