I’ve been on vacation for the past week, and I decided to take a look at using Test2 to reimplement the core of Test::Class::Moose.
Test::Class::Moose
(TCM) lets you write tests in the form of Moose classes. Your classes are
constructed and run by the TCM test runner. For each class, we constructor instances of the class
and then run the test_*
methods provided by that instance. We run the class itself in a subtest,
as well as each method. This leads to a lot of nested subtests (I’ll tell you why this matters
later). Here’s an example TCM class:
|
|
Currently, TCM is implemented on top of the existing Perl test stack consisting of Test::Builder
,
Test::More
, etc.
The fundamental problem with the existing test stack is that it is not abstract enough. The test
stack is all about producing TAP (the Test Anything Protocol). This is the text-based format
(mostly line-oriented) that you see when you run prove -v
. It’s possible to produce other
types of output or capture the test output to examine it, but it’s not nearly as easy as it should
be.
TAP is great for end users. It’s easy to read, and when tests fail it’s usually easy to see what happened. But it’s not so great for machines. The line-oriented protocol isn’t great for things like expressing a complex data structure, and the output format simply doesn’t allow you to express certain distinctions (diagnostics versus error messages, for example). Even worse is how the current TAP ecosystem handles subtests, which can be summarized as “it doesn’t handle subtests at all”. Here’s an example program:
|
|
If we run this via prove -v
we get this:
|
|
What happened there? Well, the TAP ecosystem more or less ignore the contents of a subtest. Any
line starting with space is treated as “unknown text”. What Test::Builder
does is keep track of
the subtest’s pass/fail status in order to print a test result at the next level up the stack
summarizing the subtest. That’s the ok 2 - this gets weird
line up above. Because it’s not
actually parsing the contents of the subtest, it doesn’t see that the test count is wrong or that
some tests have failed.
In practice, this won’t affect most code. As long as all your tests are emitted via Test::Builder you’re good to go. It does make life much harder for tools that want to actually look at the contents of subtests, in particular tools that want to emit a non-TAP format.
The core test stack tooling around concurrency is also fairly primitive. The test harness supports concurrency at the process level. It can fork off multiple test processes, track their TAP output separately, and generate a summary of the results. However, you cannot easily fork from inside a test process and emit concurrent TAP.
This concurrency issue really bit Test::Class::Moose
. Unlike traditional Perl test suites, with
TCM you normally run all of your tests starting from a single whatever.t
file. That file contains
just a few lines of code to create a TCM runner. The runner loads all of your test classes and
executes them. Here’s an example:
|
|
Ovid is a smart guy. He realized that once you have enough test classes, you’d really want to be able to run them concurrently. So he wrote TAP::Stream. This modules let you combine multiple streams of subtest-level TAP into a single top-level TAP stream.
This is completely and utterly insane! This is not Ovid’s fault. He was doing the best he could with the tools that existed. But it’s terribly fragile, and it’s way more work than it should be. It also made it incredibly difficult to provide feature parity between the parallel and sequential TCM test execution code. The parallel code has always been a bit broken, and there was a lot of nearly duplicated code between the two execution paths.
Enter Test2, which is Chad Granum (Exodist’s) project to implement a proper event-level
abstraction on top of all the test infrastructure. With Test2
, our fundamental layer is a stream
of events, not TAP. An event is a test result, a diagnostic, a subtest, etc. Subtests are proper
first class events which can in turn contain other events.
Working at this level makes writing TCM much easier. There’s still some trickiness involved in starting a subtest in one process but executing it’s contents in another, but the amount of duplicated code is greatly reduced, and it’s much easier to achieve feature parity between the parallel and sequential paths.
As a huge, huge bonus, testing tools built on top of Test2
is a pleasure instead of a chore. The
sad truth about TCM is that it was never as well tested as it should have been. The tools for
testing with Test::Builder
are primitive at best, and because of the fact that subtests are
ignored by TAP, the testing tools were nearly useless for TCM.
With Test2
we can capture and examine the event stream of a test run in incredible detail. This
lets me write very detailed tests for the behavior of TCM in all sorts of success and failure
scenarios, which is fantastically useful. Here’s a snippet of what this looks like:
|
|
The test_events_is
sub is a helper I wrote using the Test2
tools. All it does is add some useful
diagnostic output if the event stream from running TCM contains Test2::Event::Exception
events. And the diagnostics from Test2 are simply beautiful:
|
|
It’s a lot to read but it’s incredibly detailed and makes understanding why a test failed much easier than the current test stack.
Chad is currently working on finishing up Test2
and making sure that it’s stable and
backwards-compatible enough to replace the existing test suite stack. Once Test::More
,
Test::Builder
, and friends are all running on top of Test2, it will make it much easier to write
new test tools that integrate with this infrastructure.
The future of testing in Perl 5 is looking bright! And Perl 6 isn’t being left behind. I’ve been working on a similar project in Perl 6 with the current placeholder name of Test::Stream. This is a little easier than the Perl 5 effort since there’s no large body of test tools with which I need to ensure backwards compatibility. I want Perl 6 to have the same excellent level of test infrastructure that Perl 5 is going to be enjoying soon.
Comments
Timm Murray, on 2016-02-29 06:48, said:
There was a proposal on the old testanything.org web site that would have handled forking streams:
http://web.archive.org/web/20110611083912/http://testanything.org/wiki/index.php/Test_Groups
Not the prettiest output, but it should work.