Having recently worked quite a bit on some testing tools, including tools to parse TAP, I’ve become intimately familiar with its shortcomings. I’m going to write these up here in the hopes that future generations of test output format developers will not repeat these mistakes.

TAP is (mostly) well-suited for human consumption. It’s easy to read, and when tests fail, it’s easy to figure out what failed and why. But the simplicity that makes it easy to read also makes it really difficult to parse properly. This leads to heuristics, and heuristics are the devil.

No Connection Between Connected Output

What happens when a test fails?
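Assume a minimal Test::More script along these lines (the specific failing assertion doesn’t matter):

use strict;
use warnings;
use Test::More;

is( 1 + 1, 3, 'addition works' );

done_testing();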

We get output much like this (the file name and line number come from the hypothetical script above):
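not ok 1 - addition works
#   Failed test 'addition works'
#   at t/example.t line 5.
#          got: '2'
#     expected: '3'
1..1
# Looks like you failed 1 test of 1.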

This is nice and readable. It’s very clear what went wrong. But from a parsing perspective it’s a nightmare. There is no indication that the diagnostic output (the lines starting with #) is connected to the previous not ok line. So you have to implement a heuristic in your parser along the lines of “diagnostic output after a test is connected to that test”. That might be correct, or it might not.
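Consider a hypothetical script where a diag() call fires between two tests but describes the one that follows:

use strict;
use warnings;
use Test::More;

ok( 1, 'first test' );
diag('The next test is about to fail');
ok( 0, 'second test' );

done_testing();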

Our output now looks something like this:
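ok 1 - first test
# The next test is about to fail
not ok 2 - second test
#   Failed test 'second test'
#   at t/example.t line 7.
1..2
# Looks like you failed 1 test of 2.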

Now our heuristic about diagnostic output is clearly wrong.

There are further heuristics we could try. We could parse the diagnostic content and look for leading space, but this really depends on the vagaries of how each test tool formats its output.

Figuring out what the output means is trivial for a human, but incredibly hard to code for.

Multiple Output Handles for One Stream

TAP sends output to both stdout and stderr in parallel. Most of the output goes to stdout, but diagnostic output goes to stderr. This causes all sorts of problems.

The stdout handle is usually buffered but stderr is not. In practice, this means that the output gets weird, with stderr output sometimes appearing much earlier than its related stdout output. One way to fix this is to enable autoflush on STDOUT in the test code itself.
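That’s a two-liner at the top of the test file (IO::Handle is core, and loading it makes the autoflush method available on older perls):

use IO::Handle;

# Flush STDOUT after every write instead of waiting for the buffer to fill.
STDOUT->autoflush(1);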

With prove you can also pass the -m flag to merge the streams, which generally produces saner interleaving. The downside of merging is that you can no longer tell whether a line like # foo is diagnostic-level output, because that same line could be sent to stdout as a note.
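For example, to run one file with merged output:

prove -m t/some-test.t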

Needless to say, having multiple handles contributes to more parsing problems. The grouping problem discussed above becomes much harder when the related lines are split across two handles.

The Epic Subtest Hack

The original versions of TAP had no concept of subtests. It may surprise you to know that modern TAP has no concept of subtests either!

Subtests are a clever hack that takes advantage of the original TAP spec’s simplicity. A normal TAP parser only cares about lines that start with non-whitespace. For example:
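ok 1 - first test
    ok 1 - this is indented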

A TAP parser should only see one test here. The second line is ignored because of its leading whitespace.

Subtests take advantage of this to present human-readable output that the TAP parser ignores.
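Take a small subtest example along these lines:

use strict;
use warnings;
use Test::More tests => 1;

subtest 'my subtest' => sub {
    ok( 1, 'this is a test' );
    ok( 1, 'this is another test' );
    done_testing();
};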

The output looks something like this (the exact formatting varies a bit across Test::More versions):
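1..1
    # Subtest: my subtest
    ok 1 - this is a test
    ok 2 - this is another test
    1..2
ok 1 - my subtest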

TAP parsers will ignore all of the indented output, since those lines start with whitespace. All the parser sees here is the first and last lines.

This is a great hack, since it let test tools generate subtest output while existing parsers like Test::Harness continued to work unchanged.

Unfortunately, this also means that none of the parsing tools until Test2::Harness::Parser::TAP actually parsed subtest output. If you were trying to transform TAP output into something else you had to write your own subtest parsing, which is pretty hard to get right. I think Test2::Harness::Parser::TAP is pretty good at this point, but getting there was no mean feat.

None of This Matters Any More

With Test2 out, all of my complaints are more or less irrelevant. Internally, Test2 represents testing events as event objects. If we want to understand what’s happening in our tests, we can look at these events. If we want to run tests in a separate process and capture the event output, we can tell those test processes to use the Test2::Formatter::EventStream formatter instead of TAP. This formatter dumps events as JSON objects which we can easily parse and turn back into event objects in another process.

As of the 1.302074 release of Test2, events include a cid (context id) attribute. Multiple events with the same cid are related, so a test ok/not ok event followed by diagnostics can be easily grouped together.
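Conveniently, switching formatters doesn’t require changing the test code at all. Test2 respects the T2_FORMATTER environment variable, so (assuming the formatter is installed) something like this does the trick:

T2_FORMATTER=EventStream perl t/basic.t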

The future of testing tools is looking very bright with Test2! TAP still drives me crazy, but now it’s mostly doing that in theory rather than in practice.

My last day at MaxMind will be January 5, 2017. For the curious, I left of my own free will. I was not laid off, fired, put into a cannon and launched into space, or NDA’d as seen on the TV show Incorporated.

So now it’s time to find something else to do to fill my time and checking account. I’m open to both consulting work and full-time employment.

My ideal position would look a lot like what I did at MaxMind. I would absolutely love to help a company grow its engineering side, including working on hiring, product development, development processes, technical standards, etc. I really enjoyed this part of my position at MaxMind, where as Software Engineering Team Lead I helped the company grow from 3 to 15 engineering staff over about 5 years.

That said, I’m not too keen on working 80 hours a week at a startup. I’ve done the startup thing, and it was fun, but also stressful and chaotic. Instead, I’d love to help a small profitable company grow into a larger, more profitable company.

I realize that such positions are few and far between, and my position at MaxMind may have been a unicorn. So my not-quite-ideal-but-still-very-satisfying fallback would be some sort of technical leadership role at a company, where I can provide guidance and mentoring to others, as well as write code. And I’m open to being just a developer too, as long as my employer already has their stuff together. Please note I’m not looking to move into a position that is just management. I still enjoy coding and solving technical challenges.

My position at MaxMind was four 8-hour days a week. I loved that schedule, and would love to continue it. Something along those lines would be incredibly attractive.

I live in Minneapolis, MN and will not relocate, but I am open to a reasonable amount of travel for face-to-face interactions.

You can read my resume online. Please contact me if you’re looking for someone like me.

Have you seen my new module, Params::ValidationCompiler? It does pretty much everything that MooseX::Params::Validate and Params::Validate do, but way faster. As such, I don’t plan on using either of those modules in new code, and I’ll be converting over my old code as I get the chance. I’d suggest that you consider doing the same. Based on my benchmarks, the speed gains are quite significant.
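If you haven’t tried it yet, basic usage looks something like this (the types here come from Types::Standard, but Moose and Specio types work too):

use Params::ValidationCompiler qw( validation_for );
use Types::Standard qw( Int Str );

# The checking sub is compiled once and reused on every call.
my $check = validation_for(
    params => {
        name => { type => Str },
        size => { type => Int, default => 10 },
    },
);

sub set_name_and_size {
    my %args = $check->(@_);

    # $args{name} and $args{size} have been validated (and defaulted).
    ...;
}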

Since I’m not going to use them any more, these two modules could use some maintenance love. Please contact me if you’re interested.

I’ve been thinking about DateTime recently and I’ve come to the conclusion that the Perl community would be much better off if there were a DateTime core team maintaining the core DateTime modules. DateTime.pm, the main module, is used by several thousand other CPAN distros, either directly or indirectly. Changes to DateTime.pm (or anything that it in turn relies on) have a huge impact on CPAN.

I’ve been maintaining DateTime.pm, DateTime::Locale, and DateTime::TimeZone as a mostly solo effort for a long time, but that’s not a good thing. The main thing I’d like from other team members is a commitment to review PRs on a regular basis. I think that having some sort of code review on changes I propose would be very helpful. Of course, if you’re willing to respond to bugs, write code, do releases, and so on, that’s even better.

Please comment on this blog post if you’re interested in this. Some things to think about include …

  • What sort of work are you comfortable doing? The work includes code review, responding to bug reports, writing code to fix bugs and/or add features, testing on platforms not supported by Travis, and doing releases.
  • How would you like to communicate about these things? There is an existing datetime@perl.org list, but I generally prefer IRC or Slack for code discussion.
  • Would you prefer to use GH issues instead of RT? (I’m somewhat leaning towards yes, but I’m okay with leaving things in RT too.)

The same request for maintenance help really applies to anything else I maintain that is widely used, including Params::Validate (which I’m no longer planning to use in new code myself) and Log::Dispatch. I’d really love to have more help maintaining all of this stuff.

If you have something to say that you’re not comfortable saying in a comment, feel free to email me.

My employer MaxMind is hiring for two engineering positions: a Software Engineer in Test and a Software Engineer. If you’ve always wanted to work with me, here’s your chance. If you’ve always wanted to avoid working with me, now you have the knowledge needed to achieve that goal. It’s a win-win either way!

Note that while these are remote positions, we’re pretty limited in what US states we can hire from (Massachusetts, Minnesota, Montana, North Carolina, and Oregon). All of Canada is fair game. I’m trying to figure out if we can expand the state pool somehow. If you think you’re the awesomest candidate ever, send your resume anyway. That way if something does change, we have you on our list.

I recently released a new parameter validation module tentatively called Params::CheckCompiler (aka PCC, better name suggestions welcome) (Edit: Now renamed to Params::ValidationCompiler). Unlike Params::Validate (aka PV), this new module generates a highly optimized type checking subroutine for a given set of parameters. If you use a type system capable of generating inlined code, this can be quite fast. Note that all of the type systems supported by PCC allow inlining (Moose, Type::Tiny, and Specio).

I’ve been working on a branch of DateTime that uses PCC. Parameter validation, especially for constructors, is a significant contributor to slowness in DateTime. The branch, for the curious.

I wrote a simple benchmark to compare the speed of DateTime->new with PCC vs PV.
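The script boils down to something like this, using the core Benchmark module (the exact constructor arguments are arbitrary):

use strict;
use warnings;

use Benchmark qw( timethese );
use DateTime;

timethese(
    100_000,
    {
        constructor => sub {
            DateTime->new(
                year   => 2016,
                month  => 12,
                day    => 16,
                hour   => 12,
                minute => 34,
                second => 56,
            );
        },
    }
);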

Running it with master produces:

autarch@houseabsolute:~/projects/DateTime.pm (master $%=)$ perl -Mblib ./bench.pl 
Benchmark: timing 100000 iterations of constructor...
constructor:  6 wallclock secs ( 6.11 usr +  0.00 sys =  6.11 CPU) @ 16366.61/s (n=100000)

And with the use-pcc branch:

autarch@houseabsolute:~/projects/DateTime.pm (use-pcc $%=)$ perl -I ../Specio/lib/ -I ../Params-CheckCompiler/lib/ -Mblib ./bench.pl 
Benchmark: timing 100000 iterations of constructor...
constructor:  5 wallclock secs ( 5.34 usr +  0.01 sys =  5.35 CPU) @ 18691.59/s (n=100000)

So we can see that there’s a speedup of about 14%, which is pretty good!

I figured that this should be reflected in the speed of the entire test suite, so I started timing that between the two branches. But I was wrong. The use-pcc branch took about 15s to run versus 11s for master! What was going on?

After some profiling, I finally realized that while using PCC with Specio sped up run time noticeably, it also added an additional compile time hit. It’s Moose all over again, though not nearly as bad.

For further comparison, I used yath, the test harness script that ships with Test2::Harness, and told it to preload DateTime. Now the test suite runs slightly faster in the use-pcc branch, about 4% or so.
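From the shell, that was something along these lines (the exact spelling of the preload flag depends on your Test2::Harness version):

yath -PDateTime t/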

So where does that leave things?

One thing I’m completely sure of is that if you’re using MooseX::Params::Validate (aka MXPV), then switching to Params::CheckCompiler is going to be a huge win. This was my original use case for PCC, since some profiling at work showed MXPV as a hot spot in some code paths. I have some benchmarks comparing MXPV and PCC, which I’ll post here some time; they show PCC as about thirty times faster.

Switching from PV to PCC is less obvious. If your module is already using a type system for its constructor, then there are no extra dependencies, so the small compile time hit may be worth it.

In the case of DateTime, adding PCC alone adds a number of dependencies and Specio adds a few more to that.

“Why use Specio over Type::Tiny?”, you may wonder. Well, Type::Tiny doesn’t support overloading for core types, for one thing. I noticed some DateTime tests checking that you can use an object which overloads numification in some cases. I don’t remember why I added that, but I suspect it was to address some bug or make sure that DateTime played nice with some other module. I don’t want to break that, and I don’t want to build my own parallel set of Type::Tiny types with overloading support. Plus I really don’t like the design of Type::Tiny, which emulates Moose’s design. But that’s a blog post for another day.

If you’re still reading, I’d appreciate your thoughts on this. Is the extra runtime speed worth the compile time hit and extra dependencies? I’ve been working on reducing the number of deps for Specio and PCC, but I’m not quite willing to go the zero deps route of Type::Tiny yet. That would basically mean copying in several CPAN modules to the Specio distro, which is more or less what Type::Tiny did with Eval::Closure.

I’d also have to either remove 5.8.x support from DateTime or make Specio and PCC support 5.8. The former is tempting but the latter probably isn’t too terribly hard. Patches welcome, of course ;)

If I do decide to move forward with DateTime+PCC, I’ll go slow and do some trial releases of DateTime first, as well as doing dependent module testing for DateTime so as to do my best to avoid breaking things. Please don’t panic. Flames and panic in the comment section will be deleted.

Edit: Also on the Perl subreddit.

It’s not too late to sign up for my Introduction to Moose class at YAPC::NA 2016. This year’s class will take place on Thursday, June 23. I’m excited to be doing this course again. It’s gotten great reviews from past students. Sign up today.

There are lots of other great courses. For the first time ever, I’m also going to be a student. I’m looking forward to attending Damian Conway’s Presentation Aikido course on Friday, June 24.


What sort of things can you learn when interviewing someone for a technical position? What questions are useful?

This is a much-discussed and sometimes hotly debated topic in the tech world. I’ve done a fair bit of interviewing for my employer over the past few years. We’ve built an excellent technical team, either because or in spite of the interviews I’ve done.

Here’s my unsubstantiated theory about unstructured interviews and what they’re good for. (My personal opinion, not my employer’s!)

First of all, I know all about the research that says that unstructured interviews don’t predict performance for technical positions. I agree 100%. This is why we give candidates some sort of technical homework before scheduling an interview. We expect this to take a few (2-3) hours at most. We review this homework before we decide whether or not to continue with the interview process. I weight this fairly heavily, and I’ve rejected candidates simply based on reviewing their homework submission.

But the unstructured interview is still important. Here are some of the things I think I can learn from the interview, and some of the questions I use to figure those things out.

Does the candidate actually want this particular job? Enthusiasm matters. I’m not looking for a cheerleader, but I also don’t want someone who’d be happy with absolutely any job. One question I might ask to get a sense of this would be “What appeals to you about this position?” I can also get a sense of this based on the questions that the candidate asks me.

Is this position a good fit for this particular candidate? I want to make sure that the candidate has a clear understanding of the position, specifically their work duties, time requirements, expectations, etc. Some questions along these lines would be “What are the most important things for you in a position?” and “What do you need from the rest of the company in order to do your best work?” If someone says that the most important thing is working on mobile apps written in Haskell and we don’t do that (because no one does), then that’s a good reason not to hire them!

Can they telecommute and/or work with telecommuters effectively? Most of our team is remote, so even if a team member works at the office, they are effectively telecommuting. I just want to be sure that they either have experience with telecommuting or some idea of what this entails. If they haven’t done it before, do they have a work space that they can use? Do they have a plan for dealing with the challenges of working at home?

Can the candidate communicate effectively? Are they pleasant to talk to? Some people are not good communicators. Sometimes two people just don’t mesh well and rub each other the wrong way. Maybe I interview someone and just don’t enjoy talking to them. This doesn’t mean they’re a bad person or bad at their job, but it does mean that we shouldn’t work together.

Can they communicate effectively about technical topics? One question I’ve asked people is simply “What is OO?” There are many right answers to this question, but the real goal is to make sure that they can communicate about a technical topic clearly. If someone doesn’t know any of the terminology around OO (“class”, “instance”, “attribute/field/slot”, etc.) then it’s going to be hard to provide code review on OO code. Note that some people can write code well and still not be able to communicate about it.

Can they communicate effectively with non-technical people? For most positions I’ve hired for, our expectation is that the person being hired will be working not just with the engineering team, but also with a product manager, sales and marketing, support, etc. I want to make sure that the candidate can communicate with these people. We do a short role-playing exercise where one of the interviewers pretends to be a non-technical customer who wants a specific product built. The candidate then asks the interviewer questions to get a sense of the product requirements, constraints, etc.

Do they care about technical stuff? I want people who actually have opinions about doing their job well. I may ask them what tools they like and dislike, what they’d change about the tools they’ve used, etc. On the flip side, I don’t want someone who’s dogmatically opinionated either. (Or at the least, no more dogmatically opinionated than I am.)

Do they ask good questions? I expect candidates to come to the interview with questions of their own. If they don’t ask any, that’s a red flag that makes me wonder if they don’t care about their work environment, work process, etc. This does not make for an engaged coworker.

There are also things I don’t look for in an interview.

Cultural fit. What is this? I have no idea. It’s way too broad and an easy excuse to simply reject people for not being enough like me.

Code-writing skill. That was already covered by the homework. I never ask specific technical questions unless I have a good reason to believe that the candidate knows about the topic. I might think that based on something specific in their cover letter or resume, or better yet based on their homework or FOSS work.

Will they work out long term? You can’t really answer this confidently from a short interview. What you can do is screen for obvious red flags that indicate that this person will not work well with the team you have, or that they will not like this position. In the latter case, I hope that this is a mutual decision. In my own job searches, I’ve had interviews where I came out knowing I didn’t want the job, and I consider those interviews quite successful!

Will any of this guarantee that I will always find the best people? No, obviously not. Hiring is a difficult thing to do. But if you consider the goals of the interview carefully, you can make the best use of that time to improve your chances of finding the right people.