I’ve been playing with Dist::Zilla lately, and while I like it, I’ve realized that it has some perhaps not-so-obvious cons. There are also some obvious cons, and some obvious pros.

In talking about cons, there are really two categories. Some of the cons are essential to the design of dzil. Others are non-essential and can easily be fixed in the future, given a sufficient supply of round tuits. Obviously, the essential cons are the most important.

Let’s get the non-essential ones out of the way. The obvious one is that the docs are pretty minimal right now. I found that to really get what I wanted, I had to mix together cargo-culting and source diving. I still don’t understand how the heck I can make use of Pod::Weaver.

A closely related problem is that while there are lots of dzil plugins, they too are mostly poorly documented, and they’re also insufficiently flexible. A good example is the Dist::Zilla::Plugin::PodSpellingTests plugin. Spell checking your POD is great, and I’d love to automate it as much as I can. However, if you’re doing spell checking, you must include a custom dictionary that includes things like your name.

This plugin adds a wordlist that the author created in the form of a CPAN module. That’s not very useful when the wordlist module doesn’t include a word you want to whitelist. There’s no way to provide an alternate module. Of course, the real problem is that this is a terrible interface. I don’t want to release a new distro every time I add a word to my wordlist. The right way to do this is to look for a .pod-spelling file in the distro root.

Ultimately, I skipped this plugin and created a POD spelling test “by hand”.
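For the curious, a hand-rolled test of this sort might look something like the following. This is a minimal sketch using Test::Spelling; the stopword list is just an example of the kind of words you’d whitelist:

```perl
use strict;
use warnings;
use Test::More;

BEGIN {
    eval { require Test::Spelling; Test::Spelling->import() }
        or plan skip_all => 'Test::Spelling required for this test';
}

# Whitelist words a stock dictionary won't know -- author names,
# project jargon, and so on.
add_stopwords(qw( dzil distro CPAN wordlist ));

all_pod_files_spelling_ok('lib');
```

From here, reading extra stopwords from a .pod-spelling file in the distro root is a few lines of file-slurping away.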

Let’s not pick on Marcel too much. My own dzil Mercurial plugin is pretty minimal too. It works for me, but may not satisfy anyone else.

Also, dzil is slow. It uses Moose for a CLI app, which is a known-slow combination. Someone should improve Moose startup speed ;)

But as I said, these are non-essential problems, and all entirely fixable.

So what can’t be fixed?

Ultimately, using dzil to its utmost means creating a sharp divide between the source repository and released code. Dzil is in part a big ol’ pre-processor. It does things like add a $VERSION to each module, add boilerplate to the POD, generate a LICENSE file, etc.
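A dist.ini fragment gives the flavor of this pre-processing. The plugin names below are real, but the selection is purely illustrative:

```ini
; dist.ini -- a minimal sketch; plugin selection is illustrative
[GatherDir]      ; pull the source files into the build
[PkgVersion]     ; insert a $VERSION line into each module
[License]        ; generate the LICENSE file at build time
[PodSyntaxTests] ; add a POD syntax test to the release
```

The built distribution that ships to CPAN contains all of this generated material, while the source repository contains none of it.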

Of course, Perl module authors are already accustomed to this. I’m sure that most authors don’t check their META.yml files into source control and edit them by hand. Instead, they’re updated as part of the release process. Dzil just takes this several steps further.

However, some of these steps can be particularly problematic. If you allow dzil to add the $VERSION line, that means that when you use the distro’s modules directly from the lib directory, they have no version. This can be a problem if you’re trying to test some other module against the source repo, and that other module has a minimum version requirement.

Similarly, when you run tests with prove, you’re testing something that isn’t quite what gets released. Don’t worry too much; when you dzil release, it runs the tests against the post-processed code, so you’re not likely to incur bugs this way.

You can choose not to use the $VERSION-inserting plugin and maintain the $VERSION manually; dzil still has lots of other useful features. Nonetheless, this sort of issue is likely to crop up with other plugins.

So what are the pros? Ultimately, it makes maintaining modules easier. The less non-essential work I have to do in order to make a new release, the better. Also, some of the plugins do things to ensure that my releases are not broken, like checking for an update to Changes that matches the current module version, or ensuring that I have pod syntax tests as part of the release.

For someone like me, who has dozens of modules on CPAN, these time savings really add up.

Overall, I’m pretty happy with dzil, and I consider the eliminated drudgery a win, despite the hassles. I’m hoping that this entry will give people a better idea of what they’re getting into if they explore Dist::Zilla.

I also look forward to rjbs finally finishing the much-discussed configuration system overhaul so he can finally write some damn docs ;)

I released the first version of Text::TOC, so now we can revisit my earlier design in light of an actual implementation.

From a high level, what’s released is pretty similar to what I thought I would release. Here’s what I said the high level process looked like:

  • Process one or more documents for “interesting” nodes.
  • Assemble all the nodes into a table of contents.
  • Annotate the source documents with anchors as needed.
  • Produce a complete table of contents in the specified output format.

This is more or less exactly what the released code does. However, I was also wrong in some cases. I said that “adding anchors and generating a table of contents will also be roles”. In fact, this became one role, Text::TOC::Role::OutputHandler.

The output handler is responsible for iterating over the nodes that were deemed interesting. It adds anchors to the source document via the nodes themselves, which are assumed to somehow connect back to the source document. In the HTML case, I’m using HTML::DOM, so given any node in the document, I can alter the source document in place. At the same time, as it iterates over the nodes, the output handler generates a table of contents.
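A sketch of that loop might look like this. The class and method names here are hypothetical, not the ones that shipped:

```perl
package My::OutputHandler;    # hypothetical -- not the shipped class
use Moose;

sub process_nodes {
    my ( $self, @nodes ) = @_;

    my @toc_entries;
    for my $node (@nodes) {
        # Each node connects back to its place in the source DOM,
        # so the source document can be altered in place.
        $self->_insert_anchor_for($node);

        # At the same time, collect the node for the TOC.
        push @toc_entries, $self->_toc_entry_for($node);
    }

    return $self->_render_toc(@toc_entries);
}
```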

I still might go back and split these responsibilities up, but for now I wanted to get something released rather than futzing around to find the perfect architecture. Even if I do split them up, the OutputHandler abstraction is useful. In the future an OutputHandler could just delegate to an AnchorInserter and TOCBuilder.

I got some other parts right too. I said …

Different types of source documents will produce different types of nodes. For an HTML document, the node contents will probably be a DOM fragment representing the content of a given tag.

That is exactly how the released code works.

I also said that finding “interesting” nodes would be a role. It is, and in the HTML implementation there are sane defaults for single- and multi-document tables of contents.

I planned to have an API for managing the formatting of the TOC, but I punted on that for now. Your current choices are unordered or ordered lists. This is good enough for my needs, and therefore good enough for a first release.

Finally, the shortcut API I proposed was a bit off. I eventually realized that the key decision is whether we’re making a single- versus multi-document table of contents. That decision determines the sane default for a node filter and for a link generation strategy. In the multi-doc case you’ll always have to provide your own link generator, since I can’t know your URI space.

I also punted entirely on embedding the table of contents in the output document. You can do that yourself for now. The code is on CPAN or in my Mercurial repo, so feel free to take a closer look. I hope this will be of use to others as well. I don’t know if there will ever be interest in working with non-HTML documents, but even as it is I think it’s more useful than the other HTML TOC tools that previously existed on CPAN.

A while ago, I wrote an entry on the idea of breaking problems down as a strategy for building good tools.

Today, I started writing a new module, Text::TOC. The goal is to create a tool for generating a table of contents from one or more documents. I’m going to write up my initial design thoughts as a “how-to” on problem break down.

First, a little background. I’ve already looked at some relevant modules on CPAN. Both HTML::Toc and HTML::GenToc have awkward and/or insufficiently powerful APIs. Their internals are also nothing to write home about, so I ruled out patching them. At a certain point, I just can’t stomach wading through a bad design, even if that might get me to my goal quicker.

I started this project wanting to generate a table of contents for an HTML document, but I quickly realized that with a little extra work, I could make a table of contents tool that worked for different document formats. A table of contents is a pretty generic concept, so there’s no reason not to generalize it.

The ultimate product will also include a shortcut module to facilitate extremely common cases for HTML documents.

Producing a set of low-level components, and then tying them together in convenience modules makes for very good tools. With this approach, if I can build one convenience module, I can build five. Just as importantly, it will also be possible to handle more complicated cases. I believe in following the Perl spirit of making simple tasks simple, and complicated tasks possible. Too many CPAN modules solve one specific problem case at the expense of locking the code into a single-use API.

Roles Rock

I started by thinking about the process that goes into generating a table of contents:

  • Process one or more documents for “interesting” nodes.
  • Assemble all the nodes into a table of contents.
  • Annotate the source documents with anchors as needed.
  • Produce a complete table of contents in the specified output format.

This is all very generic. What kind of nodes? What makes a node interesting? What do anchors look like? What does the table of contents look like for a given format?

This project will make extensive use of roles in its API, and this list of steps gives me a good idea of what those roles will be. I’ll create a role for nodes. There will also be roles for input and output handling. Anything that does input processing will also do input filtering to find “interesting” nodes. This filtering is also a role.

Finally, adding anchors and generating a table of contents will also be roles.

You’ll notice that I haven’t talked about anchor names. For now, I’m going to hardcode an algorithm to generate these based on combining the anchor’s display text with a unique id. There’s no need to solve every problem up front. Patches will always be welcome.

What is a Table of Contents?

For this project, I’m going to represent the table of contents as a list of nodes. Each node will consist of a type (“h2”, “h3”, “image”), a link, and the node’s contents.
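In Moose terms, a node along these lines might be sketched as follows. The attribute names are my guess at the shape, not a finished API:

```perl
package Text::TOC::Node;    # a sketch; attribute names are illustrative
use Moose;

has type => (
    is  => 'ro',            # e.g. 'h2', 'h3', 'image'
    isa => 'Str',
);

has anchor_name => (
    is  => 'ro',            # the link target in the source document
    isa => 'Str',
);

has contents => (
    is => 'ro',             # for HTML, likely a DOM fragment
);
```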

Different types of source documents will produce different types of nodes. For an HTML document, the node contents will probably be a DOM fragment representing the content of a given tag.

This is a very minimal representation. I want to avoid encoding things like a “level” in the node list itself. Instead, I’ll defer decisions on how to handle this to the output generation stage. This will make it easier to produce different table styles. Of course, there will be a default which handles common node types (heading) in a sane way.

Input Handling

Concrete input handlers will take a document in a given format and find the interesting nodes in that document. As I mentioned earlier, finding “interesting” nodes will be a role. However, since this is something that people will often want to tweak, I want to make sure that providing a custom filter is as easy as possible.

Instead of requiring that people instantiate a concrete class which implements the filtering role, I will define a type coercion from a code reference to an object. Callers of the module can provide a simple code reference as a filter:
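As a sketch of what the caller would write (the class name and callback signature here are hypothetical):

```perl
# Hypothetical caller-facing API -- names are illustrative only.
my $toc = Text::TOC::HTML->new(
    filter => sub {
        my $node = shift;

        # "Interesting" means any second- or third-level heading.
        return $node->tagName() =~ /^h[23]$/i;
    },
);
```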

Internally, we’ll take the code reference and wrap it with an object which implements the filter role’s API.
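With Moose, that wrapping is a natural fit for a type coercion. A minimal sketch, with hypothetical type and class names:

```perl
# A sketch of the coercion; type and class names are illustrative.
use Moose::Util::TypeConstraints;

subtype 'TOCFilter',
    as 'Object',
    where { $_->can('node_is_interesting') };

coerce 'TOCFilter',
    from 'CodeRef',
    via { Text::TOC::Filter::FromCode->new( code => $_ ) };
```

Any attribute typed as TOCFilter can then accept either a filter object or a bare code reference.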

Output Handling

There are two distinct output tasks. First, we need to annotate an existing document with anchors, so that we have something to which we can link the table of contents.

Second, we need to produce the table of contents itself.

It’s tempting to create a single interface that does both, because these tasks both depend on the output format. However, there’s a lot of variation in the way a table of contents can be represented, so I think these will be two separate interfaces.

Another important part of the output interface is the formatting of links in the table of contents, and this will have its own API.

This makes things a little more complicated, but the shortcut modules can gloss over the details in most cases.

The Shortcut API

Now that I have a handle on the low-level components, I want to consider the shortcut API. The shortcut API needs to expose some implementation detail, but not all of it. Understanding what’s most important for users helps me in turn understand exactly how to break down the low-level pieces.

I’m going to assume that most users of this module will be inputting and outputting the same format, so we’ll have a single API setting for the format. I’ll simply encode this in the class name, since the choice of format decides many of the low-level classes.

The shortcut API should support generating a table of contents for either a single document or multiple documents. This affects the generation of links for the table. We also want to support embedding the table in the generated document, at least for the single document case.

Finally, we can offer a few different styles of output for the table of contents. Two obvious choices which come to mind are unordered versus ordered lists.

Given all that, our API might look something like this:
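Here is a rough sketch of that API. Every class, method, and option name here is a guess at this stage:

```perl
# A guess at the shortcut API -- all names are illustrative.
use Text::TOC::HTML;

# Format is encoded in the class name; style picks the list type.
my $toc = Text::TOC::HTML->new(
    style => 'unordered',    # or 'ordered'
);

$toc->add_file( file => 'manual.html' );

print $toc->html_for_toc();
```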

The “single-document” filter will find second- through fourth-level headings. My assumption is that a single document only has one <h1> tag, which is the document’s title. There’s no reason to put this in the table of contents.

If we were generating a table of contents for multiple documents, we would want to include the first-level heading, necessitating a different filter.

Since we’re only linking within a single document, we don’t need to do anything intelligent with the links, we can just use the anchor name directly.

For a multi-document table, I’ll need a code reference that does something smart based on the file name. I’m not sure it’s worthwhile trying to provide a shortcut for this part of the API, since there may not be any common patterns here. Every application has its own URI patterns.

Instead, I’ll probably just take a code reference:
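Something along these lines, where the class name, option name, and callback signature are all hypothetical:

```perl
# Hypothetical multi-document API -- names are guesses.
my $toc = Text::TOC::HTML->new(
    link_generator => sub {
        my ( $file, $anchor ) = @_;

        # Map a source file name into this application's URI space.
        ( my $uri = $file ) =~ s{^docs/}{/help/};

        return "$uri#$anchor";
    },
);
```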

This shortcut API isn’t set in stone, but it’s a good start for something useful, and it gives me some good clues about the low-level API.

Writing the Code

Writing this blog entry has been a good way to clarify how this tool should work. Stay tuned for a release of Text::TOC to a CPAN mirror near you.

We’ll see how much of the design survives the fires of implementation.