<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>House Absolute(ly Pointless)</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/" />
    <link rel="self" type="application/atom+xml" href="http://blog.urth.org/atom.xml" />
    <id>tag:blog.urth.org,2008-08-19://2</id>
    <updated>2010-08-07T23:37:20Z</updated>
    <subtitle>Unsubstantiated Opinions and Meaningless Blather</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.34-en</generator>

<entry>
    <title>Random Notes from YAPC::EU 2010</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/08/random-notes-from-yapceu-2010.html" />
    <id>tag:blog.urth.org,2010://2.144</id>

    <published>2010-08-07T23:23:40Z</published>
    <updated>2010-08-07T23:37:20Z</updated>

    <summary>This was a great conference, and the organizers did a great job. This is my first visit to the EU, and so far I&apos;ve had a great time. Over the last day or so, I&apos;ve had some interesting conversations with...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>This was a great conference, and the organizers did a great job. This is my first visit to the EU, and so far I've had a great time.</p>

<p>Over the last day or so, I've had some interesting conversations with people about how we can improve our conferences, and I wanted to write down some notes before I forget these ideas. Apologies in advance for rambling and incoherence. It's 1:30am here in Pisa and I'm beat.</p>

<ul>
<li><p>Shortening the auction - at YAPC::EU this year the auction was done as a competition between three teams, UK, US, and EU. Each team had 4 lots to sell, and we competed to see who could raise the most money. Smylers made the excellent point that a competition incentivizes each team to take a long time on each lot to maximize the price. One way around this might be to simply impose a very hard time limit on each lot (2-3 minutes?). Another might be a softer time limit (5 minutes), with the winner decided by dollars/euros raised per minute used.</p></li>
<li><p>The high-value/interest auction items should be announced well in advance, so people make sure to reserve money for items that interest them. There would still be room for a surprise item or two, of course.</p></li>
<li><p>Raising prices - I think YAPC is too cheap. We've been at the 100 dollar/euro price point for quite some time. Raising prices just 20% would raise an additional $4000-6000 at a YAPC::NA, which is more than the auction raises. We could still do an auction focused on a very few entertaining items (maybe 3-7 items), just for fun.</p></li>
<li><p>The YAPC::EU schedule started at 10am, which was fantastic. We need to stop trying to pack so much stuff into the conference.</p></li>
</ul>

<p>We discussed a number of ideas for improving the social aspect of the conference. We all agreed that the social aspect is at least as valuable as the technical aspect. Ultimately, I think that people who have a good social experience will feel like the conference was a good value for them.</p>

<p>I suggested seating plans for the sit-down dinner. However, upon further discussion, we seemed to reach the conclusion that a buffet-style dinner with <em>less</em> seating might encourage more mingling. I think YAPC::NA 2008 in Chicago was a great example of this. The dinner was in a big game room in the student center, so people ate, drank, bowled, played Wii, rocked out with Guitar Hero, and generally ended up mingling, rather than just sitting at a table.</p>

<p>Smylers suggested that another way to encourage people to approach more people would be a sort of "human scavenger hunt". Instead of silent auctioning off tons of books for very little money, we could offer them as prizes. The hunt would ask people to do things like ...</p>

<ul>
<li>Find a first-time YAPC attendee</li>
<li>Find two people from Europe (or UK, or US, as appropriate)</li>
<li>Find a person with 10+ modules on CPAN</li>
<li>Find an author of a Perl book (and a non-Perl book)</li>
<li>Find someone who has attended at least 5 YAPCs and workshops</li>
<li>Find someone who learned Perl within the last two years</li>
</ul>

<p>Goals like these would do a good job of encouraging newbies and experienced attendees to interact.</p>

<p>Another idea I had to encourage mingling would be some sort of "speed dating" event. This would have to be broken up into smaller groups, since you can't really have 150 people in one speed dating event. Maybe we could encourage groups of 30 or so to split off and do this. Maybe this could be scheduled as one of the sessions. We could even do this as a plenary session, and split people up based on something arbitrary (value of ACT user_id % 7).</p>
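
<p>As a rough illustration of that last idea, here is a minimal Perl sketch of splitting attendees by <code>user_id % 7</code>. The ids below are made up for the example. Real ones would have to come from ACT, and I'm not assuming anything about how ACT exposes them.</p>

<pre><code>use strict;
use warnings;

# Hypothetical attendee ids; real ones would come from ACT.
my @user_ids = ( 101 .. 250 );    # 150 attendees
my $groups   = 7;

# Bucket each attendee by the remainder of their id.
my %members;
for my $id (@user_ids) {
    push @{ $members{ $id % $groups } }, $id;
}

for my $group ( sort keys %members ) {
    printf "group %d: %d people\n", $group, scalar @{ $members{$group} };
}
</code></pre>

<p>Because these made-up ids are consecutive, each of the 7 groups ends up with 21 or 22 people. Real ids would be less evenly spread.</p>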

<p>If you had a YAPC idea you're afraid you'll forget, please leave a comment here!</p>
]]>
        

    </content>
</entry>

<entry>
    <title>New Moose Blog</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/07/new-moose-blog.html" />
    <id>tag:blog.urth.org,2010://2.143</id>

    <published>2010-07-25T20:23:43Z</published>
    <updated>2010-07-25T20:24:37Z</updated>

    <summary>The Moose Cabal now has our own blog. We plan to use this as a source of news about Moose development and usage, so add it to your feed reader if you&apos;re interested....</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>The Moose Cabal now has <a href="http://blog.moose.perl.org">our own blog</a>. We plan to use this as a source of news about Moose development and usage, so add it to your feed reader if you're interested.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Why Not Use Health as an Argument for Veganism?</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/07/why-not-use-health-as-an-argument-for-veganism.html" />
    <id>tag:blog.urth.org,2010://2.140</id>

    <published>2010-07-09T15:23:19Z</published>
    <updated>2010-07-09T15:45:06Z</updated>

    <summary>At Compassionate Action for Animals, we explicitly do not promote veganism using arguments about human health. We are happy to talk about how to be a healthy vegan, but we don&apos;t try to convince people to go vegan for their...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="AR-Veg" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>At <a href="http://www.exploreveg.org">Compassionate Action for Animals</a>, we explicitly <em>do not</em> promote veganism using arguments about human health. We are happy to talk about how to be a healthy vegan, but we don't try to convince people to go vegan for their own health.</p>

<p>Some people find this odd. Isn't veganism obviously the healthiest diet? Why wouldn't we use such a powerful argument? Shouldn't we make the best case we can for veganism?</p>

<p>I came across a <a href="http://rawfoodsos.com/2010/07/07/the-china-study-fact-or-fallac/">blog post titled "The China Study: Fact or Fallacy?"</a> that reminded me so well why we don't engage in this argument.</p>

<p>Go ahead, take a moment to read (or at least skim) that blog post.</p>

<p>Are you back? Great.</p>

<p><a href="http://www.thechinastudy.com/"><em>The China Study</em></a> was big news in the animal rights world when the book first came out. I haven't read it, but from what I've heard it basically says "go (mostly?) vegan". Wow, a whole book backed by lots of data telling people that veganism is the way to go! How exciting!</p>

<p>That blog post is a perfect illustration of why this <em>isn't</em> exciting. The blog post contains 9,000 words of statistical analysis, complete with tables, charts, and more. In the end, the author of the post concludes that <em>The China Study</em> is extremely flawed.</p>

<p>Is she right? Who the f*ck knows?</p>

<p>And that's the real problem. It is incredibly difficult for someone without expertise to assess claims about health. How do I know if the blog post author has any credibility? For that matter, how do I know if T. Colin Campbell (author of <em>The China Study</em>) has any credibility? I am not a biologist, epidemiologist, statistician, or dietitian. That blog post sure has a lot of numbers and charts, though! I bet <em>The China Study</em> has some too.</p>

<p>It's trivial to find health arguments for dozens of radically different diets (vegan, Atkins, paleo, raw, and more). If I, as an animal rights activist, start making claims about human health, why should anyone listen to me? There are lots of people with better credentials ready to disagree with me. I can cite sources, but so can others. Without a <em>lot</em> of independent research, it's very difficult for a layperson to figure out the truth, and that assumes there <em>is</em> one truth to figure out. Scientific research is full of contradictions, especially in a field as complex as diet and human health.</p>

<p>Health arguments are a distraction from the real key issue, animal suffering. Animal suffering in factory farms is undeniable and easily proved. It doesn't take a Ph.D. to understand that being crammed in a tiny cage unable to move is torture. Few people in the general public will argue the opposite. An argument based on animal suffering appeals to the fundamental empathy all of us possess, and doesn't require statistics or studies to support it.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Is Anyone Using Silki?</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/07/is-anyone-using-silki.html" />
    <id>tag:blog.urth.org,2010://2.139</id>

    <published>2010-07-03T17:20:36Z</published>
    <updated>2010-07-03T17:52:39Z</updated>

    <summary>I realized that the migrations I wrote were very buggy. Now I&apos;ve written a test system to help me test future migrations, but the existing releases are problematic. I can create a set of schema changes to fixup a schema...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>I realized that the migrations I wrote were very buggy. Now I've written a test system to help me test future migrations, but the existing releases are problematic.</p>

<p>I can create a set of schema changes to fix up a schema which has been migrated, but the changes will have to be applied manually.</p>

<p>Note that if you're comfortable wiping your existing schema because you're just playing with Silki then this is a non-issue.</p>

<p>Please <a href="mailto:autarch@urth.org">email me</a> if you are using Silki.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Do TPF Grants De-motivate?</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/07/do-tpf-grants-de-motivate.html" />
    <id>tag:blog.urth.org,2010://2.138</id>

    <published>2010-07-01T22:19:33Z</published>
    <updated>2010-07-01T22:58:45Z</updated>

    <summary>There&apos;s been a lot of discussion about the role of TPF lately, both at YAPC and on blogs. The most recent discussion is in the comments of a recent blog post by Gabor Szabo asking people to weigh in on...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>There's been a lot of discussion about the role of TPF lately, both at YAPC and on blogs. The most recent discussion is in the comments of a <a href="http://blogs.perl.org/users/gabor_szabo/2010/07/what-would-you-like-tpf-to-do.html">recent blog post by Gabor Szabo</a> asking people to weigh in on what TPF should be doing.</p>

<p>In the comments, Casey West says:</p>

<blockquote>
  <p>It's a striking sign that The Perl Foundation is expected to <strong>pay for open source contributors</strong></p>

<p>...</p>

<p>Right now TPF is using money to <strong>demotivate</strong> the Perl Community! It's killing the Perl [<em>sic</em>].</p>
</blockquote>

<p>This is a bold and, in my opinion, incorrect statement.</p>

<p>Casey is no doubt referring to the well-known research suggesting that payment reduces performance by replacing intrinsic motivation with extrinsic motivation. Let's assume that this research is true for the sake of this blog post.</p>

<p>Does it necessarily follow that TPF grants reduce motivation? I don't think so. There are a number of ways grants can help people get more work done. In fact, I think there are several ways that grants can boost <em>intrinsic</em> motivation.</p>

<h2>Public Promises</h2>

<p>When a grant is approved, the recipient is promising to do something with the community's money. I can't speak for others, but I know that when my grant was approved, I had made a promise to the Perl community to follow through.</p>

<p>My experience with volunteers suggests that people are more likely to follow through when they make a firm commitment to someone. My understanding is that this is also backed up by modern psychological research.</p>

<p>I think this is one reason why regular grant reports are crucial to the grant process. This follow-up makes it clear that the community is paying attention to the grant recipient.</p>

<p>The public nature of the grants should motivate the grant recipient. If the recipient <em>doesn't</em> find this motivational, I don't think they should be getting a grant in the first place!</p>

<h2>Validation of Competence</h2>

<p>Getting a grant can be an external validation of one's self-worth. I know that I felt good about the fact that my grant proposal got a lot of public support, and was eventually approved. Effectively, the Perl community agreed that <em>my</em> skills were worth $3,000 of <em>their</em> money.</p>

<p>I can't speak for others, but this sort of ego boost is definitely motivational for me.</p>

<h2>Resume Building</h2>

<p>A successfully completed grant is a nice bit of resume building. How many developers out there have been <em>paid by their peers</em> to work on a project? I make a point of mentioning the Moose docs grant in my bio, and I would hope that this helps sell my Moose class.</p>

<h2>Money = Time</h2>

<p>One big obstacle to getting stuff done is lack of time. This is one area where a grant can help, by effectively allowing a person to take unpaid leave from a job, or a sabbatical from self-employment. In practice, most TPF grants <em>don't</em> do this. The grants program limits grant requests to $3,000, which doesn't compensate for much time off, at least for people living in a large chunk of the world.</p>

<p><a href="http://news.perlfoundation.org/2010/02/grant-proposal-fixing-perl5-co.html">David Mitchell's grants</a> are a good example of a grant that aims to provide time. His current grant pays for 500 hours of his time at $50/hour. This is probably a lot less than he could earn freelancing, but is definitely enough to allow him to live comfortably while working on the grant.</p>

<p>It's hard for me to see how a grant like this could be de-motivating. In this case, the grant isn't about the money per se, it's about freeing up time that would otherwise <em>have to be spent on paying work</em>.</p>

<h2>Forcing Me to Plan</h2>

<p>While not directly connected to motivation, I found that the grant proposal process was very useful because it forced me to <em>think about my project</em>. <a href="http://news.perlfoundation.org/2008/11/2008q4_grant_proposal_moose_do.html">My grant proposal</a> was my project plan after the grant was approved, and it gave me a lot of direction for working on the Moose docs.</p>

<p>I imagine that other grant recipients also benefited from going through a planning process. I'm not sure I would have done as much thinking if I'd written the docs without having to write a proposal first.</p>

<h2>Summary</h2>

<p>In my <a href="/2009/04/moose-docs-grant-wrap-up.html">final grant report for the Moose docs grant</a>, I wrote:</p>

<blockquote>
  <p>I'd like to thank the Perl Foundation again for sponsoring this work. The grant was motivational for me, because this was a huge amount of work. I might have done some of it over time, but I doubt I would have done all or done it nearly as quickly without the grant.</p>
</blockquote>

<p>There are probably other ways that grants affect recipients. I'd love to hear from other grant recipients and/or submitters, either in the comments or on their own blogs.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Walking Through a Real dist.ini</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/06/walking-through-a-real-distini.html" />
    <id>tag:blog.urth.org,2010://2.137</id>

    <published>2010-06-02T19:26:52Z</published>
    <updated>2010-06-02T20:25:49Z</updated>

    <summary>In a comment on my entry about Dist::Zilla pros and cons, Phred says: I&apos;m not clear on the value Dist::Zilla provides other than some versioning auto-incrementing and syntactic sugar for testing. This brings up a good question. What the...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>In a comment on <a href="http://blog.urth.org/2010/05/distzilla-pros-and-cons.html">my entry about Dist::Zilla pros and cons</a>, Phred says:</p>

<blockquote>
  <p>I'm not clear on the value Dist::Zilla provides other than some versioning
auto-incrementing and syntactic sugar for testing.</p>
</blockquote>

<p>This brings up a good question. What the heck does dzil do?</p>

<p>Let's walk through a <code>dist.ini</code> file from a real project. I'll use the
<code>dist.ini</code> from my <a href="http://search.cpan.org/dist/Markdent">Markdent
distribution</a>. This should answer the
"what does it do" question quite well.</p>

<p>Here's the whole file:</p>

<pre><code>name    = Markdent
author  = Dave Rolsky &lt;autarch@urth.org&gt;
license = Perl_5
copyright_holder = Dave Rolsky
copyright_year   = 2010

version = 0.13

[@Basic]
[InstallGuide]
[MetaJSON]

[MetaResources]
bugtracker.web    = http://rt.cpan.org/NoAuth/Bugs.html?Dist=Markdent
bugtracker.mailto = bug-markdent@rt.cpan.org
repository.url    = http://hg.urth.org/hg/Markdent
repository.web    = http://hg.urth.org/hg/Markdent
repository.type   = hg

[PodWeaver]

[KwaliteeTests]
[NoTabsTests]
[EOLTests]
[Signature]

[CheckChangeLog]

[Prereq]
Digest::SHA1                   = 0
HTML::Stream                   = 0
List::AllUtils                 = 0
Moose                          = 0.92
MooseX::Params::Validate       = 0.12
MooseX::Role::Parameterized    = 0
MooseX::SemiAffordanceAccessor = 0.05
MooseX::StrictConstructor      = 0.08
MooseX::Types                  = 0.20
namespace::autoclean           = 0.09
Tree::Simple                   = 0
Try::Tiny                      = 0

[Prereq / TestRequires]
File::Slurp                          = 0
Test::Deep                           = 0
Test::Differences                    = 0
Test::Exception                      = 0
Test::More                           = 0.88
Tree::Simple::Visitor::ToNestedArray = 0

[@Mercurial]
</code></pre>

<p>That's a mouthful. Let's step through it in tiny chunks ...</p>

<pre><code>name    = Markdent
author  = Dave Rolsky &lt;autarch@urth.org&gt;
</code></pre>

<p>Setting these does several things. First, these values will end up in the
generated <code>Makefile.PL</code> for the distro. Second, these values are available for
plugins which do POD munging, which we'll look at shortly. In particular, the
author will end up in every module's POD.</p>

<pre><code>license = Perl_5
</code></pre>

<p>The license setting is used for several things. First, the <code>License</code> plugin
will use it to add a <code>LICENSE</code> file to the distro. Second, it is also
available to POD mungers.</p>

<pre><code>copyright_holder = Dave Rolsky
copyright_year   = 2010
</code></pre>

<p>This is another bit for the POD mungers. Together with the license, we'll end
up with this POD section in each module:</p>

<blockquote>
  <p>=head1 COPYRIGHT AND LICENSE</p>

<p>This software is copyright (c) 2010 by Dave Rolsky.</p>

<p>This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.</p>
</blockquote>

<pre><code>version = 0.13
</code></pre>

<p>Again, this ends up in both my <code>Makefile.PL</code> and my POD.</p>

<pre><code>[@Basic]
</code></pre>

<p>This is a plugin <em>bundle</em>, which is a name for a pre-defined set of
plugins. The <code>Basic</code> bundle contains:</p>

<pre><code>[GatherDir]
[PruneCruft]
[ManifestSkip]
[MetaYAML]
[License]
[Readme]
[ExtraTests]
[ExecDir]
[ShareDir]
[MakeMaker]
[Manifest]
[TestRelease]
[ConfirmRelease]
[UploadToCPAN]
</code></pre>

<p>Whoa, that's a lot. So what do these do?</p>

<pre><code>[GatherDir]
</code></pre>

<p>This tells dzil that it should include all the files in the current directory
(the root of my distro) in the generated distro. I have to include this, or I
won't end up with a distro at all!</p>

<pre><code>[PruneCruft]
</code></pre>

<p>This prunes the gathered files to remove generated files like a <code>Build</code> file,
files that start with a dot (.), etc.</p>

<pre><code>[ManifestSkip]
</code></pre>

<p>This prunes the gathered files based on the contents of a <code>MANIFEST.SKIP</code>
file.</p>

<pre><code>[MetaYAML]
</code></pre>

<p>This generates a <code>META.yml</code> file for the distro (using version 1.4 of the CPAN
Meta format).</p>

<pre><code>[License]
</code></pre>

<p>This plugin generates a <code>LICENSE</code> file, based on the value I set for the
license earlier.</p>

<pre><code>[Readme]
</code></pre>

<p>This one generates a fairly minimal <code>README</code>. Arguably, it's so minimal it's
useless. It could probably be improved ;)</p>

<pre><code>[ExtraTests]
</code></pre>

<p>This looks for tests under my working copy's <code>xt</code> directory. This directory
can contain subdirectories for three different types of "extra" tests: smoke
tests, author tests, and release tests. Each of these directories has its
tests rewritten so that they only run under specific circumstances (based on
environment variables). The tests are rewritten into the <code>t</code> directory.</p>

<p>Typically, I only use <code>xt/release</code>. The tests in the release directory are run
when <code>$ENV{RELEASE_TESTING}</code> is true. The <code>dzil release</code> command makes sure
this is true, so my release tests are run before I do a release, but <em>not</em>
when the module is installed from CPAN. This is perfect for things like POD
tests.</p>
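
<p>To make that concrete, here is a sketch of the kind of guard a release-only test ends up with. This is my approximation for illustration, not necessarily the exact code the plugin writes, and it assumes <code>Test::Pod</code> is installed on the releasing machine.</p>

<pre><code>use strict;
use warnings;
use Test::More;

# Skip the whole file unless this is a release run.
plan skip_all =&gt; 'release tests only run when RELEASE_TESTING is set'
    unless $ENV{RELEASE_TESTING};

# Loaded at runtime so a missing Test::Pod can't break a normal install.
require Test::Pod;

# Check every POD file in the distro for syntax errors.
Test::Pod::all_pod_files_ok();
</code></pre>

<p>Someone installing from CPAN never sets the variable, so all they see is a skipped test.</p>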

<pre><code>[ExecDir]
</code></pre>

<p>This plugin arranges for a directory's contents to be installed as
executables. Well, actually, it just marks the files as executables, and
another plugin does something useful with them. By default, it looks for a
directory named <code>bin</code>.</p>

<pre><code>[ShareDir]
</code></pre>

<p>Just like <code>ExecDir</code> but for "share" files (non-executable content like
templates, images, etc).</p>

<pre><code>[MakeMaker]
</code></pre>

<p>This generates <code>Makefile.PL</code> for the distro. This plugin is pretty smart, and
generates a file with lots of conditionals so that it does the best job it can
for the version of <code>ExtUtils::MakeMaker</code> that is available on the installing
user's machine. If you've ever written this sort of conditional crap you know
how annoying it is to maintain. Now I don't have to deal with this. As a
bonus, future versions of dzil will account for new versions of EUMM, and I'll
get a better <code>Makefile.PL</code> for free.</p>

<p>This plugin makes use of the information provided by the <code>ExecDir</code> and
<code>ShareDir</code> plugins we saw earlier. It arranges to have these files installed
in the right place via <code>ExtUtils::MakeMaker</code> and <code>File::ShareDir</code>.</p>

<p>There is also a <code>ModuleBuild</code> plugin, but dzil really makes the difference
between the two minimal. Unless I want to integrate a custom <code>Module::Build</code>
subclass, as I did with <a href="http://search.cpan.org/dist/Silki">Silki</a>, there
isn't much difference between EUMM and MB for a project which uses dzil.</p>

<pre><code>[Manifest]
</code></pre>

<p>This plugin creates the <code>MANIFEST</code>.</p>

<pre><code>[TestRelease]
</code></pre>

<p>This runs the tests when I run <code>dzil release</code>.</p>

<pre><code>[ConfirmRelease]
</code></pre>

<p>This asks me to confirm that I really want to upload a distro when I run
<code>dzil release</code>.</p>

<pre><code>[UploadToCPAN]
</code></pre>

<p>I bet you can figure out what this does.</p>

<pre><code>[InstallGuide]
</code></pre>

<p>This generates a nice <code>INSTALL</code> file. This plugin is smart. It generates the
right instructions regardless of whether the distro is using EUMM or
<code>Module::Build</code>.</p>

<pre><code>[MetaJSON]
</code></pre>

<p>This generates a <code>META.json</code> file for the distro (using version 2.0 of the
CPAN Meta format).</p>

<pre><code>[MetaResources]
bugtracker.web    = http://rt.cpan.org/NoAuth/Bugs.html?Dist=Markdent
bugtracker.mailto = bug-markdent@rt.cpan.org
repository.url    = http://hg.urth.org/hg/Markdent
repository.web    = http://hg.urth.org/hg/Markdent
repository.type   = hg
</code></pre>

<p>This adds a "resources" section to my <code>META.*</code> files. There are some plugins
on CPAN which will automate this. For the repository settings, the plugin
looks at your working copy to figure out your VCS and remote VCS uris. I might
switch over to these plugins in the future, although I think I'd actually have
to add Mercurial support first.</p>

<pre><code>[PodWeaver]
</code></pre>

<p>I mentioned "POD mungers" several times. <code>Pod::Weaver</code> is a POD rewriting
module which does all sorts of fancy stuff, though I'm just using a
subset of its default behavior.</p>

<p>First, it looks in my module files for a comment in the form:</p>

<pre><code># ABSTRACT: Some text here
</code></pre>

<p>It uses this to generate the "NAME" section in the POD.</p>
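
<p>As an illustration, a module might start like this. <code>My::Example</code> and its abstract are made-up names for the sketch, not anything from Markdent.</p>

<pre><code>package My::Example;
# ABSTRACT: A made-up module for illustrating the comment

use strict;
use warnings;

1;
</code></pre>

<p>From that, I'd expect the generated "NAME" section to read something like "My::Example - A made-up module for illustrating the comment".</p>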

<p>It also inserts "VERSION", "AUTHOR", and "COPYRIGHT AND LICENSE"
sections. <code>Pod::Weaver</code> also lets you do even fancier stuff, like use POD
dialects, add custom sections, etc. I'll be investigating this further in the
future. Really, this module deserves its own blog entry or three.</p>

<pre><code>[KwaliteeTests]
[NoTabsTests]
[EOLTests]
</code></pre>

<p>These add some release tests for various sanity checks. I never need to
customize these tests, so I can let the plugins write them out for me.</p>

<pre><code>[Signature]
</code></pre>

<p>This signs the distro using <code>Module::Signature</code>.</p>

<pre><code>[CheckChangeLog]
</code></pre>

<p>This checks my <code>Changes</code> file to ensure that I have an entry for the version
mentioned in my <code>dist.ini</code>. It could be smarter and check for a <em>date</em> as
well. I'm sure patches are welcome ;)</p>

<pre><code>[Prereq]
...
</code></pre>

<p>This should be obvious. It lists the prerequisites for my distro. There is
also an <code>AutoPrereq</code> plugin. I don't use this because it generates a lot of
prereqs I think are cruft, like core modules, or multiple modules in the same
distro.</p>

<pre><code>[Prereq / TestRequires]
...
</code></pre>

<p>Again, this is pretty obvious.</p>

<pre><code>[@Mercurial]
</code></pre>

<p>Another plugin bundle. I <a href="http://search.cpan.org/dist/Dist-Zilla-Plugin-Mercurial">wrote some
plugins</a> to automate
some release tasks for a Mercurial-using project.</p>

<p>When I run <code>dzil release</code>, it will check to make sure that my repository is in
a clean state (no changes that haven't yet been checked in). After the release
is uploaded, it tags my working copy and then pushes the changes back to the
remote.</p>

<h3>Summary</h3>

<p>At a high level, dzil does a couple of different tasks.</p>

<p>It ensures that support files like the MANIFEST and LICENSE stay up to
date. It also helps improve compatibility by generating a "smart"
<code>Makefile.PL</code>. Basically, it takes distribution metadata and generates all the
support files I need. Of course, both EUMM and <code>Module::Build</code> already
did that, but dzil takes this several steps further.</p>

<p>The pod munging is similar. It includes standard POD boilerplate that <em>should</em>
be in all my modules, but can be annoying to maintain.</p>

<p>It also helps me include various "sanity tests". Since the plugin writes them
out anew each time I build the distro, I don't have to worry about keeping
them up to date with changes to the testing modules, I just have to update the
plugin.</p>

<p>Besides automating support, dzil also helps automate the actual release
process. It adds some sanity checks like checking the changelog and the
working copy state, and after the release it automates tagging and pushing.</p>

<p>Whereas I previously had to maintain various support files and update them as
the toolchain changed, I can now update my plugins and get the updated support
files "for free" in every distro I maintain. Overall, the number of steps that
go into a release has been hugely reduced, and the possibility of error is
much lower. That means it's easier to make a new release, and the release
quality is higher. Faster <em>and</em> better!</p>

<p>At last count, I maintain (for some value of maintain) 66 distros, so anything
I can do to reduce busy work is very welcome!</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Dist::Zilla Pros and Cons</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/05/distzilla-pros-and-cons.html" />
    <id>tag:blog.urth.org,2010://2.136</id>

    <published>2010-05-26T01:49:14Z</published>
    <updated>2010-05-26T02:19:46Z</updated>

    <summary>I&apos;ve been playing with Dist::Zilla lately, and while I like it, I&apos;ve also realized there&apos;s some perhaps not-so-obvious cons to using it as well. There&apos;s also some obvious cons, and some obvious pros. In talking about cons, there are really...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="perl" label="Perl" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>I've been playing with <a href="http://dzil.org">Dist::Zilla</a> lately, and while I like
it, I've also realized there are some perhaps not-so-obvious cons to using it as
well. There are also some obvious cons, and some obvious pros.</p>

<p>In talking about cons, there are really two categories. Some of the cons are
essential to the design of dzil. Others are non-essential, and can easily be fixed in the
future, given a sufficient supply of round tuits. Obviously, the essential cons are the most important.</p>

<p>Let's get the non-essential ones out of the way. The obvious one is that the
docs are pretty minimal right now. I found that to really get what I wanted, I
had to mix together cargo-culting and source diving. I still don't understand
how the heck I can make use of
<a href="http://search.cpan.org/dist/Pod-Weaver">Pod::Weaver</a>.</p>

<p>A closely related problem is that while there are lots of dzil plugins, they
too are mostly poorly documented, and they're also insufficiently flexible. A good
example is the
<a href="http://search.cpan.org/dist/Dist-Zilla-Plugin-PodSpellingTests">Dist::Zilla::Plugin::PodSpellingTests</a>
plugin. Spell checking your pod is great, and I'd love to automate it as much
as I can. However, if you're doing spell checking you <em>must</em> include a custom
dictionary that includes things like your name.</p>

<p>This plugin adds a wordlist that the author created in the form of a CPAN
module. That's not very useful when the wordlist module doesn't include a word you
want to whitelist. There's no way to provide an alternate module. Of course,
the real problem is that this is a terrible interface. I don't want to release
a new distro every time I add a word to my wordlist. The right way to do this
is to look for a .pod-spelling file in the distro root.</p>

<p>Ultimately, I skipped this plugin and created a POD spelling test "by hand".</p>

<p>Let's not pick on Marcel too much. My own <a href="http://search.cpan.org/dist/Dist-Zilla-Plugin-Mercurial">dzil Mercurial plugin</a> is pretty minimal too. It works for me, but may not satisfy anyone else.</p>

<p>Also, dzil is slow. It uses Moose for a CLI app, which is a known-slow
combination. Someone should improve Moose startup speed ;)</p>

<p>But as I said, these are non-essential problems, and all entirely fixable.</p>

<p>So what can't be fixed?</p>

<p>Ultimately, using dzil to its utmost means creating a sharp divide between the source
repository and released code. Dzil is in part a big ol' pre-processor. It
does things like add a <code>$VERSION</code> to each module, add boilerplate to the POD,
generate a LICENSE file, etc.</p>

<p>Of course, Perl module authors are already accustomed to this. I'm sure that
most authors don't check their META.yml files into source control and edit
them by hand. Instead, they're updated as part of the release process. Dzil
just takes this several steps further.</p>

<p>However, some of these steps can be particularly problematic. If you allow
dzil to add the <code>$VERSION</code> line, that means that when you use the distro's modules directly from
the <code>lib</code> directory, they have no version. This can be a problem if you're
trying to test some other module against the source repo, and that other
module has a minimum version requirement.</p>
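
<p>As a minimal sketch of that failure mode (<code>My::Unreleased</code> is a made-up package standing in for a module loaded straight from <code>lib</code>, before dzil has added its <code>$VERSION</code> line), this is roughly what a minimum version check runs into:</p>

<pre><code>use strict;
use warnings;

# Hypothetical module as it exists in the source repo: no $VERSION line.
package My::Unreleased;

sub hello { return 'hi' }

package main;

# This is the same check that "use My::Unreleased 0.05" performs.
my $ok    = eval { My::Unreleased-&gt;VERSION('0.05'); 1 };
my $error = $@;

print $ok
    ? "version check passed\n"
    : "version check failed: $error";
</code></pre>

<p>Since <code>use My::Unreleased 0.05</code> boils down to that same <code>VERSION</code> method call, a module that declares a minimum version of your distro will fail to load when it's pointed at the raw source tree.</p>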

<p>Similarly, when you run tests with <code>prove</code>, you're testing something that
isn't quite what gets released. Don't worry too much; when you <code>dzil release</code>, it runs
the tests against the post-processed code, so you're not likely to incur bugs
this way.</p>

<p>You <em>can</em> choose to not use the <code>$VERSION</code>-inserting plugin, and maintain the
<code>$VERSION</code> manually, and dzil still has lots of other useful features. Nonetheless, this sort of issue is likely to crop up with other plugins.</p>

<p>So what are the pros? Ultimately, it makes maintaining modules
easier. The less non-essential work I have to do in order to make a new
release, the better. Also, some of the plugins do things to ensure that my
releases are not broken, like checking for an update to Changes that matches
the current module version, or ensuring that I have pod syntax tests as part of the release.</p>

<p>For someone like me, who has dozens of modules on CPAN, these time savings really add up.</p>

<p>Overall, I'm pretty happy with dzil, and I consider the eliminated drudgery a
win, despite the hassles. I'm hoping that this entry will give people a better idea of what they're
getting into if they explore Dist::Zilla.</p>

<p>I also look forward to rjbs finally finishing the much-discussed configuration
system overhaul so he can finally write some damn docs ;)</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Text::TOC - Reality Versus Theory</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/05/texttoc-reality-versus-theory.html" />
    <id>tag:blog.urth.org,2010://2.135</id>

    <published>2010-05-24T21:07:01Z</published>
    <updated>2010-05-24T21:25:55Z</updated>

    <summary>I released the first version of Text::TOC, so now we can revisit my earlier design in light of an actual implementation. From a high level, what&apos;s released is pretty similar to what I thought I would release. Here&apos;s what I...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="perl" label="Perl" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>I released the <a href="http://search.cpan.org/dist/Text-TOC">first version of
Text::TOC</a>, so now we can revisit <a href="http://blog.urth.org/2010/05/tool-design-a-table-of-contents-tool.html">my
earlier
design</a>
in light of an actual implementation.</p>

<p>From a high level, what's released is pretty similar to what I <em>thought</em> I
would release. Here's what I said the high level process looked like:</p>

<blockquote>
  <ul>
<li>Process one or more documents for "interesting" nodes.</li>
<li>Assemble all the nodes into a table of contents.</li>
<li>Annotate the source documents with anchors as needed.</li>
<li>Produce a complete table of contents in the specified output format.</li>
</ul>
</blockquote>

<p>This is more or less exactly what the released code does.</p>

<p>However, I was also wrong in some cases. I said that "adding anchors and
generating a table of contents will also be roles". In fact, this became one
role, <code>Text::TOC::Role::OutputHandler</code>.</p>

<p>The output handler is responsible for iterating over the nodes that were
deemed interesting. It adds anchors to the source document <em>via the nodes
themselves</em>, which are assumed to somehow connect back to the source
document. In the HTML case, I'm using HTML::DOM, so given any node in the
document, I can alter the source document in place.</p>

<p>At the same time, as it iterates over the nodes, the output handler generates a
table of contents.</p>

<p>I still might go back and split these responsibilities up, but for now I
wanted to get something released rather than futzing around to find the
perfect architecture. Even if I do split them up, the OutputHandler
abstraction is useful. In the future an OutputHandler could just delegate to
an AnchorInserter and TOCBuilder.</p>

<p>I got some other parts right too. I said ...</p>

<blockquote>
  <p>Different types of source documents will produce different types of nodes. For
an HTML document, the node contents will probably be a DOM fragment
representing the content of a given tag.</p>
</blockquote>

<p>That is exactly how the released code works.</p>

<p>I also said that finding "interesting" nodes would be a role. It is, and in
the HTML implementation there are sane defaults for single- and multi-document
tables of contents.</p>

<p>I planned to have an API for managing the formatting of the TOC, but I punted on that for now. Your current choices are unordered or ordered
lists. This is good enough for my needs, and therefore good enough for
a first release.</p>

<p>Finally, the shortcut API I proposed was a bit off. I eventually realized that
the key decision is whether we're making a single- versus multi-document table
of contents. That decision determines both the sane default for a node
filter and a sane link generation strategy. In the multi-doc
case you'll always have to provide your own link generator, since I can't know your URI space.</p>

<p>I also punted entirely on embedding the table of contents in the output
document. You can do that yourself for now.</p>

<p>The code is on CPAN or in <a href="http://hg.urth.org/hg/Text-TOC">my mercurial repo</a>,
so feel free to take a closer look. I hope this will be of use to others as
well. I don't know if there will ever be interest in working with non-HTML
documents, but even as it is I think it's more useful than the other HTML TOC
tools that previously existed on CPAN.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Tool Design - a Table of Contents Tool</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/05/tool-design-a-table-of-contents-tool.html" />
    <id>tag:blog.urth.org,2010://2.134</id>

    <published>2010-05-19T17:21:57Z</published>
    <updated>2010-05-19T21:27:52Z</updated>

    <summary>A while ago, I wrote an entry on the idea of breaking problems down as a strategy for building good tools. Today, I started writing a new module, Text::TOC. The goal is to create a tool for generating a table...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="perl" label="Perl" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>A while ago, I wrote an entry on the idea of <a href="/2009/11/want-good-tools-break-your-problems-down.html">breaking problems
down</a> as a strategy
for building good tools.</p>

<p>Today, I started writing a new module, <code>Text::TOC</code>. The goal is to create a
tool for generating a table of contents from one or more documents. I'm going
to write up my initial design thoughts as a "how-to" on problem break down.</p>

<p>First, a little background. I've already looked at some relevant modules on
CPAN. Both <a href="http://search.cpan.org/dist/HTML-Toc">HTML::Toc</a> and
<a href="http://search.cpan.org/dist/HTML-GenToc">HTML::GenToc</a> have awkward and/or
insufficiently powerful APIs. Their internals are also nothing to write home
about, so I ruled out patching them. At a certain point, I just can't stomach
wading through a bad design, even if that might get me to my goal quicker.</p>

<p>I started this project wanting to generate a table of contents for an HTML
document, but I quickly realized that with a little extra work, I could make a
table of contents tool that worked for different document formats. A table of
contents is a pretty generic concept, so there's no reason not to generalize
it.</p>

<p>The ultimate product will also include a shortcut module to facilitate
extremely common cases for HTML documents.</p>

<p>Producing a set of low-level components, and then tying them together in
convenience modules makes for very good tools. With this approach, if I can
build one convenience module, I can build five. Just as importantly, it will
also be possible to handle more complicated cases. I believe in following the
Perl spirit of making simple tasks simple, and complicated tasks possible. Too
many CPAN modules solve one specific problem case at the expense of locking
the code into a single-use API.</p>

<h3>Roles Rock</h3>

<p>I started by thinking about the process that goes into generating a table of
contents:</p>

<ul>
<li>Process one or more documents for "interesting" nodes.</li>
<li>Assemble all the nodes into a table of contents.</li>
<li>Annotate the source documents with anchors as needed.</li>
<li>Produce a complete table of contents in the specified output format.</li>
</ul>

<p>This is all very generic. What kind of nodes? What makes a node interesting?
What do anchors look like? What does the table of contents look like for a
given format?</p>

<p>This project will make extensive use of roles in its API, and this list of
steps gives me a good idea of what those roles will be. I'll create a role for
nodes. There will also be roles for input and output handling. Anything that does
input processing will also do input filtering to find "interesting"
nodes. This filtering is also a role.</p>

<p>Finally, adding anchors and generating a table of contents will also be roles.</p>

<p>You'll notice that I haven't talked about anchor names. For now, I'm going to
hardcode an algorithm to generate these based on combining the anchor's
display text with a unique id. There's no need to solve every problem up
front. Patches will always be welcome.</p>

<h3>What is a Table of Contents?</h3>

<p>For this project, I'm going to represent the table of contents as a list of
nodes. Each node will consist of a type ("h2", "h3", "image"), a link, and the
node's contents.</p>

<p>Different types of source documents will produce different types of nodes. For
an HTML document, the node contents will probably be a DOM fragment
representing the content of a given tag.</p>

<p>This is a very minimal representation. I want to avoid encoding things like a
"level" in the node list itself. Instead, I'll defer decisions on how to
handle this to the output generation stage. This will make it easier to
produce different table styles. Of course, there will be a default which
handles common node types (heading) in a sane way.</p>

<h3>Input Handling</h3>

<p>Concrete input handlers will take a document in a given format and find the
interesting nodes in that document. As I mentioned earlier, finding
"interesting" nodes will be a role. However, since this is something that
people will often want to tweak, I want to make sure that providing a custom
filter is as easy as possible.</p>

<p>Instead of requiring that people instantiate a concrete class which implements
the filtering role, I will define a type coercion from a code reference to an
object. Callers of the module can provide a simple code reference as a filter:</p>

<pre><code>sub {
    my $node = shift;

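    # Anything explicitly marked with the skip-toc class is left out.
    return 0 if $node-&gt;className() =~ /\bskip-toc\b/;

    # Second- through fifth-level headings and images are interesting.
    return 1 if $node-&gt;tagName() =~ /^H[2-5]$/ || $node-&gt;tagName() eq 'IMG';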

    return 0;
}
</code></pre>

<p>Internally, we'll take the code reference and wrap it with an object which
implements the filter role's API.</p>
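
<p>Here's a rough sketch of how that coercion could look with Moose. All of the package, attribute, and method names below are invented for illustration and aren't the eventual Text::TOC API, and the "nodes" are plain strings just to keep the example short:</p>

<pre><code>use strict;
use warnings;

# Hypothetical role that every filter object must implement.
package My::TOC::Role::Filter;
use Moose::Role;

requires 'node_is_interesting';

# Hypothetical wrapper class that turns a code reference into a filter.
package My::TOC::Filter::FromCode;
use Moose;

with 'My::TOC::Role::Filter';

has code =&gt; ( is =&gt; 'ro', isa =&gt; 'CodeRef', required =&gt; 1 );

sub node_is_interesting {
    my ( $self, $node ) = @_;
    return $self-&gt;code()-&gt;($node) ? 1 : 0;
}

# Hypothetical input handler that accepts an object or a code reference.
package My::TOC::InputHandler;
use Moose;
use Moose::Util::TypeConstraints;

role_type 'My::TOC::Role::Filter';

coerce 'My::TOC::Role::Filter',
    from 'CodeRef',
    via { My::TOC::Filter::FromCode-&gt;new( code =&gt; $_ ) };

has filter =&gt; (
    is       =&gt; 'ro',
    isa      =&gt; 'My::TOC::Role::Filter',
    coerce   =&gt; 1,
    required =&gt; 1,
);

package main;

# The caller passes a plain code reference; Moose wraps it for us.
my $handler = My::TOC::InputHandler-&gt;new(
    filter =&gt; sub { $_[0] =~ /^H[2-4]$/ },
);

print $handler-&gt;filter()-&gt;node_is_interesting('H2')
    ? "interesting\n"
    : "skipped\n";
</code></pre>

<p>The nice part of this approach is that the rest of the code only ever sees an object that does the filter role, whether the caller passed a full-blown class or a three-line anonymous sub.</p>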

<h3>Output Handling</h3>

<p>There are two distinct output tasks. First, we need to annotate an existing
document with anchors, so that we have something to which we can link the
table of contents.</p>

<p>Second, we need to produce the table of contents itself.</p>

<p>It's tempting to create a single interface that does both, because these tasks
both depend on the output format. However, there's a lot of variation in the
way a table of contents can be represented, so I think these will be two
separate interfaces.</p>

<p>Another important part of the output interface is the formatting of links in
the table of contents, and this will have its own API.</p>

<p>This makes things a little more complicated, but the shortcut modules can
gloss over the details in most cases.</p>

<h3>The Shortcut API</h3>

<p>Now that I have a handle on the low-level components, I want to consider the
shortcut API. The shortcut API needs to expose <em>some</em> implementation detail,
but not all of it. Understanding what's most important for users helps me in
turn understand exactly how to break down the low-level pieces.</p>

<p>I'm going to assume that <em>most</em> users of this module will be inputting and
outputting the same format, so we'll have a single API setting for the
format. I'll simply encode this in the class name, since the choice of format
decides many of the low-level classes.</p>

<p>The shortcut API should support generating a table of contents for either a
single document or multiple documents. This affects the generation of links
for the table. We also want to support embedding the table in the generated
document, at least for the single document case.</p>

<p>Finally, we can offer a few different styles of output for the table of
contents. Two obvious choices which come to mind are unordered versus ordered
lists.</p>

<p>Given all that, our API might look something like this:</p>

<pre><code>my $generator = Text::TOC::HTML-&gt;new(
    filter         =&gt; 'single-document',
    link_generator =&gt; undef,
    style          =&gt; 'unordered-list',
);

$generator-&gt;add_file($path_to_html);

$generator-&gt;embed_table_of_contents();

for my $file ( $generator-&gt;files() ) {
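    # Overwrite each source file with its annotated version.
    open my $fh, '&gt;', $file or die "Cannot write to $file: $!";
    print {$fh} $generator-&gt;document_for_file($file);
    close $fh or die "Cannot close $file: $!";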
}
</code></pre>

<p>The "single-document" filter will find second- through fourth-level
headings. My assumption is that a single document only has one &lt;h1&gt;
tag, which is the document's title. There's no reason to put this in the table
of contents.</p>

<p>If we were generating a table of contents for multiple documents, we <em>would</em>
want to include the first-level heading, necessitating a different filter.</p>

<p>Since we're only linking within a single document, we don't need to do
anything intelligent with the links; we can just use the anchor name directly.</p>

<p>For a multi-document table, I'll need a code reference that does something
smart based on the file name. I'm not sure it's worthwhile trying to provide a
shortcut for this part of the API, since there may not be any common patterns
here. Every application has its own URI patterns.</p>

<p>Instead, I'll probably just take a code reference:</p>

<pre><code>my $link_gen = sub {
    my $file = shift;
    my $anchor = shift;

    # The fragment goes directly after the file path, with no extra slash.
    return 'file://' . $file-&gt;absolute() . '#' . $anchor-&gt;name();
};

my $generator = Text::TOC::HTML-&gt;new(
    filter         =&gt; 'multi-document',
    link_generator =&gt; $link_gen,
    style          =&gt; 'unordered-list',
);

$generator-&gt;add_file($_) for @files;

for my $file ( $generator-&gt;files() ) {
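    # Overwrite each source file with its annotated version.
    open my $fh, '&gt;', $file or die "Cannot write to $file: $!";
    print {$fh} $generator-&gt;document_for_file($file);
    close $fh or die "Cannot close $file: $!";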
}
</code></pre>

<p>This shortcut API isn't set in stone, but it's a good start for something
useful, and it gives me some good clues about the low-level API.</p>

<h3>Writing the Code</h3>

<p>Writing this blog entry has been a good way to clarify how this tool should
work. Stay tuned for a release of <a href="http://hg.urth.org/hg/Text-TOC">Text::TOC</a>
to a CPAN mirror near you.</p>

<p>We'll see how much of the design survives the fires of implementation.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>DateTime::Locale Update</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/03/datetimelocale-update.html" />
    <id>tag:blog.urth.org,2010://2.133</id>

    <published>2010-03-29T00:41:44Z</published>
    <updated>2010-03-29T00:44:54Z</updated>

    <summary>In my last entry, I proposed doing away with DateTime::Locale entirely. I&apos;ve since realized that I will want to keep it around as a place to integrate both CLDR and glibc locale data in one unified interface. I&apos;m still going...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="perl" label="Perl" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>In my <a href="/2010/03/do-you-use-datetimelocale-directly.html">last entry</a>, I proposed doing away with DateTime::Locale entirely.</p>

<p>I've since realized that I will want to keep it around as a place to integrate both CLDR and glibc locale data in one unified interface. I'm still going to work on my new Locale::CLDR module, but the DateTime::Locale API will probably stick around more or less as-is.</p>

<p>The one thing I will want to get rid of is the custom locale registration system. However, custom locales would still be usable. They would be loadable by id, or you could pass an already-instantiated custom locale object to a DateTime object.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Do You Use DateTime::Locale Directly?</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/03/do-you-use-datetimelocale-directly.html" />
    <id>tag:blog.urth.org,2010://2.132</id>

    <published>2010-03-18T19:01:40Z</published>
    <updated>2010-03-18T19:03:43Z</updated>

    <summary>I&apos;m planning to end-of-life DateTime::Locale sometime in the future, in favor of a new distribution, Locale::CLDR. This new distro will be designed so that it can provide all the info from the CLDR project (eventually), rather than just datetime-related pieces....</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="perl" label="Perl" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>I'm planning to end-of-life DateTime::Locale sometime in the future, in favor of a new distribution, Locale::CLDR.</p>

<p>This new distro will be designed so that it can provide all the info from the CLDR project (eventually), rather than just datetime-related pieces.</p>

<p>My plan is to have DateTime use Locale::CLDR directly, rather than continue maintaining DateTime::Locale.</p>

<p>To that end, I'm wondering how people are using DateTime::Locale. I'm <em>not</em> interested in people only using it via DateTime.pm. That form of usage will continue to work transparently. You specify a locale for a DateTime.pm object and you get localized output.</p>

<p>All of the information available from DateTime::Locale will be available from Locale::CLDR, although the API will be a little different.</p>

<p>In particular, is anyone out there using custom in-house locales at all?</p>

<p>That would be the biggest potential breakage point, since upgrading DateTime.pm to a version that uses Locale::CLDR will end up making your custom locales invalid.</p>

<p>I'm planning to support some form of custom locales in Locale::CLDR as well, of course.</p>

<p>None of this will happen in the very near future. I still need to get DateTime::Format::Strptime not using DT::Locale first, which is its own painful
project ;)</p>

<p>Please reply in the comments or <a href="mailto:autarch@urth.org">send me email</a>.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Benchmarking Versus Profiling</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/03/benchmarking-versus-profiling.html" />
    <id>tag:blog.urth.org,2010://2.131</id>

    <published>2010-03-03T20:59:47Z</published>
    <updated>2010-03-03T21:14:10Z</updated>

    <summary>First, here&apos;s the tl;dr summary ... Benchmarking is for losers, Profiling rulez! I&apos;ve noticed a couple blog entries in the Planet Perl Iron Man feed discussing which way of stripping whitespace from both ends of a string is fastest. Both...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>First, here's the tl;dr summary ... Benchmarking is for losers, Profiling
rulez!</p>

<p>I've noticed <a href="http://blog.laufeyjarson.com/2010/03/stripping-whitespace-from-both-ends-of-a-string/">a
couple</a>
<a href="http://illusori.co.uk/perl/2010/03/03/white_space_trim.html">blog entries</a> in
the Planet Perl Iron Man feed discussing which way of stripping whitespace
from both ends of a string is fastest.</p>

<p>Both of these entries discuss examples of <em>benchmarking</em>. Programmers love
benchmarks. After all, it's a great chance to whip out one's performance-penis
and compare sizes, trying to come up with the fastest algorithm.</p>

<p>Unfortunately, this is pointless posturing. Who cares that one version of a strip-whitespace operation is three times faster than another? The important question is whether the speed <em>matters</em>.</p>

<p>Until you answer that question, all the benchmarking in the world won't help
you, and that brings us to profiling.</p>

<p>Profiling is a lot harder than benchmarking, which may be why people talk
about it less often. Profiling doesn't compare multiple versions of the same
operation; it tells us where the slowest parts of our code base are.</p>

<p>In order to make profiling useful, we need to write code that simulates
typical end user use of the code we're profiling. Then we run that code under
a profiler, and we know what's worth optimizing.</p>

<p>Once we know <em>that</em>, then we can start speeding up our code. At this point,
benchmarking might be handy. If, for example, on some crazy bizarro world, our
program spent a lot of its runtime trimming whitespace from strings, we could
benchmark different approaches, and use the fastest.</p>
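
<p>If it did come to that, a minimal sketch using the core <code>Benchmark</code> module might look like this. The two trim functions and the sample string are made up for illustration and aren't the code from the entries linked above:</p>

<pre><code>use strict;
use warnings;

use Benchmark qw( cmpthese );

my $input = "   some text with padding \t ";

# Approach one: strip each end with its own substitution.
sub trim_two_regexes {
    my $s = shift;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# Approach two: strip both ends with a single alternation.
sub trim_one_regex {
    my $s = shift;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

# Run each candidate for at least one CPU second and print a comparison.
cmpthese(
    -1,
    {
        two_regexes =&gt; sub { trim_two_regexes($input) },
        one_regex   =&gt; sub { trim_one_regex($input) },
    }
);
</code></pre>

<p>Which one wins will depend on your perl and your data, which is one more reason to measure only after the profiler says this code matters.</p>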

<p>Of course, in the real world, this will <em>never</em> be the slowest thing your
program is doing. In most cases, the slowest parts of the program are
the parts with IO, such as reading files, talking to a DBMS, or making network
calls. If this isn't the case, we may be operating on a lot of data in memory
with some sort of non-trivial algorithm, and <em>that's</em> the slowest part.</p>

<p>Either way, without profiling, benchmarking is just a pointless distraction.</p>

<p>Of course, I'd be remiss if I didn't point out that Perl has an absolutely
fantastic profiler available these days,
<a href="http://search.cpan.org/dist/Devel-NYTProf">Devel::NYTProf</a>. It actually works (no segfaults!), and produces fantastically useful reports, so use it.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Testing a Database-Intensive Application</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/02/testing-a-database-intensive-application.html" />
    <id>tag:blog.urth.org,2010://2.130</id>

    <published>2010-02-23T16:08:48Z</published>
    <updated>2010-02-23T20:01:59Z</updated>

    <summary>If you&apos;ve been bitten by the testing bug, you&apos;ve surely encountered the problem of testing a database-intensive application. The problem this presents isn&apos;t specific to SQL databases, nor is it just a database problem. Any data-driven application can be hard...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>If you've been bitten by the testing bug, you've surely encountered the problem of testing a database-intensive application. The problem this presents isn't specific to SQL databases, nor is it just a database problem. Any data-driven application can be hard to test, regardless of how that data is stored and retrieved.</p>

<p>The problem is that in order to test your code, you need data that at least passably resembles data that the app would work with in reality. With a complex schema, that can be a lot of data spread out across many tables. I often find that trying to test each class in isolation becomes very difficult, since the data is not confined to one class.</p>

<p>For example, the app I'm working on now is a wiki. I'm trying to test the Page class, but that involves interactions with many tables. Pages have revisions, they have links to other pages, to files, and to not-yet-created pages. Pages also belong to a wiki, and are created by a user. To test page creation, I need to already have a wiki to add the page to, and a user to create the page.</p>

<p>There are various solutions to this problem, all of which suck in different ways.</p>

<p>You can try mocking out the database entirely. I've used <a href="http://search.cpan.org/dist/DBD-Mock">DBD::Mock</a> for this, but I've never been happy with it. DBD::Mock has one of the most difficult to use APIs I've ever encountered. Also, DBD::Mock doesn't really solve the fundamental problem. I <em>still</em> have to seed all the related data for a page. I'd even go so far as to say that DBD::Mock makes things worse. Because inserts don't actually go anywhere, I have to re-seed the mock handle for each test of a <code>SELECT</code>, and since a single method may make multiple <code>SELECT</code> calls, I have to work out in advance what each method will select and seed all the data in the right order!</p>

<p>My experience with DBD::Mock has largely been that the test code becomes so complex and fragile that maintaining it becomes a huge hassle. The test files become so full of setup and seeding that the actual tests are lost.</p>

<p>I wrote <a href="http://search.cpan.org/dist/Fey-ORM-Mock">Fey::ORM::Mock</a> to help deal with this, but it only goes so far. It partially solves the problem with DBD::Mock's API, but I still have to manage the data seeding, and that is still fragile and complicated.</p>

<p>The other option is to just use a real DBMS in your tests. This has the advantage of actually working like the application. It also helps expose bugs in my schema definition, and lets me test triggers, foreign keys, and so on. This approach has several huge downsides, though. I have to manage (re-)creating the schema each time the tests run, and it will be much harder for others to run my tests on their systems. Also, running the tests can be rather slow.</p>

<p>For the <a href="http://hg.urth.org/hg/Silki">app I'm working on</a> I've decided to mostly go the real DBMS route. At least this way the tests will be very useful to <em>me</em>, and anyone else seriously hacking on the application. I can isolate the minimal data seeding in a helper module, and the test files themselves will be largely free of cruft. Making it easier to write tests also means that I'll write more of them. When I was using DBD::Mock, I found myself avoiding testing simply because it was such a hassle!</p>

<p>Some people might want to point out fixtures as a solution. I know about those, and that's basically what I'm using now, except that there's only one fixture for now, a minimally populated database. And of course, fixtures still don't fix the problems that come with the tests needing to talk to a real DBMS.</p>

<p>I <em>am</em> going to make sure that tests which don't hit the database at all can be run without connecting to a DBMS. That way, at least a subset of the tests can be run everywhere.</p>

<p>Are there any better solutions? I often feel like programming involves spending an inordinate amount of time solving non-essential problems. Where's my silver bullet?</p>
]]>
        

    </content>
</entry>

<entry>
    <title>The Purpose of Automated Tests</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/02/the-purpose-of-automated-tests.html" />
    <id>tag:blog.urth.org,2010://2.129</id>

    <published>2010-02-16T16:54:53Z</published>
    <updated>2010-02-17T14:21:47Z</updated>

    <summary><![CDATA[Recently, there was a question on stackoverflow that asked whether or not one should test that Moose generates accessors correctly. Here's an example class: package Process; use Moose; has pid =&gt; ( is =&gt; 'ro', isa =&gt; 'Int', required =&gt;...]]></summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>Recently, there was a <a href="http://stackoverflow.com/questions/2269478/how-much-do-i-need-to-test-moose-and-moosexfollowpbp-generated-methods/2270518">question on
stackoverflow</a>
that asked whether or not one should test that Moose generates accessors
correctly.</p>

<p>Here's an example class:</p>

<pre><code>package Process;

use Moose;

has pid =&gt; (
    is       =&gt; 'ro',
    isa      =&gt; 'Int',
    required =&gt; 1,
);

has stdout =&gt; (
    is  =&gt; 'rw',
    isa =&gt; 'FileHandle',
);
</code></pre>

<p>Given that class definition, is there any value to writing tests like this?</p>

<pre><code>can_ok( Process, 'new' );
can_ok( Process, 'pid' );
can_ok( Process, 'stdout' );

throws_ok { Process-&gt;new() } qr/.../, 'Process requires a pid';
</code></pre>

<p>Let's look at why automated tests are useful.</p>

<p>First, they give us some assurance that the code we wrote does what we expect.</p>

<p>Second, tests protect us from breaking code as we change it. As we refactor,
fix bugs, or add new features, we want to make sure that all the existing code
continues to work.</p>

<p>Third, the tests can provide some hints to future readers of our code about
the APIs of the code base.</p>

<p>So back to our original question, do we need to test Moose-generated code?</p>

<p>The tests seen above add absolutely nothing that isn't already tested by Moose
itself.</p>

<p>If the tests don't test anything new, then they can't be giving us any
assurance about our code. Instead, they're giving us assurance about Moose
itself.</p>

<p>Let's assume that Moose is itself well-tested. If it isn't, why are you using
it? There is no point in adopting a dependency on fragile code that you don't
trust. If you want to improve Moose's reliability, the way to do that is by
<em>working on Moose itself</em>, not by testing Moose in your application's test
suite.</p>

<p>Do these tests protect us from breaking code? Not really. If we change the
Process class so that it no longer has the <code>stdout</code> attribute, the test will
fail. But if we made that change, surely it was intentional. So now our tests
are failing because we made an intentional change.</p>

<p>But what if other code in our code base expects the <code>stdout</code> attribute to
exist? As long as that code is tested, we will find this problem quickly
enough. If the <code>stdout</code> attribute is only ever referenced in the test up
above, then what purpose does it serve?</p>

<p>Finally, the test above provides no guidance to future readers. The code in
the <code>Process</code> package provides more documentation than the test code, and if the
module also has POD, that will provide even more documentation. The test
doesn't show how the code is <em>used</em>, it just provides another way to describe
what the module <em>is</em>, a way that's inferior to the Moose-based declarations or
POD.</p>

<p>However, don't confuse the tests above with testing code that <em>you</em> write. For
example, if you create a new type with a custom constraint and coercion, you
should definitely test that type. The Moose test suite obviously doesn't test
your specific type, it just tests that new types can be created.</p>
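
<p>As a rough sketch of what testing your own type might look like, here's a made-up <code>PositiveInt</code> type with a constraint and a coercion, followed by tests for both. The type name, the package name, and the coercion rule are all assumptions for the sake of the example.</p>

<pre><code>package My::Types;

use strict;
use warnings;

use Moose::Util::TypeConstraints;

# A hypothetical custom type: an integer greater than zero.
subtype 'PositiveInt',
    as 'Int',
    where { $_ &gt; 0 },
    message { 'must be a positive integer' };

# A hypothetical coercion: truncate a number to its integer part.
coerce 'PositiveInt',
    from 'Num',
    via { int $_ };

package main;

use Test::More;

my $type = Moose::Util::TypeConstraints::find_type_constraint('PositiveInt');

# These exercise logic that we wrote, not logic that Moose provides.
ok( $type-&gt;check(42),     '42 passes the constraint' );
ok( !$type-&gt;check(0),     '0 is rejected' );
ok( !$type-&gt;check('abc'), 'a non-numeric string is rejected' );
is( $type-&gt;coerce(3.7), 3, '3.7 is coerced to 3' );

done_testing();
</code></pre>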

<p>So the answer is no, don't bother with tests like the ones above. Test the new
code you create, not whether Moose is doing what you asked it to do.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Benchmarking MooseX::Method::Signatures</title>
    <link rel="alternate" type="text/html" href="http://blog.urth.org/2010/02/benchmarking-moosexmethodsignatures.html" />
    <id>tag:blog.urth.org,2010://2.128</id>

    <published>2010-02-09T17:45:48Z</published>
    <updated>2010-02-09T18:07:33Z</updated>

    <summary>I&apos;ve been seeing some talk about MooseX::Method::Signatures and its speed. Specifically, Ævar Arnfjörð Bjarmason says that MXMS is about 4 times slower than a regular method call. He determined this by comparing two different versions of a large program,...</summary>
    <author>
        <name>Dave Rolsky</name>
        <uri>http://blog.urth.org/</uri>
    </author>
    
        <category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://blog.urth.org/">
        <![CDATA[<p>I've been seeing some talk about <a href="http://search.cpan.org/dist/MooseX-Method-Signatures">MooseX::Method::Signatures</a> and its speed. Specifically, <a href="http://blogs.perl.org/users/aevar_arnfjor_bjarmason/2010/02/moosexmethodsignatures-is-really-slow.html">Ævar Arnfjörð Bjarmason</a> says that MXMS is about 4 times slower than a regular method call. He determined this by comparing two different versions of a large program, Hailo. This is interesting, but I think a more focused benchmark might be useful.</p>

<p>Specifically, I'm interested in comparing MXMS to <em>something else</em> that does similar validation. One of the main selling points of MXMS is its excellent integration of argument type checking, so it makes no sense to compare MXMS to plain old unchecked method calls. Therefore, I <a href="http://blog.urth.org/mxms-vs-mxpv-benchmark">made a benchmark</a> that compares MXMS to <a href="http://search.cpan.org/dist/MooseX-Params-Validate">MooseX::Params::Validate</a>. Both MXMS and MXPV provide argument type checking using Moose types. That should eliminate the cost of doing type checking as a variable. If you don't care about type checking, you really don't need MXMS (or MXPV).</p>

<p>The benchmark has two classes with semantically identical methods doing argument validation. One uses MXMS and the other MXPV. All method calls are wrapped in eval since a validation failure causes an exception. I also tested both success and failure cases. My experience with Params::Validate tells me that there's a big difference in speed between success and failure, and the results bear that out.</p>
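
<p>To give a feel for the shape of such a benchmark, here's a simplified sketch. It is not the linked benchmark code; the class names, the <code>size</code> method, and its single <code>n</code> argument are invented for illustration.</p>

<pre><code>use strict;
use warnings;

{
    package WithMXMS;
    use Moose;
    use MooseX::Method::Signatures;

    # A named, required Int argument, checked by MXMS.
    method size (Int :$n!) { return $n }
}

{
    package WithMXPV;
    use Moose;
    use MooseX::Params::Validate qw( validated_list );

    # The same check, done by hand with MXPV.
    sub size {
        my $self = shift;
        my ($n) = validated_list( \@_, n =&gt; { isa =&gt; 'Int' } );
        return $n;
    }
}

use Benchmark qw( cmpthese );

my $mxms = WithMXMS-&gt;new;
my $mxpv = WithMXPV-&gt;new;

# Every call is wrapped in eval because a validation failure throws.
# A negative count tells Benchmark to run each case for that many CPU seconds.
cmpthese(
    -2,
    {
        'MXMS success' =&gt; sub { eval { $mxms-&gt;size( n =&gt; 42 ) } },
        'MXMS failure' =&gt; sub { eval { $mxms-&gt;size( n =&gt; 'not an int' ) } },
        'MXPV success' =&gt; sub { eval { $mxpv-&gt;size( n =&gt; 42 ) } },
        'MXPV failure' =&gt; sub { eval { $mxpv-&gt;size( n =&gt; 'not an int' ) } },
    }
);
</code></pre>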

<p>Here's what the benchmark came up with:</p>

<pre><code>               Rate   MXMS failure   MXPV failure   MXMS success   MXPV success
MXMS failure  262/s             --           -41%           -81%           -94%
MXPV failure  448/s            71%             --           -68%           -90%
MXMS success 1393/s           431%           211%             --           -69%
MXPV success 4545/s          1634%           915%           226%             --
</code></pre>

<p>First, as I pointed out, there's a big difference between success and failure. I can only assume that throwing an exception is expensive in Perl. Second, the difference between MXMS and MXPV is much greater in the success case. This makes sense if simply throwing an exception is costly.</p>

<p>It seems that in the success case, MXPV is about 3 times faster than MXMS. I think the success case is most important, since we probably don't expect a lot of validation failures in our production code.</p>

<p><a href="http://blog.urth.org/mxms-vs-mxpv-benchmark">Benchmark code</a></p>
]]>
        

    </content>
</entry>

</feed>
