I’ve been thinking of writing some CPAN API review blog posts. There are a lot of bad APIs on CPAN. But before I pick on other people’s modules, I thought it would be good to start with something of my own.

The first release of DateTime was in early 2003. There are 7+ years of mistakes to review, but I’ll just hit the highlights.

Duration and subtraction APIs

In my experience, the single most confusing part of the distribution is the DateTime::Duration API. The original API exposed too much of the internals (via the delta_* methods) and not enough of what people actually wanted. Since then, there have been some band-aids, but the bleeding continues.

Ultimately, I’m not sure how fixable this is. One of the biggest API problems here is that users expect to be able to freely convert between any units in a duration. This isn’t possible, because there’s no fixed conversion between most units (how many days in 3 months?).
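
To make the “how many days in 3 months?” problem concrete, here’s a minimal sketch that uses only the core Time::Local module, so it runs without DateTime installed. The days_between helper is a hypothetical name invented for this illustration; it is not part of any DateTime API.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Whole days between two calendar dates. Working in UTC keeps DST out of it.
# Each argument is an array ref of [ year, month (1-12), day ].
sub days_between {
    my ( $from, $to ) = @_;
    my ( $start, $end )
        = map { timegm( 0, 0, 0, $_->[2], $_->[1] - 1, $_->[0] ) } $from, $to;
    return ( $end - $start ) / 86_400;
}

# "3 months" measured from three different starting points in 2010.
my $from_jan = days_between( [ 2010, 1, 1 ], [ 2010, 4, 1 ] );
my $from_feb = days_between( [ 2010, 2, 1 ], [ 2010, 5, 1 ] );
my $from_jun = days_between( [ 2010, 6, 1 ], [ 2010, 9, 1 ] );

printf "Jan 1 + 3 months = %d days\n", $from_jan;    # 90
printf "Feb 1 + 3 months = %d days\n", $from_feb;    # 89
printf "Jun 1 + 3 months = %d days\n", $from_jun;    # 92
```

The same “3 months” is 89, 90, or 92 days depending on where you start, which is why a duration can’t freely convert months into days without knowing the date it will be applied to.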

This confusion is compounded by the fact that there are several different methods for subtracting one DateTime from another, and the naming for these methods sucks. The delta_md method gives you months and days, delta_ms gives you minutes and seconds, and subtract_datetime gives you all units. For extra fail, there is a delta_days method in both DateTime and DateTime::Duration, and they’re totally unrelated.
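
Here’s a short sketch of those subtraction methods side by side, assuming DateTime is installed. The all-units subtraction is spelled subtract_datetime in DateTime’s API. The dates are arbitrary values picked for illustration.

```perl
use strict;
use warnings;
use DateTime;

my $start = DateTime->new( year => 2010, month => 1, day => 1,  hour => 8 );
my $end   = DateTime->new( year => 2010, month => 3, day => 15, hour => 20, minute => 30 );

# Months and days only; the time of day is ignored.
my $md = $end->delta_md($start);
printf "delta_md:          %d months, %d days\n",
    $md->delta_months, $md->delta_days;    # 2 months, 14 days

# DateTime's delta_days subtracts two datetimes and returns a duration ...
my $days = $end->delta_days($start);

# ... while DateTime::Duration's delta_days just reads the days stored in a
# duration. Same name, unrelated jobs.
printf "delta_days:        %d days\n", $days->delta_days;    # 73 days

# Minutes and seconds only; everything larger is folded into minutes.
my $ms = $end->delta_ms($start);
printf "delta_ms:          %d minutes, %d seconds\n",
    $ms->delta_minutes, $ms->delta_seconds;

# Every unit at once, which is also what the overloaded "-" gives you.
my $all = $end->subtract_datetime($start);
printf "subtract_datetime: %d months, %d days, %d minutes\n",
    $all->in_units( 'months', 'days', 'minutes' );
```

Note that the days component of the subtract_datetime result is what’s left over after the months are counted, not the total number of days, so calling delta_days on that duration and on the result of DateTime’s delta_days gives different numbers.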

Finally, all of this is exacerbated by the fact that there is no date-only class. For any math where you just need to deal with dates, having a date-only class would obviously make your life easier. There’d simply be less API to understand, and the date-only class would be more likely to do what you mean.

Time Zones by Default

I think DateTime would be much better off if there was one class for floating datetimes, and another class for datetimes with time zones. Mixing the two together in one class makes the internals more complicated, and can lead to confusion when comparing two objects. If there were two classes, we could define explicit conversion methods, but otherwise make them unmixable.

Also, date math without time zones is a lot simpler to understand.

ETOOMANYNAMES

When I first created DateTime, I imitated parts of the Time::Piece API, and I’ve since regretted that. Namely, a lot of the methods have multiple names, so we have day_of_month, day, and mday, all of which return the same thing. This is just clutter. Even worse, it means that two pieces of code using DateTime may use different methods to do exactly the same thing.

DefaultLocale()

This is a global, and globals are bad.

Strftime

Nowadays, the CLDR patterns are much more useful than strftime. If I were starting over I’d put strftime and strptime together in an external DateTime::Format::Strxtime library.
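
For comparison, here’s a small sketch of the two formatting styles, assuming DateTime is installed and using an arbitrary date. Both strftime and format_cldr are existing DateTime methods; the patterns are just examples.

```perl
use strict;
use warnings;
use DateTime;

my $dt = DateTime->new( year => 2010, month => 11, day => 10, hour => 14, minute => 5 );

# The same output, once with a strftime pattern and once with a CLDR pattern.
my $via_strftime = $dt->strftime('%Y-%m-%d %H:%M');
my $via_cldr     = $dt->format_cldr('yyyy-MM-dd HH:mm');
print "$via_strftime\n";    # 2010-11-10 14:05
print "$via_cldr\n";        # 2010-11-10 14:05

# CLDR patterns also cover fields strftime has no specifier for,
# such as the quarter of the year.
print $dt->format_cldr('QQQ y'), "\n";
```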

Summary

There are probably other API problems, but I think these are some of the highlights. Feel free to complain in the comments, or more usefully, submit doc patches where appropriate.

My last blog entry, “What Versions of Core Perl Should Module Authors Support?”, generated a lot of discussion in the comments. There were a number of points raised that are worth addressing.

First, in a wild coincidence, RHEL 6 was released the day I wrote the blog entry. It includes Perl 5.10.1, which means that the last major Linux distro still shipping 5.8.x is now a little more up to date. That’s good, but unfortunately 5.10.1 is already fairly out of date, and seven years from now, it will be ridiculously out of date.

One commenter, Stephen, said “but in my world, I get to deal with large enterprise customers that won’t let any non-vendor packages through the door — so we’re pretty much required to support, at a minimum, whatever openSUSE and RHEL decide to ship with.”

(As an aside, if they don’t let in non-vendor packages, why does it matter if the next version of Moose or DateTime doesn’t work on that version of Perl? It’s not like there’s going to be a vendor package for new modules.)

I’m assuming that by “my world” he means the business where he works. Obviously, if you’re selling to enterprise customers you need to provide software that works on their platform of choice. It’s not feasible to say “upgrade your base OS”.

But how does that obligate me, as an author of free software, to support those platforms? Those people who want to use the same Perl for seven years are already paying Red Hat to support it. Of course, as chromatic points out, they’re not paying any Perl core developers, so I’m not sure what their support is worth.

If people want the same level of support from me as they get from Red Hat, I can tell you where to send the checks. I am willing to do support for my free software, though no one’s ever asked. If Stephen’s company needs a new version of a module to be tested and supported on an ancient Perl, that seems like a good business case for paying the author of that module for support.

People take free software for granted. I spend a lot of time working on this stuff, and I release it because I want to. That doesn’t come with any sort of support obligation. As it happens, I do spend a lot of time on support (improving docs, fixing bugs, answering questions). But if you want a guarantee, then you need a contract, and you can have one, if you want to pay for it.

Adam Kennedy and Peter Rabbitson both asked what features are in a new Perl that would justify dropping support for 5.8.x. That’s a good question, but first let’s clarify what “dropping support” means.

There are several different levels of “dropping support”. Right now I have a whole bunch of Perls installed (5.8.5, 5.8.8, 5.10.1, 5.12.1, 5.12.2) and I test some of my modules with these Perls. I don’t do this every release, but I do try to test Moose and Class::MOP with at least 5.8.5, 5.8.8, 5.10.1, and 5.12.2 every few releases. Same goes for DateTime, Params::Validate, and some others.

The most benign way to drop support is to simply stop testing on some older versions. I could still accept patches to make things work on 5.8.x, but I wouldn’t do any 5.8.x testing. I think that would probably be acceptable to a lot of people. Basically, it’s just saying “if you care about $VERSION, we’ll let you help support it”. This is worthwhile to me, since just remembering to test all those Perls is a hassle, and it slows down releases (especially when I have to install a bunch of updated prereqs on each Perl).

Once you start actively using new features from new Perls, things get more complicated. In some cases, there are modules on CPAN to bridge the gap. For example, the MRO::Compat module takes the mro feature added in 5.10.0 and backports it to earlier versions of Perl.

If such a thing exists, it’s hard to justify not using it. If it doesn’t exist and you can write it, the same goes. But some features just cannot be easily backported, like // or named regex captures.
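
For anyone who hasn’t used them, here’s a minimal sketch of those two features. It needs Perl 5.10 or later, which is the point: neither is something a module can bolt onto 5.8, because both are parsed by perl itself. The %config hash and its keys are made up for the example.

```perl
use strict;
use warnings;
use 5.010;    # also enables say()

my %config = ( retries => 0 );

# 5.8 style: "||" would wrongly replace a legitimate 0, so defined() is needed.
my $old_style = defined $config{retries} ? $config{retries} : 3;

# 5.10 style: "//" only falls back when the left side is undef.
my $retries = $config{retries} // 3;
my $timeout = $config{timeout} // 30;

# Named captures land in %+ instead of the positional $1, $2, $3.
my %date;
if ( '2010-11-10' =~ /^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/ ) {
    %date = %+;
}

say "retries=$retries timeout=$timeout year=$date{year}";
# prints: retries=0 timeout=30 year=2010
```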

So what’s new in Perl since the 5.8 series? Well, first, let’s remember that there’s more to Perl than Perl-level features. The XS API has changed a lot since Perl 5.8.x. Writing XS code that accommodates older Perls can be painful to impossible, depending on what you want to do. If you’re lucky it just means diving into the source of a newer Perl and figuring out what a new function does, then writing a compatibility shim for older Perls. That’s what Devel::PPPort is for, but it doesn’t cover everything.

The list of what’s new in each major Perl release is really quite long. As a random sample, 5.10.0 included say(), given/when, recursive regex patterns, named regex captures, the _ prototype, UNIVERSAL::DOES, Unicode 5.0.0, and //.

With 5.12.0, we got the package Foo 1.2 syntax, Unicode 5.2.0, lots of Unicode improvements, y2038-safety in core, pluggable keywords, each on arrays, and overloading for qr.

And with 5.14.0, we’ll get $0 assignment fixes on Linux, optimizations for shift() without arguments, given/when will return a value, my $new = $old =~ s/cat/dog/r (yay!), package Foo 1.2 { ... }, and hooks into the core parser API.
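
Here’s a quick sketch of two of those 5.14 additions, the /r flag and the package block syntax. It needs Perl 5.14 or later, and the Greeter package is a made-up example.

```perl
use strict;
use warnings;
use 5.014;

my $old = 'the cat sat on the mat';

# Before 5.14: copy first, then modify the copy.
( my $copy = $old ) =~ s/cat/dog/;

# With /r: the substitution returns the new string and leaves $old alone.
my $new = $old =~ s/cat/dog/r;

# Package name, version, and scope declared in a single statement.
package Greeter 1.2 {
    sub hello { return 'hello from ' . __PACKAGE__ }
}

my $greeting = Greeter->hello;
my $version  = Greeter->VERSION;

say $new;                               # the dog sat on the mat
say "$greeting, version $version";
```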

It’s hard for me to say that any one of these features is absolutely vital for any module I maintain, but they do all add up to make Perl easier and more fun to use.

Some of those features will be vital for some authors. In particular, the pluggable keywords and parser API will make it much easier to write modules which manipulate core syntax, like MooseX::Declare. I suspect it may also make some impossible things possible.

That “more fun” part is pretty important. Remember, free software is free. I do this because I enjoy it. Keeping code working on old versions of Perl is not all that much fun. If it’s vitally important to your business, put your money where your mouth is.

The new Perl 5 core release schedule raises some interesting questions for Perl module authors. In the past, major version releases of Perl were unpredictable. There were approximately two years from 5.005 to 5.6.0, then another two years to 5.8.0. After that, it took a whopping five years til 5.10.0, and then about 2.5 years til 5.12.0.

However, that’s all about to change. The Perl 5 core developers have moved to a timeboxed release plan, and there will be a new major version of Perl once per year. Surprisingly, this doesn’t seem to mean that each release has fewer changes. Instead, the fact that you can do work on the Perl core and see it released on a predictable schedule seems to have invigorated Perl core development. Perl 5.14 will have a lot of interesting new features.

In the past, I’ve supported 2-3 major releases of Perl for my modules. For a long time, that meant 5.6.x, 5.8.x, and 5.10.x. Since there was such a long gap from 5.8.0 to 5.10.0, I started dropping support for 5.6.x before 5.12.0 came out, and I didn’t hear too much kicking and screaming.

But with once-per-year major version updates, supporting only 2-3 major Perl versions may be a problem. Perl’s newly invigorated release schedule clashes with the support schedules of enterprise Linux distributions like RHEL in a big way. RHEL 5 was released in early 2007, and will be supported until 2014. RHEL 5.5 is still using Perl 5.8.8. RHEL 6, due out some time in 2011, will upgrade to Perl 5.10.1, meaning Red Hat is committing to supporting that Perl version until 2018, long after the Perl core developers have stopped supporting it.

What’s a module author like myself to do?

The Moose core team discussed this recently, and we had some tentative conclusions.

First, dropping support for 5.8.x is a special case, because 5.8.x was the newest major version of Perl for a really long time (five years), longer than any other major version of Perl. That means 5.8.x was the only Perl available in every major Linux distro (and probably BSD too) until very recently.

We probably want to wait until all the major distros have shipped with Perl > 5.8.x before we drop support for it. Debian and Ubuntu are already on 5.10, and RHEL 6 should be out soon. OpenSUSE lists 5.12.1 (wow, good job, OpenSUSE), and it looks like FreeBSD has also moved to 5.10.1. We’re making progress on the “drop 5.8.x” front. It seems reasonable to drop support for 5.8.x sometime in 2011 or 2012.

Since dropping 5.8.x support is a really big deal, we’ll probably want to make a big deal out of it for Moose too, possibly doing it at the same time as we bump Moose’s major version number. We also want to have plenty of lead time, at least 6-12 months.

It seems unlikely that any future version of Perl 5 will ever get as solidly entrenched as 5.8.x. Given that Perl 5 is releasing a new major version each year, we can hope that end users will become more accustomed to upgrading their Perl 5 core installs on a regular basis. Realistically, I think distros and end users will probably end up skipping at least one major version between upgrades.

The Perl core team is only “committed” (I use this word loosely) to providing critical security patches for three years’ worth of Perl releases. In the future, that means 3 major versions of Perl at a time. That’s probably a good guideline for module authors too.

Module authors, especially authors of widely used modules, should start thinking about this soon. Perl 5.14 should be out in early 2011, making it the second major version on the new release schedule. Once 5.16 comes out in 2012, I think we’ll officially be in a new era of Perl 5. I hope that Moose and other major Perl modules will have a clearly defined policy for Perl version support before 5.16 comes out.

I often wish that I had an infinite supply of time, motivation, and skill. If I did, I bet I could get a lot done! My programming (and programming-related) todo list includes so many items that I’m quite sure I’ll never get to most of them.

Here’s my current list, though I’m probably missing some stuff, in no particular order …

CPAN Search $NEXT

I’ve been thinking about this for quite some time, but I haven’t really done much with it. I played with CPANHQ, which had some promise, but has stalled.

My wanted list for a new search.cpan is really long, and includes:

  • Open source. It’s ridiculous that a key piece of the Perl community infrastructure is closed.
  • Modern Perl, probably Catalyst, Moose, and DBIC, to make it as easy as possible for the whole community to contribute.
  • Modern look and feel. The current site is usable, but not beautiful.
  • Excellent full text search. The current search is not bad, but it could be better.
  • Author-specified documentation ordering. The Moose Manual docs should be listed first on the Moose page, for example.
  • Easy to find and analyze dependency information. Basically, I’d love to take the information from http://deps.cpantesters.org/ and what used to be on the now-defunct CPAN Kwalitee site and borg that.
  • Similarly, I’d like to borg CPAN ratings, AnnoCPAN, etc. All the CPAN information should be in one place with a spiffy UI.
  • I’ve long wanted some sort of “web of trust” system for CPAN. A CPAN user would mark authors and/or distributions as trusted. We’d take the graph of trust relationships and try to figure out which authors and modules are most trusted. Trust here would be some combination of good code, good docs, responsive author, whatever. The idea is to organically highlight the best of CPAN, and in particular help people discover the best modules in their class. I think this would be really useful for new users, and a lot more useful than the current CPAN rating system.
  • Incorporate all of BackPAN, just cause.
  • A million ideas that other people will have.
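
The “web of trust” item is the most algorithmic one on that list, so here’s a rough sketch of one way it could work: a PageRank-style iteration over the trust graph, where being trusted by a well-trusted author counts for more. This is only one possible approach. The trust_rank function, the author names, and the damping value of 0.85 are all assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical data: who has marked whom as trusted.
my %trusts = (
    alice => [qw( dave erin )],
    bob   => [qw( dave )],
    carol => [qw( dave erin )],
    dave  => [qw( erin )],
    erin  => [],
);

# Each author repeatedly passes a share of their own score to the authors
# they trust. $damping is the fraction passed along on each round.
sub trust_rank {
    my ( $graph, $iterations, $damping ) = @_;

    my @nodes = sort keys %$graph;
    my %rank  = map { $_ => 1 / @nodes } @nodes;

    for ( 1 .. $iterations ) {
        my %next = map { $_ => ( 1 - $damping ) / @nodes } @nodes;
        for my $truster (@nodes) {
            my @trusted = @{ $graph->{$truster} };

            # Someone who trusts nobody spreads their score evenly.
            @trusted = @nodes unless @trusted;

            $next{$_} += $damping * $rank{$truster} / @trusted for @trusted;
        }
        %rank = %next;
    }

    return \%rank;
}

my $rank = trust_rank( \%trusts, 20, 0.85 );

# With this data erin comes out first and dave second.
for my $author ( sort { $rank->{$b} <=> $rank->{$a} } keys %$rank ) {
    printf "%-6s %.3f\n", $author, $rank->{$author};
}
```

Erin ends up ahead of dave even though more people trust dave directly, because dave’s own trust flows entirely to erin. A real system would also need to handle distributions as well as authors, and to resist people gaming the graph.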

This is a huge project, and while I think it would be useful, the current site works well enough that it’s not exactly urgent.

Full CLDR in Perl

I really want to make the full set of CLDR data available in Perl. This would greatly improve DateTime::Locale, and would be generally useful for lots of other things. There are two approaches: one is to write a Perl binding to the ICU4C library, the other is to parse the raw data files and generate Perl modules. DateTime::Locale is currently doing the latter, but not terribly well. A C library binding would be easier, but then requires the end user to have the ICU C library installed.

Either way, this is a metric frakton of work.

DateTime V2.0

I’d really like to rewrite large chunks of DateTime.pm and the DateTime suite using modern tools like Moose. I’d also like to fix up all of the many stupid API decisions I made over the past seven years or so. Some of this would include …

  • Make a date-only module.
  • Make leap seconds optional. For most uses you don’t care about this, since the exact number of seconds between two points in time is not that important. The code for dealing with leap seconds makes everything more complicated.
  • Using floating point fractional seconds instead of nanoseconds.
  • In particular, make the DateTime::Duration API less crack-tastic.
  • If possible, code it for faster runtime speed.
  • I bet there’s a lot of ideas out there in the community for improvements to DateTime as well.

Make DateTime::TimeZone use Zefram’s binary Olson database reader

Instead of parsing the Olson data files ourselves and generating Perl code, I want to use the binary Olson data. The compiler that transforms the Olson database text files to binary data already just works. Using the binary data would be a lot less memory-intensive, and probably faster too. Zefram has already packaged all of the binary data as a CPAN distro, so we don’t even have to rely on potentially very out-of-date system-installed databases.

Really, all that’s left to do is make DateTime::TimeZone use Zefram’s code, and to make sure that the DateTime::TimeZone API is fully supported once we switch.

Note to self: Make sure that the binary data works on 32- and 64-bit systems, where “works” means that we can use the data to the limits of Perl’s integer support. I’m pretty sure that Perl can support larger-than-32-bit ints on 32-bit systems using an NV internally, so we should be able to read in 64-bit integers from the binary file.
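
As a sketch of what that could look like, here’s one way to read a big-endian signed 64-bit integer using only 32-bit unpack templates, so it doesn’t depend on the q template (which perl only provides when built with 64-bit integer support). The read_int64_be function is a hypothetical helper written for this illustration, not code from Zefram’s module.

```perl
use strict;
use warnings;

# Decode 8 bytes holding a big-endian, two's-complement 64-bit integer.
sub read_int64_be {
    my ($bytes) = @_;

    # Two unsigned 32-bit halves, most significant first.
    my ( $high, $low ) = unpack 'N N', $bytes;

    # The sign lives in the top bit of the high half.
    $high -= 2**32 if $high >= 2**31;

    # On a perl with 32-bit IVs this arithmetic happens in an NV.
    return $high * 2**32 + $low;
}

my $big      = read_int64_be( pack 'N N', 256,         5 );
my $minus_1  = read_int64_be( pack 'N N', 0xFFFFFFFF,  0xFFFFFFFF );
my $negative = read_int64_be( pack 'N N', 0xFFFFFFFE,  0 );

print "$big\n";         # 1099511627781  (256 * 2**32 + 5)
print "$minus_1\n";     # -1
print "$negative\n";    # -8589934592    (-2 * 2**32)
```

The caveat is precision. When the NV is an ordinary double, integers are only exact up to 2**53 in either direction, so this covers any realistic time zone transition time but not the full 64-bit range.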

DateTime for Perl 6

I’ve toyed with working on DateTime for Perl 6 but never gotten very far.

Mason 2.0

Jon Swartz started working on this and I’ve wanted to hack on it too, but am lacking tuits. A year ago, Jon wrote a good blog post on what Mason 2.0 would look like. I suspect he’s suffering from the same tuit shortage I am.

I actually have some code in a Mason2 directory dating back to 2007, where I started working on a new version of the Mason parser.

WYSIWYG Editor in Silki

I’d really like to add a full WYSIWYG editor to Silki. I started doing this with CKEditor a while back, but I gave up. CKEditor is very full-featured, but almost impossible to customize without making a permanent fork. I suspect it would be better to start with a fairly minimal editor (like the YUI richtext editor) and build on top of it.

Extract the HTML to Wiki converter from Silki

Silki contains some pretty useful code for turning HTML into wikitext. This could be genericized into a replacement for HTML::WikiConverter. HTML::WikiConverter is a good idea, but its internal design is wrong. It’s very difficult to add a new syntax, especially if that syntax supports tables.

Generic blog/forum/wiki spam filter system

There are a lot of web services and tools for doing spam checking on user-submitted content. Step one is to write small modules, one per service/tool/algorithm. Step two is to take all of these and incorporate them into a single plugin-based API that ties them together with a weighting system, like SpamAssassin.
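
Step two might look something like the sketch below: each checker votes, the votes are weighted, and the total is compared to a threshold. Everything here is an assumption for illustration. The checker names, the weights, the threshold of 5, and the toy regex checks all stand in for the real per-service modules from step one.

```perl
use strict;
use warnings;

# Each entry stands in for one small service/tool/algorithm module.
my @checkers = (
    {
        name   => 'too_many_links',
        weight => 2.5,
        check  => sub { my @links = $_[0] =~ m{https?://}g; return @links > 3 },
    },
    {
        name   => 'spammy_keyword',
        weight => 3.0,
        check  => sub { return $_[0] =~ /\b(?:casino|viagra)\b/i },
    },
    {
        name   => 'shouting',
        weight => 1.0,
        check  => sub { return $_[0] =~ /[A-Z]{5}/ && $_[0] !~ /[a-z]/ },
    },
);

# Run every checker and add up the weights of the ones that fire.
sub spam_score {
    my ($text) = @_;

    my ( $score, @hits ) = (0);
    for my $checker (@checkers) {
        next unless $checker->{check}->($text);
        $score += $checker->{weight};
        push @hits, $checker->{name};
    }

    return ( $score, \@hits );
}

# Hypothetical cut-off; a real system would make this configurable.
sub is_spam {
    my ( $text, $threshold ) = @_;
    my ($score) = spam_score($text);
    return $score >= $threshold;
}

my $comment = 'Cheap casino deals http://a.example http://b.example '
    . 'http://c.example http://d.example';
my ( $score, $hits ) = spam_score($comment);

printf "score %.1f from: %s\n", $score, join ', ', @$hits;
# prints: score 5.5 from: too_many_links, spammy_keyword
```

The useful property of this design is that no single check has to be decisive. A comment with a lot of links gets through on its own, but a lot of links plus a spammy keyword crosses the line.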

I’m actually quite likely to do this, since I really want to make Silki better at spam detection/prevention.

Finish my donor/volunteer management CRM

My animal rights group could really use a nice full-featured, very easy to use CRM. Yes, I know about CiviCRM, but last I looked it failed miserably on the easy to use front, and was missing some key features we needed.

In my dreams, this system would somehow integrate with our bookkeeping, so that every donation in the CRM linked to a bookkeeping entry, and vice versa.

I started working on this in early 2008, and I’m still not close to done.

VegGuide Technical Revamp

Right now the VegGuide code is still using Alzabo, which is really making it hard to work on certain parts of the code. I’d really like to move it over to Fey::ORM. I’d also love to move from MySQL to Postgres while I’m at it.

Rewrite perltidy using PPI

Perltidy is a really useful tool, but its internals are a nightmare. It replicates PPI without an actual API.

Once this was done, I could probably make it actually generate my preferred code style consistently. I’d also like to see it capable of accepting formatting plugins, where certain blocks could be formatted differently from the overall style.

… and then make PPI understand Devel::Declare-based syntax extensions

To really make perltidy and similar tools useful, they need to understand syntax-changing modules like MooseX::Declare. I have no idea how one would do this, since the syntax changes are basically injected into the perl parser itself, and PPI is a separate static parser.

Enhance VCI to support commits and create Dist::Zilla::Plugin::VCI

It’s ridiculous that each version control plugin for Dist::Zilla is totally standalone when VCI exists. However, these plugins need to be able to commit and push, and VCI only supports reading at the moment.

Find a way to eliminate the compilation hit from Moose

There’s the stalled (temporarily?) MooseX::Antlers, and we’ve discussed other approaches amongst the Moose core devs. I’d love to actually take one of these approaches and get it working.

Introduction to Object-Oriented Programming (using Moose)

I think there’s a need for a book that introduces OOP concepts, using Moose to illustrate the ideas. This book would be aimed at people who are totally new to OOP and teach them concepts and design principles. I think this could be great for attracting new users to Perl, because Moose is a really amazing tool. If you learn OOP through Moose, imagine how sad it would be to go back to Java afterwards.

Moose Class Day Two

I’ve been encouraged by brian d foy to develop a second day for my Moose class. I have some vague ideas of focusing on best practices and larger design issues as opposed to basic features. I also am toying with the idea of having the class spend a few hours actually writing a small not-entirely-a-toy application and running it against a test suite.

YAPC 2012 in Minneapolis

I’ve started working with Leonard Miller on some preliminaries for the bid. I think we could do a great job of hosting this here.

Most Likely to Succeed

Of all of these projects, the ones I’m most likely to actually get done are …

  • Generic blog/forum/wiki spam filter system – I don’t know that I’ll get to something totally awesome and generic/pluggable, but enough to put some new modules on CPAN and improve Silki’s spam filtering.
  • Moose Class Day Two
  • YAPC 2012 in Minneapolis – surprisingly, I feel like this is one of the most tractable items. It’s a lot of work, but I know exactly what goes into it. Also, this is the only project where I already have a competent, enthusiastic co-conspirator lined up.

The items I most wish I would do are …

  • Finish my donor/volunteer management CRM – I have dreams of turning it into a SaaS business, but I’m finding it hard to motivate myself for some reason.
  • CPAN Search $NEXT – I think this would be great for the Perl community.

I’ve also bounced around the idea of trying to get funding for some of this work via Kickstarter/TPF grants/international ponzi schemes/Soylent Green but I’m not sure if that will ever happen.

I’ve also left out a lot of things not related to programming, including writing a novel, getting back to learning Chinese, learning to play guitar, and running for city council.