Benchmarking Versus Profiling

First, here's the tl;dr summary ... Benchmarking is for losers, Profiling rulez!

I've noticed a couple blog entries in the Planet Perl Iron Man feed discussing which way of stripping whitespace from both ends of a string is fastest.

Both of these entries discuss examples of benchmarking. Programmers love benchmarks. After all, it's a great chance to whip out one's performance-penis and compare sizes, trying to come up with the fastest algorithm.

Unfortunately, this is pointless posturing. Who cares that one version of a strip-whitespace operation is three times faster than another? The important question is whether the speed matters.

Until you answer that question, all the benchmarking in the world won't help you, and that brings us to profiling.

Profiling is a lot harder than benchmarking, which may be why people talk about it less often. Profiling doesn't compare multiple versions of the same operation. Instead, it tells us where the slowest parts of our code base are.

In order to make profiling useful, we need to write code that simulates typical end user use of the code we're profiling. Then we run that code under a profiler, and we know what's worth optimizing.

Once we know that, then we can start speeding up our code. At this point, benchmarking might be handy. If, for example, on some crazy bizarro world, our program spent a lot of its runtime trimming whitespace from strings, we could benchmark different approaches, and use the fastest.
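For what it's worth, here is roughly what such a benchmark looks like with the core Benchmark module. This is only a sketch. The two trimming subs and their names are my own illustrations, not the code from the blog entries mentioned above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Two common ways to trim whitespace from both ends of a string.
# The sub names are illustrative only.
sub trim_two_regexes {
    my ($str) = @_;
    $str =~ s/^\s+//;
    $str =~ s/\s+$//;
    return $str;
}

sub trim_alternation {
    my ($str) = @_;
    $str =~ s/^\s+|\s+$//g;
    return $str;
}

my $input = "   some text with padding \t\n";

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(
    -1,
    {
        two_regexes => sub { trim_two_regexes($input) },
        alternation => sub { trim_alternation($input) },
    }
);
```

The table this prints only tells you which version is faster in isolation. It says nothing about whether trimming matters to your program's total runtime.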

Of course, in the real world, this will never be the slowest thing your program is doing. The slowest parts of a program are usually the parts that do IO, such as reading files, talking to a DBMS, or making network calls. If this isn't the case, we may be operating on a lot of data in memory with some sort of non-trivial algorithm, and that's the slowest part.

Either way, without profiling, benchmarking is just a pointless distraction.

Of course, I'd be remiss if I didn't point out that Perl has an absolutely fantastic profiler available these days, Devel::NYTProf. It actually works (no segfaults!), and produces fantastically useful reports, so use it.
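As a minimal sketch of the workflow, assume a driver script called profile_me.pl (a made-up name) that exercises your code the way a typical user would. The workload below is a toy stand-in. The two commands in the comments are the usual way to run Devel::NYTProf and build its HTML report.

```perl
#!/usr/bin/env perl
# Hypothetical driver script, saved as profile_me.pl. Run it under the profiler:
#
#   perl -d:NYTProf profile_me.pl
#   nytprofhtml    # turns the nytprof.out data file into an HTML report
use strict;
use warnings;

# Toy stand-in for "typical end user use": count words in some text records.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

sub word_counts {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        $count{ lc $_ }++ for split /\s+/, trim($line);
    }
    return \%count;
}

my @lines  = map {"  line $_ has some words in it  "} 1 .. 10_000;
my $counts = word_counts(@lines);
printf "%d distinct words\n", scalar keys %$counts;
```

The report shows time spent per sub and per line, which is what tells you whether something like trim() is worth benchmarking at all.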

Testing a Database-Intensive Application

If you've been bitten by the testing bug, you've surely encountered the problem of testing a database-intensive application. The problem this presents isn't specific to SQL databases, nor is it just a database problem. Any data-driven application can be hard to test, regardless of how that data is stored and retrieved.

The problem is that in order to test your code, you need data that at least passably resembles data that the app would work with in reality. With a complex schema, that can be a lot of data spread out across many tables. I often find that trying to test each class in isolation becomes very difficult, since the data is not confined to one class.

For example, the app I'm working on now is a wiki. I'm trying to test the Page class, but that involves interactions with many tables. Pages have revisions, they have links to other pages, to files, and to not-yet-created pages. Pages also belong to a wiki, and are created by a user. To test page creation, I need to already have a wiki to add the page to, and a user to create the page.

There are various solutions to this problem, all of which suck in different ways.

You can try mocking out the database entirely. I've used DBD::Mock for this, but I've never been happy with it. DBD::Mock has one of the most difficult to use APIs I've ever encountered. Also, DBD::Mock doesn't really solve the fundamental problem. I still have to seed all the related data for a page. I'd even go so far as to say that DBD::Mock makes things worse. Because inserts don't actually go anywhere, I have to re-seed the mock handle for each test of a SELECT, and since a single method may make multiple SELECT calls, I have to work out in advance what each method will select and seed all the data in the right order!

My experience with DBD::Mock has largely been that the test code becomes so complex and fragile that maintaining it becomes a huge hassle. The test files become so full of setup and seeding that the actual tests are lost.

I wrote Fey::ORM::Mock to help deal with this, but it only goes so far. It partially solves the problem with DBD::Mock's API, but I still have to manage the data seeding, and that is still fragile and complicated.

The other option is to just use a real DBMS in your tests. This has the advantage of actually working like the application. It also helps expose bugs in my schema definition, and lets me test triggers, foreign keys, and so on. This approach has several huge downsides, though. I have to manage (re-)creating the schema each time the tests run, and it will be much harder for others to run my tests on their systems. Also, running the tests can be rather slow.

For the app I'm working on I've decided to mostly go the real DBMS route. At least this way the tests will be very useful to me, and anyone else seriously hacking on the application. I can isolate the minimal data seeding in a helper module, and the test files themselves will be largely free of cruft. Making it easier to write tests also means that I'll write more of them. When I was using DBD::Mock, I found myself avoiding testing simply because it was such a hassle!
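Here is a rough sketch of what that kind of seeding helper might look like. Everything in it is an assumption for illustration: the table and column names are invented, and it uses an in-memory SQLite database (via DBD::SQLite) only so the example runs anywhere. A real suite would connect to the same DBMS the application uses.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: create the schema and seed the minimum data a Page
# test needs (one wiki and one user). All names here are made up.
sub seeded_test_dbh {
    # In-memory SQLite is a stand-in for the application's real DBMS.
    my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
        { RaiseError => 1, PrintError => 0, AutoCommit => 1 } );

    $dbh->do('PRAGMA foreign_keys = ON');
    $dbh->do(
        q{CREATE TABLE wiki (wiki_id INTEGER PRIMARY KEY, title TEXT NOT NULL)}
    );
    $dbh->do(
        q{CREATE TABLE app_user (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL)}
    );
    $dbh->do(
        q{
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            wiki_id INTEGER NOT NULL REFERENCES wiki (wiki_id),
            user_id INTEGER NOT NULL REFERENCES app_user (user_id),
            title   TEXT    NOT NULL
        )
    }
    );

    $dbh->do( 'INSERT INTO wiki (wiki_id, title) VALUES (?, ?)',
        undef, 1, 'Test Wiki' );
    $dbh->do( 'INSERT INTO app_user (user_id, username) VALUES (?, ?)',
        undef, 1, 'tester' );

    return $dbh;
}

# A test file can now get straight to the point.
my $dbh = seeded_test_dbh();
$dbh->do( 'INSERT INTO page (wiki_id, user_id, title) VALUES (?, ?, ?)',
    undef, 1, 1, 'Front Page' );
my ($pages)
    = $dbh->selectrow_array( 'SELECT COUNT(*) FROM page WHERE wiki_id = ?',
    undef, 1 );
print "pages in test wiki: $pages\n";
```

With the setup hidden in one sub, each test file only contains the rows that matter for that test.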

Some people might want to point out fixtures as a solution. I know about those, and that's basically what I'm using now, except that there's only one fixture for now, a minimally populated database. And of course, fixtures still don't fix the problems that come with the tests needing to talk to a real DBMS.

I am going to make sure that tests which don't hit the database at all can be run without connecting to a DBMS. That way, at least a subset of the tests can be run everywhere.
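One way to arrange this, sketched with Test::More's SKIP blocks. The MYAPP_TEST_DSN environment variable and the title_to_uri_path sub are hypothetical names I made up for the example.

```perl
use strict;
use warnings;
use Test::More;

# MYAPP_TEST_DSN is a hypothetical environment variable naming the test database.
my $dsn = $ENV{MYAPP_TEST_DSN};

# Pure-Perl logic: this runs everywhere, with no DBMS needed.
sub title_to_uri_path {
    my ($title) = @_;
    ( my $path = $title ) =~ s/\s+/_/g;
    return $path;
}

is( title_to_uri_path('Front Page'), 'Front_Page',
    'title converts to a URI path' );

SKIP: {
    skip 'set MYAPP_TEST_DSN to run the database tests', 1 unless $dsn;

    # Only load DBI and connect when a test database was provided.
    require DBI;
    my $dbh = DBI->connect( $dsn, undef, undef, { RaiseError => 1 } );
    ok( $dbh->ping, 'connected to the test database' );
}

done_testing();
```

Someone without a database still gets a passing run of the pure-Perl tests, and the skipped tests are reported as skipped rather than failed.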

Are there any better solutions? I often feel like programming involves spending an inordinate amount of time solving non-essential problems. Where's my silver bullet?

The Purpose of Automated Tests

Recently, there was a question on Stack Overflow that asked whether or not one should test that Moose generates accessors correctly.

Here's an example class:

package Process;

use Moose;

has pid => (
    is       => 'ro',
    isa      => 'Int',
    required => 1,
);

has stdout => (
    is  => 'rw',
    isa => 'FileHandle',
);

Given that class definition, is there any value to writing tests like this?

can_ok( 'Process', 'new' );
can_ok( 'Process', 'pid' );
can_ok( 'Process', 'stdout' );

throws_ok { Process->new() } qr/.../, 'Process requires a pid';

Let's look at why automated tests are useful.

First, they give us some assurance that the code we wrote does what we expect.

Second, tests protect us from breaking code as we change it. As we refactor, fix bugs, or add new features, we want to make sure that all the existing code continues to work.

Third, the tests can provide some hints to future readers of our code about the APIs of the code base.

So back to our original question, do we need to test Moose-generated code?

The tests seen above add absolutely nothing that isn't already tested by Moose itself.

If the tests don't test anything new, then they can't be giving us any assurance about our code. Instead, they're giving us assurance about Moose itself.

Let's assume that Moose is itself well-tested. If it isn't, why are you using it? There is no point in adopting a dependency on fragile code that you don't trust. If you want to improve Moose's reliability, the way to do that is by working on Moose itself, not by testing Moose in your application's test suite.

Do these tests protect us from breaking code? Not really. If we change the Process class so that it no longer has the stdout attribute, the test will fail. But if we made that change, surely it was intentional. So now our tests are failing because we made an intentional change.

But what if other code in our code base expects the stdout attribute to exist? As long as that code is tested, we will find this problem quickly enough. If the stdout attribute is only ever referenced in the test up above, then what purpose does it serve?

Finally, the test above provides no guidance to future readers. The code in the Process package provides more documentation than the test code, and if the module also has POD, that will provide even more documentation. The test doesn't show how the code is used. It just provides another way to describe what the module is, a way that's inferior to the Moose-based declarations or POD.

However, don't confuse the tests above with testing code that you write. For example, if you create a new type with a custom constraint and coercion, you should definitely test that type. The Moose test suite obviously doesn't test your specific type, it just tests that new types can be created.
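To make that concrete, here is a sketch of a test for a custom type. The type name MyApp::TrimmedStr and its rules are invented for this example. The point is that the constraint and the coercion are code I wrote, so they deserve tests.

```perl
use strict;
use warnings;
use Test::More;
use Moose::Util::TypeConstraints;

# Hypothetical type: a string with no leading or trailing whitespace.
subtype 'MyApp::TrimmedStr',
    as 'Str',
    where { !/^\s/ && !/\s$/ },
    message {'String has leading or trailing whitespace'};

# Coerce any string by trimming a copy of it.
coerce 'MyApp::TrimmedStr',
    from 'Str',
    via { my $s = $_; $s =~ s/^\s+|\s+$//g; $s };

my $type = find_type_constraint('MyApp::TrimmedStr');

ok( $type->check('foo bar'),    'trimmed string passes' );
ok( !$type->check(' foo bar'),  'leading whitespace fails' );
ok( !$type->check("foo bar\n"), 'trailing newline fails' );
is( $type->coerce("  foo bar \n"), 'foo bar', 'coercion trims both ends' );

done_testing();
```

None of these assertions duplicate Moose's own test suite. They test my regexes, which is exactly where my bugs would be.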

So the answer is no, don't bother with tests like the ones above. Test new code you create, not that Moose is doing what you asked it to do.

Benchmarking MooseX::Method::Signatures

I've been seeing some talk about MooseX::Method::Signatures and its speed. Specifically, Ævar Arnfjörð Bjarmason says that MXMS is about 4 times slower than a regular method call. He determined this by comparing two different versions of a large program, Hailo. This is interesting, but I think a more focused benchmark might be useful.

Specifically, I'm interested in comparing MXMS to something else that does similar validation. One of the main selling points of MXMS is its excellent integration of argument type checking, so it makes no sense to compare MXMS to plain old unchecked method calls. Therefore, I made a benchmark that compares MXMS to MooseX::Params::Validate. Both MXMS and MXPV provide argument type checking using Moose types. That should eliminate the cost of doing type checking as a variable. If you don't care about type checking, you really don't need MXMS (or MXPV).

The benchmark has two classes with semantically identical methods doing argument validation. One uses MXMS and the other MXPV. All method calls are wrapped in eval since a validation failure causes an exception. I also tested both success and failure cases. My experience with Params::Validate tells me that there's a big difference in speed between success and failure, and the results bear that out.

Here's what the benchmark came up with:

               Rate   MXMS failure   MXPV failure   MXMS success   MXPV success
MXMS failure  262/s             --           -41%           -81%           -94%
MXPV failure  448/s            71%             --           -68%           -90%
MXMS success 1393/s           431%           211%             --           -69%
MXPV success 4545/s          1634%           915%           226%             --

First, as I pointed out, there's a big difference between success and failure. I can only assume that throwing an exception is expensive in Perl. Second, the difference between MXMS and MXPV is much greater in the success case. This makes sense if simply throwing an exception is costly.

It seems that in the success case, MXPV is about 3 times faster than MXMS. I think the success case is most important, since we probably don't expect a lot of validation failures in our production code.
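If you want to check how the percentages in the table relate to the rates, the arithmetic is simple. This snippet just reuses the two success-case rates from the table. The percent_faster sub is my own helper, not part of Benchmark.

```perl
use strict;
use warnings;

# Rates (calls per second) copied from the benchmark table above.
my %rate = ( 'MXMS success' => 1393, 'MXPV success' => 4545 );

# The table reports how much faster the row is than the column, as a percentage.
sub percent_faster {
    my ( $row, $col ) = @_;
    return 100 * ( $rate{$row} - $rate{$col} ) / $rate{$col};
}

my $ratio = $rate{'MXPV success'} / $rate{'MXMS success'};

# Prints: MXPV is 3.3x the speed of MXMS (226% faster)
printf "MXPV is %.1fx the speed of MXMS (%.0f%% faster)\n",
    $ratio, percent_faster( 'MXPV success', 'MXMS success' );
```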

Benchmark code

Moose Class in Minneapolis - Friday, February 5, 2010

I'm doing my one-day Moose class here in Minneapolis again, as part of Frozen Perl. The class is even cheaper this time, as a special deal for the workshop. It's a mere $100 per person!

The class is an interactive course, meaning you bring your laptop and do exercises in between lecture sections. It covers all the basics of Moose, and even gets into some of the more advanced bits.

Here's what students who took the class back in September said:

  • "The exercises were awesome! I really love how they were set up as test cases--there is really no other way to give this much feedback!" - wu
  • "Using the test framework to drive the exercises was brilliant, providing feedback and building confidence. The exercises weren't too difficult, and the detailed, step-by-step instructions helped to make them friendly." - Ken O

You can sign up and pay for the class at the Frozen Perl site. I'd also encourage you to join us for the workshop the next day. There's a good slate of presentations scheduled, and it should be a lot of fun.

Finally, on Sunday, February 7, we'll be having a hackathon. See the Frozen Perl site for details.

Project Stack Push/Pop

I have an amazing ability to get distracted from my goals when programming. Sometimes it feels like each project I work on is just the latest distraction from what I was working on. Usually this happens because I'm happily hacking away on project A until I hit a roadblock. That roadblock might be a missing feature in a module I'm using, or maybe a module I need that doesn't exist. Sometimes the roadblock is a gap in my understanding. I don't know how to do what I want in a satisfactory way, so I need to learn more about a tool, or just experiment with ways to approach the problem.

I push a new project onto the stack and off I go. I don't know how deep the stack is now. There are probably items that were on there long ago that have already been forgotten.

Here's an example of where I am in my stack right now:

  • Working on [VegGuide](http://www.vegguide.org) and other things, I've become thoroughly sick of Alzabo ...

    • so I play with DBIx::Class but it doesn't grab me ...

    • I write Fey::ORM ...

  • and back to VegGuide, but I really don't like a lot of things about it, I need to explore new ways of writing webapps ...

    • I start working on a new webapp, a donor/volunteer management app for nonprofits. By building an app from scratch I can get a better understanding of how I want to write modern webapps. But this type of app is rather complicated so ...

      • having been unhappy with MojoMojo, I start working on a wiki designed for non-geeks (target audience, my animal rights group). I'm using Markdown as the wikitext language, but the existing Markdown tools in Perl don't do what I want so ...

      • now I'm running into major issues with HTML::WikiConverter, which I use to turn GUI-generated HTML back to Markdown. The temptation to fix/rewrite is strong ...

        • sigh

Of course, it's not really quite as simple as this might imply. It's not like the only reason for working on a donor management application is to explore webapps. It's useful all on its own.

Even scarier is the fact that there are other unrelated projects that keep trying to intrude, like making ACT run on mod_perl2 so I can upgrade my server from Dapper to Hardy. I've managed to put that one off for a while, at least, but it keeps nagging at me.

My capacity for adding projects to my stack is simultaneously impressive and disturbing. There's no problem so compelling that it can't be superseded by a new problem uncovered in the course of solving the original.

What's the Point of Markdent?

Markdent is my new event-driven Markdown parser toolkit, but why should you care?

First, let's talk about Markdown. Markdown is yet another wiki-esque format for marking up plain text. What makes Markdown stand out is its emphasis on usability and "natural" usage. Its syntax is based on things people have been doing to "mark up" plain text email for years.

For example, if you wanted to list some items in a plain text email, you'd write something like:

* List item 1
* List item 2
* List item 3

Well, this is how it works in Markdown too. Want to emphasize some text? *Wrap it in asterisks* or _underscores_.

So why do you need an event-driven parser toolkit for dealing with Markdown? CPAN already has several modules for dealing with Markdown, most notably Text::Markdown.

The problem with Text::Markdown is that all you can do with it is generate HTML, but there's so much more you could do with a Markdown document.

If you're using Markdown for an application (like a wiki), you may need to generate slightly different HTML for different users. For example, maybe logged-in users see documents differently.

But what if you want to cache parsing in order to speed things up? If you're going straight from Markdown to HTML, you'd need to cache the resulting HTML for each type of user (or even for each individual user in the worst case).

With Markdent, you can cache an intermediate representation of the document as a stream of events. You can then replay this stream back to the HTML generator as needed.
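To illustrate the idea without leaning on Markdent's actual classes, here is a toy sketch. The event hashes and the events_to_html sub are invented for this example and are not Markdent's API. Only Storable's freeze and thaw are real.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Toy event stream. A real parser would emit these while reading the document.
my @events = (
    { type => 'start_paragraph' },
    { type => 'text', text => 'See the ' },
    { type => 'link', uri  => '/wiki/FrontPage', text => 'front page' },
    { type => 'end_paragraph' },
);

# Hypothetical HTML generator: the same events produce different HTML
# depending on who is looking. (No HTML escaping, to keep the sketch short.)
sub events_to_html {
    my ( $events, %opt ) = @_;
    my $html = '';
    for my $e (@$events) {
        if    ( $e->{type} eq 'start_paragraph' ) { $html .= '<p>' }
        elsif ( $e->{type} eq 'end_paragraph' )   { $html .= '</p>' }
        elsif ( $e->{type} eq 'text' )            { $html .= $e->{text} }
        elsif ( $e->{type} eq 'link' ) {
            my $class = $opt{logged_in} ? 'member-link' : 'public-link';
            $html .= qq{<a class="$class" href="$e->{uri}">$e->{text}</a>};
        }
    }
    return $html;
}

# Parse once, then cache the frozen stream (in a real app: a file, memcached, ...).
my $cached = freeze( \@events );

# Later: thaw and replay for each kind of user without re-parsing.
print events_to_html( thaw($cached), logged_in => 0 ), "\n";
print events_to_html( thaw($cached), logged_in => 1 ), "\n";
```

There is one cache entry per document, not one per kind of user, because the per-user differences are applied at replay time.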

What's the Impact of Caching?

Here's a benchmark comparing three approaches.

  1. Use Markdent to parse the document and generate HTML from scratch each time.
  2. Use Text::Markdown
  3. Use Markdent to parse the document once, then use Storable to store the event stream. When generating HTML, thaw the event stream and replay it back to the HTML generator.

                              Rate parse from scratch Text::Markdown replay from captured events
parse from scratch          1.07/s                 --           -67%                        -83%
Text::Markdown              3.22/s               202%             --                        -48%
replay from captured events 6.13/s               475%            91%                          --

This benchmark is included in the Markdent distro. One feature to note about this benchmark is that it parses 23 documents from the mdtest test suite. Those documents are mostly pretty short.

If I benchmark just the largest document in mdtest, the numbers change a bit:

                              Rate parse from scratch Text::Markdown replay from captured events
parse from scratch          2.32/s                 --           -58%                        -84%
Text::Markdown              5.52/s               138%             --                        -63%
replay from captured events 14.8/s               538%           168%                          --

Markdent probably speeds up on large documents because each new parse requires constructing a number of objects. With 23 documents we construct those objects 23 times. When we parse one document the actual speed of parsing becomes more important, as does the speed of not parsing.

What Else?

But there's more to Markdent than caching. One feature that a lot of wikis have is "backlinks", which is a list of pages linking to the current page. With Markdent, you can write a handler that only looks at links. You can use this to capture all the links and generate your backlink list.
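In spirit, a links-only handler is tiny. The sketch below uses a made-up LinkCollector class and plain hashes as events, so treat it as an illustration of the shape rather than Markdent's real handler interface.

```perl
use strict;
use warnings;

# Hypothetical handler class: it receives every event but only records links.
package LinkCollector;

sub new { return bless { links => [] }, shift }

sub handle_event {
    my ( $self, $event ) = @_;
    push @{ $self->{links} }, $event->{uri} if $event->{type} eq 'link';
    return;
}

sub links { return @{ $_[0]{links} } }

package main;

# Toy event stream standing in for what a parser would produce.
my @events = (
    { type => 'text', text => 'Compare ' },
    { type => 'link', uri  => '/wiki/Benchmarking' },
    { type => 'text', text => ' with ' },
    { type => 'link', uri  => '/wiki/Profiling' },
);

my $collector = LinkCollector->new;
$collector->handle_event($_) for @events;

# Each collected URI is a page that should list the current page as a backlink.
print "$_\n" for $collector->links;
```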

How about a full text search engine? Maybe you'd like to give a little more weight to titles than other text. You can write a handler which collects title text and body text separately, then feed that into your full text search tool.

There's a theme here, which is that Markdent makes document analysis much easier.

That's not all you can do. What about a Markdown-to-Textile converter? How about a Markdown-to-Markdown converter for canonicalization?

Because Markdent is modular and pluggable, if you can think of it, you can probably do it.

I haven't even touched on extending the parser itself. That's still a much rougher area, but it's not that hard. The Markdent distro includes an implementation of a dialect called "Theory", based on some Markdown extension proposals by David Wheeler.

This dialect is implemented by subclassing the Standard dialect parser classes, and providing some additional event classes to represent table elements.

I hope that other people will pick up on Markdent and write their own dialects and handlers. Imagine a rich ecosystem of tools for Markdown comparable to what's available for XML or HTML. This would make an already useful markup language even more useful.

Want Good Tools? Break Your Problems Down

I've been working on a new project recently, Markdent, an event-driven Markdown parser toolkit.

Why? Because the existing Perl Markdown tools just aren't flexible enough. They bundle up Markdown parsing with HTML conversion all in one API, and I need to do more than convert to HTML.

This sort of inflexibility is quite common when I look at CPAN libraries. Looking back at the Perl DateTime Project, one of my big problems with all the other date/time modules on CPAN was their lack of flexibility. If I could have added good time zone handling to an existing project way back then, I probably would have, but I couldn't, and the Perl DateTime Project was born.

If there is one point I would hammer home to all module authors, it would be "solve small problems". I think that the failure to do this is what leads to the inflexibility and tight coupling I see in so many CPAN distributions.

For example, I imagine that in the date/time world some people thought "I need a bunch of date math functions" or "I need to parse lots of possible date/time strings". Those are good problems to solve, but by going straight there you lose any hope of a good API.

Similarly, with Markdown parsers, I imagine that someone thought "I'd like to convert Markdown to HTML", so they wrote a module that does just that.

I can't really fault their goal-focused attitudes. Personally, I sometimes find myself getting lost in digressions. For example, I'm currently writing a webapp with the goal of exploring techniques I want to use in another webapp!

But there's a lot to be said for not going straight to your goal. I'm a big fan of breaking a problem down into smaller pieces and solving each piece separately.

For example, when it comes to Markdown, there are several distinct steps on the way from Markdown to HTML. First, we need to be able to parse Markdown. Parsing Markdown is a step of its own. Then we need to take the results of parsing and turn it into HTML.

If we think of the problem as consisting of these pieces, a clear and flexible design emerges. We need a tool for parsing Markdown (a parser). Separately, we need a tool for converting parse results to HTML (a converter or parse result handler).

Now we need a way to connect these pieces. In the case of Markdent, the connection is an event-driven API where each event is an object and the event receiver conforms to a known API.

It's easy to put these two things together and make a nice simple Markdown-to-HTML converter.

But since I took the time to break the problem down, you can also do other things with this tool. For example, I can do something else with our parse results, like capture all the links or cache the intermediate result of the parsing (an event stream).
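Here is a toy version of that design, just to show the shape. All the Toy:: class names and the handle_event method are invented for this sketch and are much simpler than Markdent's real classes. The parser knows nothing about HTML, and either handler can be plugged into it.

```perl
use strict;
use warnings;

# Toy event object. Class and method names here are illustrative only.
package Toy::Event;
sub new  { my ( $class, %p ) = @_; return bless {%p}, $class }
sub type { $_[0]{type} }
sub text { $_[0]{text} }

# Toy parser: knows only about "* item" lines; everything else is plain text.
package Toy::Parser;
sub new { my ( $class, %p ) = @_; return bless {%p}, $class }

sub parse {
    my ( $self, $markdown ) = @_;
    for my $line ( split /\n/, $markdown ) {
        my $event
            = $line =~ /^\* (.+)/
            ? Toy::Event->new( type => 'list_item', text => $1 )
            : Toy::Event->new( type => 'text',      text => $line );
        $self->{handler}->handle_event($event);
    }
    return;
}

# Handler 1: generate (very naive) HTML.
package Toy::Handler::HTML;
sub new { return bless { html => '' }, shift }

sub handle_event {
    my ( $self, $e ) = @_;
    my $tag = $e->type eq 'list_item' ? 'li' : 'p';
    $self->{html} .= "<$tag>" . $e->text . "</$tag>\n";
    return;
}
sub html { $_[0]{html} }

# Handler 2: same events, different purpose (count list items).
package Toy::Handler::CountItems;
sub new          { return bless { count => 0 }, shift }
sub handle_event { $_[0]{count}++ if $_[1]->type eq 'list_item'; return }
sub count        { $_[0]{count} }

package main;

my $doc = "Shopping\n* apples\n* pears";

my $html = Toy::Handler::HTML->new;
Toy::Parser->new( handler => $html )->parse($doc);
print $html->html;

my $counter = Toy::Handler::CountItems->new;
Toy::Parser->new( handler => $counter )->parse($doc);
print 'list items: ', $counter->count, "\n";
```

Swapping the handler changes what the tool does without touching the parser, which is the whole point of splitting the problem up.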

And since the HTML generator is a small piece, I can also reuse that. Now that I've cached our event stream, I can pull it from the cache later and use it to generate HTML without re-parsing the document. In the case of Markdent, using a cached parse result to generate HTML was about six times faster in my benchmarks!

Because Markdent has small pieces, there are all sorts of interesting ways to reuse them. How about a Markdown-to-Textile converter? Or how about adding a filter which doesn't allow any raw HTML?

We've all heard that loose coupling makes good APIs. But just saying that doesn't really help you understand how to achieve loose coupling. Loose coupling comes from breaking a big problem down into small independent problems.

As you solve each problem, think about how those solutions will communicate. Design a simple API or communications protocol. You'll know the API is simple enough if you can imagine easily swapping out each piece of the problem with another API-conformant piece. A loosely coupled API is one that makes replacing one end of the API easy.

And best of all, when you break problems down into loosely coupled pieces, you'll make it much easier for others to contribute to and extend your tools. Moose is a great example of this. Its fancy sugar layer exists on top of loosely coupled units known as the metaclass protocol. By separating the sugar from the underlying pieces, we've enabled others to create a huge number of Moose extensions.

The same goes for the Perl DateTime Project. I wrote the core pieces, but there have been many, many great contributions. This wealth of extensions wouldn't be possible without the loosely coupled core pieces and a well-defined API for communicating between components.

I did my Outreach for Animals Week leafleting today, and it went surprisingly well.

I say surprisingly, because I thought that the weather was conspiring against me, but I was wrong. It was raining outside, but it turns out that the University of Minnesota does allow leafleting inside academic buildings (but not inside the student union). Unny suggested I try either Blegen or Willey Hall on the West Bank. I went to Willey near the Gopher Express.

I had thought that traffic would just be too slow for this to be useful, but I was wrong. In fact, compared to my experiences leafleting outside on the UMN campus, this was actually a better location. I think the weather may have driven more people inside, and I picked a spot that was at a good crossroads.

Surprisingly, no one "official" came to tell me I couldn't do this. I had asked Unny to print out a copy of the UMN's policy, which I brought with me. I was sure I'd have to show it to someone, but apparently not.

All in all, I handed out at least 450 leaflets, and maybe more than 500, in just about 2.5 hours.

After doing this, I had a few observations for future leafleting ...

With just one person, you really don't need a high traffic area. I heard a lot more refusals during peak traffic. I'm not sure how this breaks down numerically, but my guess is that I actually gave out fewer leaflets per minute during peak traffic.

The fact that people are less receptive during peak traffic makes sense. The busy times where I was located were between classes. Most of those folks are heading to their next class. They don't have free time to think about taking a leaflet. In addition, because I was at the top of the stairway, people probably felt pressure to keep moving rather than block the flow of traffic.

In contrast, the rest of the time was great. The traffic was low, but there were very few totally dead times. Instead, I'd see maybe 1-5 people per minute. This is perfect, since I was able to approach almost every one of them.

Even better, the slower traffic let me approach people in a more relaxed and friendly manner. People seemed most receptive when I greeted them, waited for them to make eye contact and respond, and only then offered the leaflet. I'm no psychologist, but I think the initial exchange of pleasantries probably helps humanize me in their mind, and gives them some sort of investment in our contact.

By contrast, if I said "hi" followed immediately by an offer, or I just offered the leaflet with my usual "information to help animals" phrase, I become just "the leafleter". The recipient hasn't invested anything yet, and they can say no or ignore me easily.

Of course, the downside to this sort of slow but steady traffic is that it really doesn't work well for multiple leafleters. I was joined by another person later in the morning, and there really wasn't enough traffic for both of us to be there most of the time, so she ended up going to a different spot.

If I was trying to find a good event for a group, I'd prefer something like leafleting the end of a concert. The traffic is incredibly heavy, and you can actually make use of a decent size group.

Overall, I'm pretty happy with how this went. I'd like to do more outreach like this, but I seem to end up spending all my volunteer time on fundraising, tech, finances, and event planning. Those are rewarding too, but it's nice to go out and do something simple.

I'd really like to thank everyone who sponsored me in this event. You helped raise a good chunk of change for a cause I dearly believe in.

Support Me in a Leaflet-a-thon

First off, there's no technical content in this blog post. Sorry.

I'll be participating in a leaflet-a-thon next week with my animal advocacy group, Compassionate Action for Animals. This is like a walkathon, but with less walking and more handing stuff out.

To those within the light of my pixels, if you'd like to support me, you can do so by making a donation online. Even if you don't particularly support the cause, please consider doing this to support me. If you've used a module I've written, you could say thanks by making a donation.

Thanks,

-dave
