It’s been a few weeks since my last post on this project. I was distracted by Go reflection and security issues with Perl IP address modules. But now I can get back to my Postgres SQL pretty printer project[^1].
One of the challenges for this project has been figuring out the best way to test it. I tried a standard unit test approach before giving up and settling on integration testing instead, so in this post I’ll talk about what I tried and what I ended up with.
Series Links
- Part 1: Introduction to the project and generating Rust with Perl
- Part 1.5: More about enum wrappers and Serde’s externally tagged enum representation
- Part 2: How I’m testing the pretty printer and how I generate tests from the Postgres docs
Unit Tests
Whenever possible, I prefer to write unit tests for my code. Let’s define “unit test”, since people use that term all the time, but with slightly different definitions.
When I say “unit tests” I’m referring to tests that test the functionality of a single module[^2], focusing on its public API. If that module integrates with other modules, unit tests may require mocking. Some will say it always requires mocking, but I think that is often a waste of effort, so I avoid mocking in many cases. And though I said “focusing on its public API”, sometimes I will test private functions directly, if they are complex enough to warrant this and doing so is not too painful.
Unit tests are great when you can write them. They let you focus on one small piece of a code base at a time to make sure that it does what you expect. Good unit tests cover the standard use cases, corner cases (zero values, extreme values, etc.), various permutations of argument combinations, and error handling.
If you unit test all of your modules, you know that each module probably does what you want it to. This doesn’t tell you whether they all work together properly, but it’s a good start.
The bulk of the (non-generated) code for pg-pretty lives in one library crate named `formatter`, and initially this was all in the crate root’s lib.rs file, though I’m working on rewriting this in a branch named `new-formatter` (no links because the branch will be deleted after I merge it to master).
The original formatter code[^3] has many functions, but the vast majority of them work the same way. They take a struct[^4] or slice of structs defined in the parser crate’s `ast` module and return a string representing the formatted SQL for that piece of the AST[^5].
Here’s a simple function from that code:
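In spirit, it looks something like this sketch. The struct and field names here are guesses based on the Postgres `RangeVar` node; the actual pg-pretty code differs:

```rust
// A sketch, not the real pg-pretty code: RangeVar's fields are guessed
// from the Postgres C struct of the same name.
struct Alias {
    aliasname: String,
}

struct RangeVar {
    catalogname: Option<String>,
    schemaname: Option<String>,
    relname: String,
    alias: Option<Alias>,
}

fn format_range_var(range_var: &RangeVar) -> String {
    // Join the optional database and schema names with the relation name,
    // then append the alias, if any.
    let mut name = String::new();
    if let Some(db) = &range_var.catalogname {
        name.push_str(db);
        name.push('.');
    }
    if let Some(schema) = &range_var.schemaname {
        name.push_str(schema);
        name.push('.');
    }
    name.push_str(&range_var.relname);
    if let Some(alias) = &range_var.alias {
        name.push_str(" AS ");
        name.push_str(&alias.aliasname);
    }
    name
}

fn main() {
    let rv = RangeVar {
        catalogname: Some("some_db".to_string()),
        schemaname: Some("some_schema".to_string()),
        relname: "some_table".to_string(),
        alias: None,
    };
    println!("{}", format_range_var(&rv)); // some_db.some_schema.some_table
}
```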
A `RangeVar` is a struct representing a name in a `FROM` clause. This will always have a `relname` (a table, view, or subselect), with optional database and schema names, like `some_db.some_schema.some_table`. In addition, it may have an alias.
This is a relatively simple example, as this type of struct doesn’t contain any complex structs itself. Other functions, like the one for formatting a select statement, mostly consist of calls to other formatting functions:
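A heavily simplified sketch of that shape follows. The real function takes the parser crate’s `SelectStmt`; the types and helper names here are stand-ins for illustration:

```rust
// Stand-in types: the real SelectStmt holds Lists of ast Nodes, not strings.
struct SelectStmt {
    target_list: Vec<String>,     // stand-in for a List of target Nodes
    from_clause: Vec<String>,     // stand-in for RangeVars, joins, etc.
    where_clause: Option<String>, // stand-in for a Box<Node> expression
}

fn format_target_list(targets: &[String]) -> String {
    targets.join(", ")
}

fn format_from_clause(from: &[String]) -> String {
    format!("FROM {}", from.join(", "))
}

fn format_where_clause(expr: &str) -> String {
    format!("WHERE {}", expr)
}

// Mostly just calls to other formatting functions, one per clause.
fn format_select_stmt(stmt: &SelectStmt) -> String {
    let mut sql = format!("SELECT {}\n", format_target_list(&stmt.target_list));
    sql.push_str(&format_from_clause(&stmt.from_clause));
    if let Some(expr) = &stmt.where_clause {
        sql.push('\n');
        sql.push_str(&format_where_clause(expr));
    }
    sql.push('\n');
    sql
}

fn main() {
    let stmt = SelectStmt {
        target_list: vec!["name".to_string(), "email".to_string()],
        from_clause: vec!["people".to_string()],
        where_clause: Some("age > 42".to_string()),
    };
    print!("{}", format_select_stmt(&stmt));
}
```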
This is an early, incomplete version which doesn’t handle `HAVING` clauses, `LIMIT` clauses, window clauses, locking clauses, `UNION` queries, etc. The point I’m trying to make is that these functions get complex quickly. While breaking them down into lots of smaller functions helps, I’ve ended up with a huge number of small functions.
So how do we test this with unit tests? To do that, we need a way to produce structs from the `ast` module like `RangeVar` or `SelectStmt`. For reference, here’s `RangeVar`:
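A sketch of what the generated struct looks like; the field names follow the Postgres `RangeVar` C struct, and the real generated code may differ:

```rust
// Guessed shapes for the generated ast types, based on the Postgres
// RangeVar C struct.
pub struct Alias {
    pub aliasname: String,
    pub colnames: Option<Vec<String>>,
}

// The generated code wraps enums/structs, as discussed in Part 1.5.
pub struct AliasWrapper {
    pub alias: Alias,
}

pub struct RangeVar {
    pub catalogname: Option<String>,
    pub schemaname: Option<String>,
    pub relname: String,
    pub inh: bool,
    pub relpersistence: char,
    pub alias: Option<AliasWrapper>,
}

fn main() {
    let rv = RangeVar {
        catalogname: None,
        schemaname: None,
        relname: "people".to_string(),
        inh: true,
        relpersistence: 'p',
        alias: None,
    };
    assert_eq!(rv.relname, "people");
}
```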
The `RangeVar` struct only refers to one other `ast` enum or struct, `AliasWrapper`, which in turn contains an `Alias`. But the `SelectStmt` is much more complex[^6]:
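An abbreviated sketch of the struct. The real generated struct has many more fields; these names are guesses based on the Postgres `SelectStmt` node:

```rust
// A Node is an enum with one variant per AST node type (elided here).
pub enum Node {
    Placeholder,
}

// A List is just a Vec of Nodes.
pub type List = Vec<Node>;

pub struct SelectStmt {
    pub distinct_clause: Option<List>,
    pub target_list: Option<List>,
    pub from_clause: Option<List>,
    pub where_clause: Option<Box<Node>>,
    pub group_clause: Option<List>,
    pub having_clause: Option<Box<Node>>,
    pub values_lists: Option<Vec<List>>, // a Vec of Vec<Node>!
    // ... plus window, sort, limit, locking, and set-operation fields
}

fn main() {
    // Even the "empty" select takes a wall of Nones.
    let stmt = SelectStmt {
        distinct_clause: None,
        target_list: None,
        from_clause: None,
        where_clause: None,
        group_clause: None,
        having_clause: None,
        values_lists: None,
    };
    assert!(stmt.values_lists.is_none());
}
```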
Besides having many more fields than `RangeVar`, most of those fields are other types of `ast` nodes. In fact, many of these are a boxed `Node` or a `List`, which is a `Vec<Node>`. One field, `values_lists`, is a `Vec<List>`! A `Node` is an enum which can be any AST node[^7].
So how exactly do we produce the various `SelectStmt` structs that we’d want to feed into `format_select_stmt` for testing? The structs themselves have no constructors. The parser only generates them based on the results of parsing, which is done in C code for which there is no public API. That C code produces a JSON representation of the AST, which we deserialize into structs. So the only way left to do this is to construct them “by hand” with code like this:
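Something like this sketch, with guessed stand-ins for the generated `ast` types included so it compiles on its own:

```rust
// Guessed stand-ins for the generated ast module's types.
struct Alias {
    aliasname: String,
}

struct AliasWrapper {
    alias: Alias,
}

struct RangeVar {
    catalogname: Option<String>,
    schemaname: Option<String>,
    relname: String,
    alias: Option<AliasWrapper>,
}

fn main() {
    // Even this simple struct needs every optional field spelled out.
    let range_var = RangeVar {
        catalogname: None,
        schemaname: Some("some_schema".to_string()),
        relname: "people".to_string(),
        alias: Some(AliasWrapper {
            alias: Alias {
                aliasname: "persons".to_string(),
            },
        }),
    };
    assert_eq!(range_var.relname, "people");
}
```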
This isn’t entirely terrible, but as I noted before, the `RangeVar` struct is one of the simpler structs in the AST. The equivalent for a `SelectStmt` would be absolutely enormous. Even for a `RangeVar`, constantly having to pass mostly `None` for the arguments gets old very fast. To make this simpler, I made a macro:
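A sketch of what such a `range_var!` macro can look like. The real macro covers every combination of optional arguments; only a few representative arms are shown here, and the struct definitions are guessed stand-ins:

```rust
// Guessed stand-ins for the generated ast types.
struct Alias {
    aliasname: String,
}

struct AliasWrapper {
    alias: Alias,
}

struct RangeVar {
    catalogname: Option<String>,
    schemaname: Option<String>,
    relname: String,
    alias: Option<AliasWrapper>,
}

macro_rules! range_var {
    // range_var!("people")
    ($table:tt) => {
        RangeVar {
            catalogname: None,
            schemaname: None,
            relname: $table.to_string(),
            alias: None,
        }
    };
    // range_var!("some_schema", "people")
    ($schema:tt, $table:tt) => {
        RangeVar {
            catalogname: None,
            schemaname: Some($schema.to_string()),
            relname: $table.to_string(),
            alias: None,
        }
    };
    // range_var!("people" AS "persons")
    ($table:tt AS $alias:tt) => {
        RangeVar {
            catalogname: None,
            schemaname: None,
            relname: $table.to_string(),
            alias: Some(AliasWrapper {
                alias: Alias {
                    aliasname: $alias.to_string(),
                },
            }),
        }
    };
}

fn main() {
    let rv = range_var!("people" AS "persons");
    assert_eq!(rv.alias.unwrap().alias.aliasname, "persons");
}
```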
This covers every possible combination of optional arguments, so I could write `range_var!("people")` or `range_var!("people" AS "persons")` or `range_var!("some_schema", "people")` and so on.
This made the tests more concise, but the equivalent macro implementation for a `SelectStmt` might be hundreds or even thousands of lines long.
Giving Up on Unit Tests
I quickly realized that this approach simply wouldn’t scale. The structs that the formatter deals with are complex, there are many of them, and each struct has many possible variations of optional fields, types of contained nodes, and so on.
With my macro-plus-function approach, I’d have to write tens of thousands of lines of code in support of these unit tests. And that code would itself be so complex that it would really demand its own test suite!
But that’s not even the biggest problem. The biggest problem is that because these AST structs are produced by the Postgres parser C code, I really have no idea what all the possibilities for a given struct are. Given that many structs simply contain `Node` or `List` structs, the possibilities are effectively limitless.
So in summary:
- Writing struct generation code would be much more work than writing the formatter.
- The struct generation code would be so complex that it’d require its own test suite.
- There’s no good way for me to know what structs to generate for tests without lots of real-world examples.
That last bullet point mentions “lots of real-world examples”. That’s what I needed. So how to get them? I could scour my own projects for examples, though in many cases I use an ORM, so it’s not a trivial matter of just copying some SQL. I could look for projects on GitHub that use Postgres. Maybe I could look on Stack Overflow.
But finally, I had a good idea[^8]. The Postgres documentation is quite extensive, and it includes many SQL examples! If I could extract those examples, they could form the basis of a test suite.
Integration Tests to the Rescue
Taking a step back, I realized that what I really wanted to test was the formatter as a whole. While unit tests are great, I could greatly simplify my test code by simply comparing SQL statement input to SQL statement output. Any time my output diverged from what I expected, that’s a bug in the formatter (or in my expectations). And if the formatter panics because it can’t handle a particular node[^9], that’s a missing piece of the implementation.
This was yet another case where Perl came in handy. I wrote a quick script to parse the entire documentation tree[^10] and find `programlisting` elements which contained SQL. These are then put into files named after the doc file that contained them, giving me files with content like this:
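The exact file format isn’t reproduced here, so this is an invented illustration of the three sections described below; the `++++` delimiter and the SQL are made up for this example:

```
1
++++
SELECT f.title, d.name
    FROM films f, distributors d
    WHERE f.did = d.did;
++++
???
```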
The first section, `1`, is a test description, to be filled in later by me. The second section is the input, and the third, `???`, is the expected output. These generated files are useful seeds for test cases, though I have to fill in the test name and expected output. An example from an actual test looks like this:
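Again, the real test file isn’t reproduced here; this invented example just shows the shape of a filled-in case, with a description, the input SQL, and the expected formatted output:

```
Simple select with a two-table join in the WHERE clause
++++
SELECT f.title, d.name FROM films f, distributors d WHERE f.did = d.did;
++++
SELECT f.title, d.name
FROM films AS f, distributors AS d
WHERE f.did = d.did
```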
(I will be improving the formatting, because I don’t like the output the original version gives!)
These files are easy to edit, and adding new tests by hand is easy as well.
The test harness simply reads each file, splits it into individual cases, runs the input through the formatter, and compares the result to the expected output.
In order to make failures more understandable, I use `prettydiff`’s `diff_lines` function to compare the expected output to what I actually got. This is quite helpful, but in cases where they differ by whitespace, especially trailing newlines, it’s not as helpful as I’d like.
So I also added some optional debugging output (based on an env var) that shows me each escaped character in the expected versus actual output. This lets me easily see whitespace differences, and newlines are printed as `\n`, which makes extra newlines obvious as well.
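That debugging output can be sketched with nothing but the standard library; the env var name here is made up for this example:

```rust
use std::env;

// Print each string with every character escaped, so whitespace
// differences (especially trailing newlines) become visible.
fn escape(s: &str) -> String {
    // escape_default renders '\n' as the two characters '\' and 'n',
    // tabs as '\t', and so on; printable characters pass through.
    s.chars().map(|c| c.escape_default().to_string()).collect()
}

fn debug_compare(expected: &str, got: &str) {
    // Hypothetical env var name, gating the extra output.
    if env::var("PRETTY_DEBUG").is_err() {
        return;
    }
    println!("expected: {}", escape(expected));
    println!("     got: {}", escape(got));
}

fn main() {
    // With PRETTY_DEBUG set, the extra trailing newline in the second
    // argument shows up as a literal \n at the end of the line.
    debug_compare("SELECT 1\n", "SELECT 1\n\n");
}
```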
In Summary
Sometimes integration tests are better than unit tests. In this case, focusing on integration tests freed me from a testing morass I was stuck in, letting me focus on the formatting code. The fact that I can generate a huge number of integration tests gives me some confidence that this approach will work.
I definitely won’t get 100% test coverage (which is basically impossible given the recursive nature of the AST, where a `Node` can contain another `Node`). But I think I can use this approach to produce a decent first release. Once people start using it, I will quickly get bug reports with more test cases.
Next up
Here’s a list of what I want to cover in future posts.
- Diving into the Postgres grammar to understand the AST.
- The benefits of Rust pattern-matching for working with ASTs.
- How terrible my initial solution to generating SQL in the pretty printer is, and how I fixed it (once I actually fix it).
- Who knows what else?
[^1]: Try saying that three times fast.

[^2]: I’m referring to Rust modules here. Substitute package, library, etc. for your language of choice.

[^3]: The examples are from commit 71b6e24.

[^4]: These are the structs that I wrote about generating in Part 1.

[^5]: Having each function return a string directly was the wrong approach. In my new branch I’m having these functions return something else that is then formatted later. But more on that in a future blog post once I’ve solidified this new approach.

[^6]: You don’t need to read the entire struct definition. Just take note that this struct has a lot of fields and move on.

[^7]: In practice, these fields cannot really contain any node. For example, the `where_clause` field, which is a `Box<Node>`, is not going to contain an `InsertStmt` or `CreateDomainStmt` or an `IntoClause`. But this is what the underlying Postgres data structures use, and I have no easy way of knowing which subset of nodes a `Node`-typed field will contain. In some cases, I’ve actually figured this out by reading the Postgres parser code, and my generator code overrides the field definition to use a simpler type. I plan to write about how I do this in a future blog post.

[^8]: This happens every once in a while. It’s always very exciting.

[^9]: There are so many node types that I didn’t even try to list them all in the initial `format_node` implementation’s `match`. Instead, I had a default match arm (`_`) that just returned an error saying that formatting for the given node type wasn’t implemented.

[^10]: The documentation files all have an `sgml` extension, so I used an SGML parser. Just now I looked more closely, and I think the files are mostly XML, but without a DTD in most of them. Regardless, I ended up using an SGML parser in my Perl code.