Edit 2021-03-30: Jeremy Mikkola wrote about some closely related topics back in 2017.
Edit 2021-03-31: Chris Siebenmann wrote a response to this post that explains exactly how interface values that are nil are typed. It’s more complicated than I thought!
I’m not sure I have another Rust & Postgres blog post in me right now, so let’s learn something about Go instead.
Recently I decided I wanted to add a --unique flag to omegasort. Wait, what’s omegasort?
It’s a text file sorting tool that supports lots of different sorting methods. For example, in addition to a standard text sort, it can sort numbered lines, date-prefixed lines, paths (including Windows paths with and without drive letters), IP addresses, and IP networks. It also supports Unicode locales, reverse sorting, and locale-aware case-insensitive sorting.
I use it together with precious to sort things like .gitignore files, spellchecker allowlists, and things of that nature.
I realized that I really wanted a --unique flag for all of this. While I could just pipe its output to uniq on a *nix system, this doesn’t work so well on Windows. Plus with tools like precious it’s easier if I can use one binary for a given task. If I want to pipe things I have to put that in a shell script that precious calls.
But my rabbit hole experience didn’t happen with omegasort directly. Instead, it happened when I tried to add some integration tests.
While writing those integration tests, I was using github.com/houseabsolute/detest. This is a Golang package I created that offers a test assertion interface inspired by Test2-Suite in Perl.
For reference, here’s a Test2-Suite example:
|
|
I think this is pretty self-explanatory, except for T(), which means “true”.
And here’s something like that in Go with detest:
|
|
It’s not as nice as the Perl version because it gets quite verbose, but this was the closest I could come. Go’s type system, combined with a lack of syntactic flexibility, means a whole lot of func calls, braces, and parens.
Under the hood, this is implemented with a metric fork ton of runtime reflection using the stdlib’s reflect package. I don’t love this, but absent generics, there’s no other way to implement this sort of API except with code generation. And that codegen would have to be fed by a sort-of-Go language that was translated to real Go, which seems like a terrible idea.
Getting to the Darn Point
So while I was writing those omegasort integration tests using detest, I managed to find a whole lot of bugs in detest.
But the title says nil and I haven’t mentioned those yet.
So here’s a fun fact: Go has multiple “types” of nil. Specifically, there are both typed and untyped nil variables. This surprised me at first, but it makes sense when you think about it.
Let’s take this code [1]:
|
|
This prints out the following:
|
|
So a bare nil and a variable that has a type but no value are equal, but if you try to get a reflect.Value for nil, it’s not valid. If you try to call other methods like v.IsNil() or v.Type() on an invalid [2] reflect.Value, you will get a panic.
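A minimal sketch of that behavior follows. The helper name describe and the variable s are hypothetical, chosen only for illustration; the point is that IsValid() has to be checked before Type() or IsNil() is called.

```go
package main

import (
	"fmt"
	"reflect"
)

// describe is a hypothetical helper. It reports whether reflect considers v
// valid and, if so, names its type. Calling Type() or IsNil() on an invalid
// reflect.Value panics, so the IsValid() check comes first.
func describe(v interface{}) string {
	rv := reflect.ValueOf(v)
	if !rv.IsValid() {
		return "invalid"
	}
	return "valid " + rv.Type().String()
}

func main() {
	var s []string // has a type but was never assigned a value

	fmt.Println(s == nil)      // true: the typed nil equals a bare nil
	fmt.Println(describe(nil)) // invalid: reflect has no type to work with
	fmt.Println(describe(s))   // valid []string
}
```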
I encountered this when trying to test that an error returned by a func call was nil.
This led to a flurry of detest releases as I realized how many parts of the detest code this impacted. In most places where it uses reflect, I have to guard against a bare nil being passed in.
But wait, it gets even more confusing. Sometimes the Go compiler will turn an untyped nil into a typed nil. Here’s an example [3]:
|
|
And when we run it we get this:
|
|
So when I pass a bare nil to takesSlice, it gets typed as whatever type the function’s signature says it should be.
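As a rough illustration of that conversion, here is a sketch in which the name takesSlice comes from the text but the body and the []string parameter type are assumptions. Because the parameter has a concrete type, the bare nil arrives as a nil []string, and reflect sees a valid value.

```go
package main

import (
	"fmt"
	"reflect"
)

// takesSlice has a parameter of the concrete type []string (an assumption
// for this sketch). A bare nil passed by the caller is converted to a nil
// []string before the function body runs.
func takesSlice(s []string) string {
	rv := reflect.ValueOf(s)
	if !rv.IsValid() {
		return "invalid"
	}
	return fmt.Sprintf("%s nil=%t", rv.Type().String(), rv.IsNil())
}

func main() {
	fmt.Println(takesSlice(nil)) // prints "[]string nil=true"
}
```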
But wait, it gets even more confusing yet again! Sometimes the Go compiler won’t turn an untyped nil into a typed nil. Here’s an example [4]:
|
|
If the type of the argument in the function signature is any type of interface, including interface{}, then the underlying value is still untyped and not valid. This … sort of makes sense? I think the way this works is that anything typed as an interface also has a real underlying type. So an error can be an errors.errorString or an exec.ExitError or a mypackage.DogError. But if we pass a bare nil or an uninitialized variable of an interface type, there’s no underlying type.
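To make the interface case concrete, here is a small sketch. The names dynamicType and takesError are hypothetical; the code only shows that an interface-typed parameter carries a concrete type when something was actually stored in it, and nothing at all otherwise.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// dynamicType is a hypothetical helper that names the concrete type stored
// in an interface value, or returns "<none>" when the interface is empty.
func dynamicType(v interface{}) string {
	rv := reflect.ValueOf(v)
	if !rv.IsValid() {
		return "<none>"
	}
	return rv.Type().String()
}

// takesError has an interface-typed parameter, so a bare nil passed to it
// stays a nil interface with no underlying type.
func takesError(err error) string {
	return dynamicType(err)
}

func main() {
	var err error  // uninitialized interface variable
	var s []string // uninitialized variable of a concrete type

	fmt.Println(takesError(nil))                // <none>
	fmt.Println(takesError(err))                // <none>
	fmt.Println(takesError(errors.New("boom"))) // *errors.errorString
	fmt.Println(dynamicType(s))                 // []string
}
```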
This came up with detest when I wanted to test that I didn’t get an error from a call.
|
|
Under the hood, the signature for d.Is() uses interface{} for the two arguments being compared. So bare nil as the second argument will never be valid. And the first argument might be valid or it might not be. If doThing()’s return type is just error and it returns a nil, then the value in err has no type.
All of this led to a fair bit more code in the detest guts to handle this. For example, just because two variables don’t have the same type doesn’t mean they’re not equal (from Go’s perspective). A bare nil and an uninitialized slice are equal when compared with ==, which is what d.Is() emulates using reflect.
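The kind of guard this requires might look like the sketch below. This is not detest’s actual implementation: isNil and nilAwareEqual are hypothetical names, and falling back to reflect.DeepEqual for non-nil values is a simplifying assumption made only for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNil reports whether v is a bare nil or a typed nil of a kind that can
// be nil. IsNil() panics on other kinds, so the kind is checked first.
func isNil(v interface{}) bool {
	rv := reflect.ValueOf(v)
	if !rv.IsValid() {
		return true // bare nil: there is no type information at all
	}
	switch rv.Kind() {
	case reflect.Chan, reflect.Func, reflect.Interface,
		reflect.Map, reflect.Ptr, reflect.Slice:
		return rv.IsNil()
	}
	return false
}

// nilAwareEqual treats any two nils as equal, whatever their types, and
// defers to reflect.DeepEqual for everything else (a simplification).
func nilAwareEqual(a, b interface{}) bool {
	if isNil(a) || isNil(b) {
		return isNil(a) && isNil(b)
	}
	return reflect.DeepEqual(a, b)
}

func main() {
	var s []string

	fmt.Println(nilAwareEqual(s, nil))          // true
	fmt.Println(nilAwareEqual([]string{}, nil)) // false: empty but not nil
	fmt.Println(nilAwareEqual(0, nil))          // false: an int is never nil
}
```

Note that this sketch is looser than ==, since it also treats a nil slice and a nil map as equal to each other.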
So there are quite a few cases around one or both arguments being invalid that need handling. And there are MANY other methods with the same issues to consider, including things like d.Map() and d.Struct(), all of which should handle an invalid value properly.
What Does This Look Like in Other Languages?
Well, I don’t know that many other languages. In Perl this isn’t really a thing, because it has a pretty minimal type system. Perl’s undef can be coerced to lots of things, although under strict, trying to use an undef in certain ways is an error, like writing this:
|
|
This will blow up with Can't use an undefined value as an ARRAY reference ... at line 2.
Rust (at least safe Rust [5]) doesn’t have any notion of nil or undefined values. Instead, you have the Option<T> type, which always has a type. For example [6]:
|
|
This just won’t compile. While both a and b are None, they’re not the same type of None so you can’t just compare them with ==. The compiler says:
|
|
By the way, aren’t these Rust compiler errors nice? The only other language I’ve seen with this kind of extremely detailed compiler error is Raku.
In Summary
It’s tempting to pick on Go and complain about it. I certainly do that a lot at work. But to be fair, this really isn’t an issue for most Go code. It’s only because I’m trying to do weird stuff with reflect that I’m learning about this internal weirdness. In day-to-day Go code, the compiler’s handling of various types of nil “just works” the way you’d expect it to. And being able to use a bare nil is quite handy.
But I still prefer how Rust does it, using a parameterized Option<T> type. That way I can easily check if something is None without any special cases. Everything is using the same type system, though that type system is much more complex than Go’s.
[2] Note that an “invalid” value in the context of reflect is not invalid in the context of a Go program. You can use an invalid value everywhere you can use the corresponding valid but uninitialized nil value.
[5] I know very little about unsafe Rust, which is why I’m hedging.
[6] https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=677599a2ff660f57b51a31219f428312