Edit 2021-03-30: Jeremy Mikkola wrote about some closely related topics back in 2017.
Edit 2021-03-31: Chris Siebenmann wrote a response to this post that explains exactly how interface values that are nil are typed. It’s more complicated than I thought!
I’m not sure I have another Rust & Postgres blog post in me right now, so let’s learn something about Go instead.
Recently I decided I wanted to add a --unique flag to omegasort. Wait, what’s omegasort?
It’s a text file sorting tool that supports lots of different sorting methods. For example, in addition to a standard text sort, it can sort numbered lines, date-prefixed lines, paths (including Windows paths with and without drive letters), IP addresses, and IP networks. It also supports Unicode locales, reverse sorting, and locale-aware case-insensitive sorting.
I use it together with precious to sort things like .gitignore files, spellchecker allowlists, and things of that nature.
I realized that I really wanted a --unique flag for all of this. While I could just pipe its output to uniq on a *nix system, this doesn’t work so well on Windows. Plus with tools like precious it’s easier if I can use one binary for a given task. If I want to pipe things I have to put that in a shell script that precious calls.
But my rabbit hole experience didn’t happen with omegasort directly. Instead, it happened when I tried to add some integration tests.
While writing those integration tests, I was using github.com/houseabsolute/detest. This is a Golang package I created that offers a test assertion interface inspired by Test2-Suite in Perl.
For reference, here’s a Test2-Suite example:
|
|
I think this is pretty self-explanatory, except for T(), which means “true”.
And here’s something like that in Go with detest:
|
|
It’s not as nice as the Perl version because it gets quite verbose, but this was the closest I could come. Go’s type system, combined with a lack of syntactic flexibility, means a whole lot of func calls, braces, and parens.
Under the hood, this is implemented with a metric fork ton of runtime reflection using the stdlib’s reflect package. I don’t love this, but absent generics, there’s no other way to implement this sort of API except with code generation. And that codegen would have to be fed by a sort-of-Go language that was translated to real Go, which seems like a terrible idea.
Getting to the Darn Point
So while I was writing those omegasort integration tests using detest, I managed to find a whole lot of bugs in detest.
But the title says nil and I haven’t mentioned those yet.
So here’s a fun fact: Go has multiple “types” of nil. Specifically, there are both typed and untyped nil variables. This surprised me at first, but it makes sense when you think about it.
Let’s take this code[1]:
|
|
This prints out the following:
|
|
So a bare nil and a variable that has a type but no value are equal, but if you try to get a reflect.Value for nil, it’s not valid. If you try to call other methods like v.IsNil() or v.Type() on an invalid[2] reflect.Value, you will get a panic.
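The original listing isn't reproduced here, so what follows is only a minimal sketch of the behavior just described. The describe helper is a hypothetical name of mine, not part of detest or the stdlib. It checks IsValid() before calling Type(), since Type() panics on an invalid reflect.Value.

```go
package main

import (
	"fmt"
	"reflect"
)

// describe is a hypothetical helper: it reports what reflect sees for x.
// IsValid is checked first because Type() panics on an invalid Value.
func describe(x interface{}) string {
	v := reflect.ValueOf(x)
	if !v.IsValid() {
		return "invalid"
	}
	return "valid " + v.Type().String()
}

func main() {
	var s []string // declared but never assigned: a typed nil slice

	fmt.Println(s == nil)      // true: the typed nil compares equal to bare nil
	fmt.Println(describe(s))   // valid []string
	fmt.Println(describe(nil)) // invalid
}
```

The two values compare as equal with ==, yet only one of them carries a type that reflect can report.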
I encountered this when trying to test that an error returned by a func call was nil.
This led to a flurry of detest releases as I realized how many parts of the detest code this impacted. In most places where it uses reflect, I have to guard against a bare nil being passed in.
But wait, it gets even more confusing. Sometimes the Go compiler will turn an untyped nil into a typed nil. Here’s an example[3]:
|
|
And when we run it we get this:
|
|
So when I pass a bare nil to takesSlice, it gets typed as whatever type the function’s signature says it should be.
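As a rough sketch of that conversion (the body of takesSlice below is my own stand-in, since the original listing isn't shown here, and the []string parameter type is an assumption), a bare nil passed to a slice parameter arrives as a valid, typed, nil value:

```go
package main

import (
	"fmt"
	"reflect"
)

// takesSlice is a stand-in with an assumed []string parameter. It reports
// what reflect sees for the argument it received.
func takesSlice(s []string) string {
	v := reflect.ValueOf(s)
	return fmt.Sprintf("valid=%t type=%s nil=%t", v.IsValid(), v.Type(), v.IsNil())
}

func main() {
	// The untyped nil is converted to []string(nil) at the call site.
	fmt.Println(takesSlice(nil)) // valid=true type=[]string nil=true
}
```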
But wait, it gets even more confusing yet again! Sometimes the Go compiler won’t turn an untyped nil into a typed nil. Here’s an example[4]:
|
|
If the type of the argument in the function signature is any type of interface, including interface{}, then the underlying value is still untyped and not valid. This … sort of makes sense? I think the way this works is that anything typed as an interface also has a real underlying type. So an error can be an errors.errorString or an exec.ExitError or a mypackage.DogError. But if we pass a bare nil or an uninitialized interface variable, there’s no underlying type.
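Here is a small sketch of the interface case, using a hypothetical takesInterface function of my own rather than the original listing. It also includes a typed nil pointer for contrast, which does carry an underlying type once it is stored in an interface.

```go
package main

import (
	"fmt"
	"reflect"
)

// takesInterface is a hypothetical function with an interface{} parameter.
// It reports whether reflect considers the argument a valid value.
func takesInterface(x interface{}) bool {
	return reflect.ValueOf(x).IsValid()
}

func main() {
	var err error // nil interface value: no underlying type
	var p *int    // typed nil pointer: underlying type is *int

	fmt.Println(takesInterface(nil)) // false
	fmt.Println(takesInterface(err)) // false
	fmt.Println(takesInterface(p))   // true
}
```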
This came up with detest when I wanted to test that I didn’t get an error from a call.
|
|
Under the hood, the signature for d.Is() uses interface{} for the two arguments being compared. So bare nil as the second argument will never be valid. And the first argument might be valid or it might not be. If doThing()’s return type is just error and it returns a nil, then the value in err has no type.
All of this led to a fair bit more code in the detest guts to handle this. For example, just because two variables don’t have the same type doesn’t mean they’re not equal (from Go’s perspective). A bare nil and an uninitialized slice are equal when compared with ==, which is what d.Is() emulates using reflect.
So there are quite a few cases around one or both arguments being invalid that need handling. And there are MANY other methods with the same issues to consider, including things like d.Map() and d.Struct(), all of which should handle an invalid value properly.
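To give a feel for the kind of guarding involved, here is a minimal sketch. It is not detest's actual implementation. The names isNil and nilAwareEqual are hypothetical, and it leans on reflect.DeepEqual for the non-nil case, which is looser than == for some types.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNil reports whether v is invalid (bare nil) or a nil of a nillable kind.
// The Kind check matters because IsNil panics for kinds such as int.
func isNil(v reflect.Value) bool {
	if !v.IsValid() {
		return true
	}
	switch v.Kind() {
	case reflect.Chan, reflect.Func, reflect.Interface, reflect.Map,
		reflect.Ptr, reflect.Slice:
		return v.IsNil()
	}
	return false
}

// nilAwareEqual is a hypothetical comparison that tolerates invalid values.
func nilAwareEqual(a, b interface{}) bool {
	va, vb := reflect.ValueOf(a), reflect.ValueOf(b)
	if !va.IsValid() || !vb.IsValid() {
		// At least one side is a bare nil: equal only if both are nil-like.
		return isNil(va) && isNil(vb)
	}
	if va.Type() != vb.Type() {
		return false
	}
	return reflect.DeepEqual(a, b)
}

func main() {
	var s []string
	fmt.Println(nilAwareEqual(nil, nil)) // true
	fmt.Println(nilAwareEqual(s, nil))   // true
	fmt.Println(nilAwareEqual(nil, 0))   // false
	fmt.Println(nilAwareEqual("a", "a")) // true
}
```

The point is that the validity check has to come before any call to Type() or IsNil(), and every entry point that accepts interface{} needs the same treatment.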
What Does This Look Like in Other Languages?
Well, I don’t know that many other languages. In Perl this isn’t really a thing, because it has a pretty minimal type system. Perl’s undef can be coerced to lots of things, although under strict trying to use an undef in certain ways is an error, like writing this:
|
|
This will blow up with Can't use an undefined value as an ARRAY reference ... at line 2.
Rust (at least safe Rust[5]) doesn’t have any notion of nil or undefined values. Instead, you have the Option<T> type, which always has a type. For example[6]:
|
|
This just won’t compile. While both a and b are None, they’re not the same type of None so you can’t just compare them with ==. The compiler says:
|
|
By the way, aren’t these Rust compiler errors nice? The only other language I’ve seen with this kind of extremely detailed compiler error is Raku.
In Summary
It’s tempting to pick on Go and complain about it. I certainly do that a lot at work. But to be fair, this really isn’t an issue for most Go code. It’s only because I’m trying to do weird stuff with reflect that I’m learning about this internal weirdness. In day to day Go code, the compiler’s handling of various types of nil “just works” the way you’d expect it to. And being able to use a bare nil is quite handy.
But I still prefer how Rust does it, using a parameterized Option<T> type. That way I can easily check if something is None without any special cases. Everything is using the same type system, though that type system is much more complex than Go’s.
[2] Note that an “invalid” value in the context of reflect is not invalid in the context of a Go program. You can use an invalid value everywhere you can use the corresponding valid but uninitialized nil value.
[5] I know very little about unsafe Rust, which is why I’m hedging.
[6] https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=677599a2ff660f57b51a31219f428312