Friday, September 07, 2012

A Go Gotcha that Got Me

Here's a little Go program that has a surprising output:
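package main

import (
	"crypto/tls"
	"fmt"
	"net"
)

func main() {
	// c is an interface variable.
	var c net.Conn

	// c is reused here (only err is new), so the *tls.Conn result
	// is stored in the net.Conn interface.
	c, err := tls.Dial("tcp", "", nil)

	fmt.Printf("%v %v\n", c, err)

	if c == nil {
		fmt.Printf("Nil\n")
	} else {
		fmt.Printf("Not nil\n")
	}
}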
This program tries to connect to my web site using TLS on the non-TLS port 80, which forces a TLS error. The output is a little surprising:
  <nil> local error: record overflow
  Not nil
The Printf shows the value of c as <nil>, but the test c == nil evaluates to false, so the program prints "Not nil".

So, what's going on? 

The answer is in the Go FAQ: Why is my nil error value not equal to nil?

In short, c is an interface (a net.Conn). An interface value is stored as a type and a value. The type is the actual type that implements the interface (in the case above it's a *tls.Conn) and the value is a pointer to the concrete instance of that type.
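
The type-and-value pair can be seen directly with a small sketch. The Conn interface and fakeConn type below are hypothetical stand-ins for net.Conn and tls.Conn, used so the example runs without a network:

```go
package main

import "fmt"

// Conn is a hypothetical stand-in for net.Conn.
type Conn interface {
	Close() error
}

// fakeConn is a hypothetical stand-in for tls.Conn.
type fakeConn struct{}

func (f *fakeConn) Close() error { return nil }

// describe reports the dynamic type held by the interface and
// whether the interface itself compares equal to nil.
func describe(c Conn) string {
	return fmt.Sprintf("type=%T isNil=%t", c, c == nil)
}

func main() {
	var empty Conn           // no type, no value
	var p *fakeConn          // nil pointer of a concrete type
	var holdsNilPtr Conn = p // type *fakeConn, value nil

	fmt.Println(describe(empty))       // type=<nil> isNil=true
	fmt.Println(describe(holdsNilPtr)) // type is set, so isNil=false
}
```

Only the first interface value has neither a type nor a value, so only it compares equal to nil.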

When the error occurs in the code above, tls.Dial returns nil, err. That nil is a nil *tls.Conn. When it is assigned to c, c becomes non-nil (its type is now *tls.Conn) even though its value is nil (since tls.Dial returned a nil pointer). Thus the test c == nil is false.
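
The same behaviour can be reproduced without a network connection. In this sketch, dial, Conn and secureConn are hypothetical stand-ins for tls.Dial, net.Conn and tls.Conn; the only property that matters is that dial returns a concrete pointer type:

```go
package main

import (
	"errors"
	"fmt"
)

// Conn is a hypothetical stand-in for net.Conn.
type Conn interface {
	Close() error
}

// secureConn is a hypothetical stand-in for tls.Conn.
type secureConn struct{}

func (s *secureConn) Close() error { return nil }

// dial is a hypothetical stand-in for tls.Dial: it returns a concrete
// pointer type, and a nil pointer plus an error on failure.
func dial() (*secureConn, error) {
	return nil, errors.New("record overflow")
}

// viaInterface stores dial's result straight into an interface
// variable and reports whether c == nil.
func viaInterface() bool {
	var c Conn
	c, err := dial() // c is reused; only err is newly declared
	_ = err
	return c == nil
}

// viaPointer keeps the concrete pointer type, so the nil test
// compares the pointer itself.
func viaPointer() bool {
	c, err := dial()
	_ = err
	return c == nil
}

func main() {
	fmt.Println(viaInterface()) // false
	fmt.Println(viaPointer())   // true
}
```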

The bottom line is that my code above is the wrong thing to do. Don't do nil tests on interface variables.

PS A lot of people have been asking me why I wasn't checking the value of err. In the real code I was, but a defer statement was acting on c and in it I had the code

if c != nil {

In figuring out why that failed sometimes I discovered the gotcha.
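
One possible way to keep a deferred nil test reliable, sketched here with hypothetical names (dial, run, Conn, secureConn) rather than my real code, is to hold the result in its concrete type and assign it to the interface variable only after the error check:

```go
package main

import (
	"errors"
	"fmt"
)

// Conn is a hypothetical stand-in for net.Conn.
type Conn interface {
	Close() error
}

// secureConn is a hypothetical stand-in for tls.Conn.
type secureConn struct{ closed bool }

// Close dereferences s, so calling it on a nil *secureConn would panic.
func (s *secureConn) Close() error {
	s.closed = true
	return nil
}

// dial is a hypothetical stand-in for tls.Dial.
func dial(fail bool) (*secureConn, error) {
	if fail {
		return nil, errors.New("dial failed")
	}
	return &secureConn{}, nil
}

// run assigns to the interface variable only on success, so the
// deferred c != nil test never sees an interface wrapping a nil pointer.
func run(fail bool) error {
	var c Conn
	defer func() {
		if c != nil {
			c.Close()
		}
	}()

	sc, err := dial(fail)
	if err != nil {
		return err
	}
	c = sc
	return nil
}

func main() {
	fmt.Println(run(true))
	fmt.Println(run(false))
}
```

If dial's result were assigned to c directly, the deferred function would call Close on a nil pointer in the failure case.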


Yves Junqueira said...

Most importantly, check the error result first before doing anything with the value.

rog peppe said...

"Don't do nil tests on interface variables."

I'm not sure that's the right message to take home from this experience. Doing nil tests on interface variables is just fine. The thing to be aware of is that when you assign a non-interface value to an interface value, the interface value will never be nil.

This issue is why functions always return an error value of type "error" even when the type is known. The particular case you encountered is particularly easy to trip over because the signature of tls.Dial is almost exactly the same as that of net.Dial.
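
A small sketch of that convention, along the lines of the example in the Go FAQ. MyError, concrete and plain are hypothetical names chosen for illustration:

```go
package main

import "fmt"

// MyError is a hypothetical concrete error type.
type MyError struct{}

func (e *MyError) Error() string { return "my error" }

// concrete returns the concrete pointer type. A caller that stores the
// result in an error variable gets a non-nil interface even on success.
func concrete() *MyError { return nil }

// plain declares its result as error and returns a literal nil, so the
// caller's err == nil test behaves as expected.
func plain() error { return nil }

func main() {
	var err error = concrete()
	fmt.Println(err == nil) // false

	err = plain()
	fmt.Println(err == nil) // true
}
```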

The particular pattern to watch for is:

x, err = something()

Note the "=" rather than ":=".

It might be nice if there was a go vet check for this pattern actually. When go/types is ready, there may well be.